[colug-432] Data mining
Steve VanSlyck
s.vanslyck at spamcop.net
Fri Dec 17 14:25:11 EST 2010
If you go to
http://franklincountyoh.metacama.com/do/searchByParcelId
then enter, for example, parcel no. 010-008436
you are taken to
http://franklincountyoh.metacama.com/do/searchByParcelId;jsessionid=2C5A034D6CE2B1FF816F095759EF5347
(which contains what I assume to be a session ID).
If you then click on the link labeled "Tax/Payment Info" you are taken to
http://franklincountyoh.metacama.com/do/selectDisplay?select=TAXINFO&curpid=01000843600
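To make the URL pattern concrete, here is a minimal Python sketch that
builds the Tax/Payment Info address from a parcel number. The function name
taxinfo_url is made up for illustration. The rule it applies (drop the
hyphen and append "00") is only an assumption inferred from the single
example above, and may not hold for every parcel.

```python
from urllib.parse import urlencode

# Address taken from the example above.
BASE = "http://franklincountyoh.metacama.com/do/selectDisplay"


def taxinfo_url(parcel_no: str) -> str:
    """Build the Tax/Payment Info URL for a parcel number like '010-008436'."""
    # ASSUMPTION: curpid is the parcel number without its hyphen, followed
    # by "00". This is guessed from one example and is not confirmed.
    curpid = parcel_no.replace("-", "") + "00"
    return BASE + "?" + urlencode({"select": "TAXINFO", "curpid": curpid})


if __name__ == "__main__":
    print(taxinfo_url("010-008436"))
```

Run as a script, this prints the same selectDisplay address shown above for
parcel 010-008436.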
You can easily print that page to a PDF file if you like.
So all that works fine.
Yesterday, I was able to grab a large number of TAXINFO pages by
(a) creating an HTML page on my desktop with a link to each of the parcels
I was interested in, and
(b) creating a PDF file using Acrobat (set to grab the page and one level
down).
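Step (a) can also be scripted rather than typed by hand. The sketch below
turns a list of addresses into a bare-bones HTML page of links. The function
name link_page and the page title are invented for illustration, and the
address in the usage example is a placeholder, not a real parcel page.

```python
from html import escape


def link_page(urls, title="Parcel links"):
    """Return a minimal HTML page containing one link per URL."""
    # escape() turns '&' into '&amp;' so the href attributes stay valid HTML.
    items = "\n".join(
        '<li><a href="{0}">{0}</a></li>'.format(escape(u, quote=True))
        for u in urls
    )
    return (
        "<html><head><title>{}</title></head>\n"
        "<body><ul>\n{}\n</ul></body></html>\n"
    ).format(escape(title), items)


if __name__ == "__main__":
    # Placeholder address; substitute the real list of parcel pages.
    print(link_page(["http://example.com/page?select=TAXINFO&curpid=123"]))
```

Saving the returned text to a local .html file gives a page that can be
opened in a browser or handed to Acrobat as before.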
It didn't work at first, so I then
(a) opened the page, grabbed the session ID link,
(b) left the page open,
(c) pasted the link into my local HTML page,
(d) and then created the PDF file from Acrobat as before.
It worked like a dream. Well, last night anyway.
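One way to imitate "leaving the page open" without a browser is to fetch
every page through a single client that remembers its cookies. What follows
is only a sketch under stated assumptions: it is not known whether the site
tracks the session with a cookie, with the ;jsessionid= part of the URL, or
both, nor whether simply visiting the search page is enough to start a
session. The helper names are hypothetical.

```python
import re
import time
import urllib.request
from http.cookiejar import CookieJar

SEARCH_URL = "http://franklincountyoh.metacama.com/do/searchByParcelId"


def extract_session_id(url):
    """Return the jsessionid embedded in a URL, or None if there is none."""
    match = re.search(r";jsessionid=([0-9A-Za-z]+)", url)
    return match.group(1) if match else None


def make_session_opener():
    """Build an opener that stores cookies and resends them on each request."""
    jar = CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    return opener, jar


def fetch_pages(urls, delay=2.0):
    """Fetch each URL through one shared session; returns {url: page bytes}."""
    opener, _jar = make_session_opener()
    # ASSUMPTION: requesting the search page first starts a usable session.
    opener.open(SEARCH_URL, timeout=30).read()
    pages = {}
    for url in urls:
        with opener.open(url, timeout=30) as resp:
            pages[url] = resp.read()
        time.sleep(delay)  # pause between requests to go easy on the server
    return pages
```

Calling fetch_pages with the list of TAXINFO addresses would return the raw
HTML of each page, which could then be saved or printed. If the server only
accepts the session ID in the URL, the cookie handling here will not help,
and extract_session_id could be used to pull the ID out of the address the
search page redirects to.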
Today, however, it will not work at all. I've tried using the same session
ID, using a fresh session ID, opening Acrobat first, opening Firefox (or
IE) first, and so on.
I don't know enough about what's going on under the hood, however, to
recreate my earlier success. I don't even know whether the lack of a session
ID was the original problem, or what I did (whatever it was) that made it
work.
Are there any Internet experts here who can help? I really don't want to
download 200 pages manually.