[colug-432] Data mining

Steve VanSlyck s.vanslyck at spamcop.net
Fri Dec 17 14:25:11 EST 2010


If you go to

http://franklincountyoh.metacama.com/do/searchByParcelId

then enter, for example, parcel no. 010-008436

you are taken to

http://franklincountyoh.metacama.com/do/searchByParcelId;jsessionid=2C5A034D6CE2B1FF816F095759EF5347

(which contains what I assume to be a session ID).

If you then click on the link labeled "Tax/Payment Info" you are taken to

http://franklincountyoh.metacama.com/do/selectDisplay?select=TAXINFO&curpid=01000843600

You can easily print that page to a PDF file if you like.
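
For illustration, here is a minimal Python sketch of how such a TAXINFO
link could be built from a parcel number. It rests on an assumption drawn
from the single example above: that curpid is the parcel number with the
hyphen removed and "00" appended. The function name taxinfo_url is made up
for this sketch, and the rule may not hold for every parcel.

```python
from urllib.parse import urlencode

# Base address taken from the TAXINFO link quoted above.
BASE = "http://franklincountyoh.metacama.com/do/selectDisplay"


def taxinfo_url(parcel_no: str) -> str:
    """Build a TAXINFO link for a parcel number such as '010-008436'.

    Assumption (inferred from one example only): curpid is the parcel
    number without its hyphen, followed by '00'.
    """
    digits = parcel_no.replace("-", "")
    if not digits.isdigit():
        raise ValueError("unexpected parcel number: %r" % parcel_no)
    curpid = digits + "00"
    return BASE + "?" + urlencode({"select": "TAXINFO", "curpid": curpid})


if __name__ == "__main__":
    print(taxinfo_url("010-008436"))
```

For the example parcel this reproduces the link shown above.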

So all that works fine.

Yesterday, I was able to grab a large number of TAXINFO pages by
(a) creating an HTML page on my desktop with a link to each of the parcels 
I was interested in, and
(b) creating a PDF file with Acrobat, telling it to grab that page and one 
level down.
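
Step (a) can be automated. The sketch below is only one way to do it: it
takes a list of links and returns a bare-bones HTML page with one link per
line. The function name build_link_page and the page layout are
illustrative choices, not anything the county site or Acrobat requires.

```python
from html import escape


def build_link_page(urls):
    """Return a simple HTML page containing one link per URL."""
    items = "\n".join(
        # escape() turns '&' into '&amp;' so the href stays valid HTML.
        '<li><a href="{0}">{0}</a></li>'.format(escape(u, quote=True))
        for u in urls
    )
    return "<html><body>\n<ul>\n" + items + "\n</ul>\n</body></html>\n"


if __name__ == "__main__":
    links = [
        "http://franklincountyoh.metacama.com/do/selectDisplay"
        "?select=TAXINFO&curpid=01000843600",
    ]
    # Print the page; redirect the output to a file to save it.
    print(build_link_page(links))
```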

It didn't work at first, so I then

(a) opened the page in my browser and copied the link containing the 
session ID,
(b) left that page open,
(c) pasted the link into my local HTML page, and
(d) created the PDF file from Acrobat as before.
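
The steps above amount to establishing a session before requesting the
TAXINFO pages. A rough Python equivalent, using only the standard library,
is sketched below. It assumes the site will accept a session cookie in
place of the ;jsessionid=... part of the URL (Java servlet containers
commonly fall back to putting the ID in the URL when cookies are not yet
confirmed), and that visiting the search page is enough to start a
session. Neither assumption is verified here, and the site may also
require the parcel search itself to be submitted first. The function names
are made up for this sketch.

```python
import urllib.request
from http.cookiejar import CookieJar

# Search page quoted earlier; assumed to be where a session starts.
SEARCH_URL = "http://franklincountyoh.metacama.com/do/searchByParcelId"


def make_session_opener():
    """Return an opener that stores and resends cookies, like a browser."""
    jar = CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    return opener


def fetch_pages(opener, urls, search_url=SEARCH_URL, timeout=30):
    """Visit the search page once, then fetch each URL in the same session.

    Returns a dict mapping each URL to the raw bytes of its response.
    """
    # First request: gives the server a chance to issue a session cookie.
    with opener.open(search_url, timeout=timeout) as resp:
        resp.read()

    pages = {}
    for url in urls:
        with opener.open(url, timeout=timeout) as resp:
            pages[url] = resp.read()
    return pages
```

It would be called as fetch_pages(make_session_opener(), list_of_links).
Because the same opener is reused for every request, any cookie set by the
first response is sent along with the later ones.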

It worked like a dream. Well, last night anyway.

Today, however, it will not work at all. I've tried reusing the same session 
ID, using a fresh session ID, opening Acrobat first, opening Firefox (or 
IE) first, and so on.

I don't know enough about what's going on under the hood, however, to 
recreate my earlier success. I don't even know whether the need for a 
session ID was the problem, or what I did (whatever it was) that made it go.

Are there any Internet experts here who can help? I really don't want to 
download 200 pages manually.



