Crawler refuses to collect?
I'm working on a project to collect local county codes, and a couple of websites are confounding my attempts to do so.
For example, I'm trying to crawl https://codelibrary.amlegal.com/codes/annearundel/latest/. I have tried crawling this page with every seed type and with both Standard and Brozzler. The crawl report shows data and pages collected, but when I try to access the archived pages via Wayback, every page defaults to https://codelibrary.amlegal.com/codes/annearundel/latest/overview.
Any thoughts on what I'm doing wrong? Or is the site just not friendly to Archive-It's crawling methods?
-
Right?!
But the links in the left menu just redirect to the /latest page. The same happens when I try to reduce the seed to https://codelibrary.amlegal.com/codes/annearundel: the crawl either redirects to /latest/ or I get a 404 error.