Archive-It can crawl, archive, and replay many password-protected websites or pages. This feature is compatible with traditional username/password authentication systems. It is incompatible with log-in processes that require two-step authentication, split username and password fields across webpages, or require a CAPTCHA.
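Before adding credentials, it can help to check whether a site's login page appears to match the compatible pattern described above. The following is a minimal sketch only, not part of Archive-It: the class `LoginFormScanner`, the function `looks_like_single_page_login`, and the "captcha" substring check are all hypothetical heuristics introduced for illustration. It assumes you already have the login page's HTML as a string, and it cannot detect two-step authentication, which happens after the form is submitted.

```python
from html.parser import HTMLParser


class LoginFormScanner(HTMLParser):
    """Hypothetical helper: records input types per <form> and CAPTCHA hints."""

    def __init__(self):
        super().__init__()
        self.forms = []            # one counter dict per <form> seen
        self._current = None       # the form currently being parsed, if any
        self.captcha_hint = False  # True if any attribute mentions "captcha"

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Assumed heuristic: CAPTCHA widgets usually mention "captcha"
        # somewhere in a class, id, or src attribute.
        if any("captcha" in (value or "").lower() for value in attrs.values()):
            self.captcha_hint = True
        if tag == "form":
            self._current = {"text": 0, "password": 0}
            self.forms.append(self._current)
        elif tag == "input" and self._current is not None:
            kind = (attrs.get("type") or "text").lower()
            if kind in ("text", "email"):
                self._current["text"] += 1
            elif kind == "password":
                self._current["password"] += 1

    def handle_endtag(self, tag):
        if tag == "form":
            self._current = None


def looks_like_single_page_login(html):
    """Return True if one form holds both a username and a password field
    and no CAPTCHA hint was found anywhere on the page."""
    scanner = LoginFormScanner()
    scanner.feed(html)
    if scanner.captcha_hint:
        return False
    return any(f["text"] >= 1 and f["password"] >= 1 for f in scanner.forms)
```

A result of `False` suggests the page may split its fields across pages or use a CAPTCHA. A result of `True` is only a hint, since the sketch inspects static HTML and will miss forms that are built by JavaScript.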
Seeds that point to password-protected content require login credentials so that the crawler can access it. To add credentials through our web application, navigate to the "Seeds" tab of your collection's management area, check the box next to each relevant seed to select it, and click the "Edit Settings" button.
In the "Edit Seed Settings" dialog box that opens, enter your login name and password, then click "Apply" to save these changes.
The next time that seed is crawled, the crawler will use these credentials to access the site and archive the password-protected content. As with all of your other archives, you can restrict access to archived password-protected content.