Archive-It can crawl, archive, and replay some password-protected websites or pages. This feature works with traditional username/password authentication systems. It does not work with sites that require two-step authentication, split the username and password fields across separate webpages, ask for a password only (without a username), require a CAPTCHA, or use specialty certificates.
Seeds for password-protected sites require login credentials so that the crawler can access the protected content. To add login credentials, navigate to the "Seeds" tab, check the box next to each relevant seed to select it, and click the "Edit Settings" button.
In the "Edit Seed Settings" dialog box, add your login name and password, and then click "Apply" to save your changes.
When your seed is next crawled, the crawler will use your login name and password to access the site and archive the password-protected content. As with all of your other archives, you can restrict access to archived password-protected content.

