Single Sign On Blocking Crawling / ArchiveIt
Hi all,
Our institution previously crawled and preserved an institution webpage dedicated to the schedule of classes. Our Archives used ArchiveIt to capture these pages as part of end-of-year processing. A department within our institution requested that the schedule of classes be put behind Single Sign On (SSO), which now prevents ArchiveIt from crawling and capturing the pages (it only preserves the SSO login page). We've tried entering a staff username and password in the settings for this seed, but that does not get past the SSO.
We've reviewed this help page already: https://support.archive-it.org/hc/en-us/articles/10564321001748-Troubleshooting-password-protected-sites#:~:text=Related%20content-,About,or%20sites%20with%20speciality%20certificates.
...but we were wondering if anyone has had any progress on this topic since its publication?
Official comment
Hi Alicia,
Troubleshooting the crawling of websites that are password-protected can be tricky! While we strive to keep the advice in the Help Center more generalized, each password-protected site has its own unique challenges.
For this reason, and for security reasons, if the general advice in our Help Center hasn't worked, it's best practice to submit a support ticket with the details of your organization's particular case, including crawl report links and Wayback URLs.