Scoping & Running Crawls
Scoping Your Crawls
- How Archive-It crawlers determine scope
- Modify your collection or seed scope
- How to add seed-level scoping rules to multiple seeds at once
- Limit your crawl
- Expand the scope of your crawl
- Robots.txt exclusions and how they can impact your web archives
Running Crawls
- How to run, monitor, and save a test crawl
- How to manually start test and one-time crawls
- How to crawl new seeds immediately with InstaCrawl
- How to schedule crawls
- How to add and use the Archive This! bookmarklet
- Crawling with a custom user agent
Managing Crawls
Scoping Recommendations for Specific Sites
- Scoping guidance for specific types of sites
- Archiving sites protected by Cloudflare
- Archiving ArcGIS
- Archiving Blogspot sites
- Archiving Facebook
- Archiving Flickr streams