Archive-It Help Center

Scoping & Running Crawls

Scoping Your Crawls

  • How Archive-It crawlers determine scope
  • Scope Rules and how to use them
  • Modify your collection or seed scope
  • How to add Seed level scope rules to multiple seeds at once
  • How to limit your crawl
  • Expand the scope of your crawl

Running Crawls

  • How to run, monitor, and save a test crawl
  • How to manually start test and one-time crawls
  • How to crawl new seeds immediately with InstaCrawl
  • How to schedule crawls
  • How to add and use the Archive This! bookmarklet
  • Crawling with a Custom User Agent

Crawling Technology

  • Archive-It Crawling Technology
  • What is Brozzler?
  • How and when to use Brozzler
  • About yt-dlp (youtube-dl)

Managing Crawls

  • How to select a time limit for your crawl
  • How to monitor currently running crawls
  • How to resume a finished crawl
  • How to find your crawl ID number
  • About data de-duplication

Scoping Recommendations for Specific Sites

  • Scoping guidance for specific types of sites
  • Archiving sites protected by Cloudflare
  • Archiving ArcGIS
  • Archiving Blogspot sites
  • Archiving Facebook
  • Archiving Flickr streams

FAQ: Crawling

  • What does HTTP error 61 mean and what can I do about it?
  • What are regular expressions and when should I use them?
  • How do I stop a seed from being crawled?
  • What is data-deduplication and how does it work?
  • How do I know how big a crawl will be?