What are "seed types" and how do we use them?
A seed's type determines which of a standard set of configurations defines its crawling scope. Use the following seed types to tell the crawler how much of a seed site to archive.
- Standard - Seed will be crawled and archived according to our default crawl scope. We recommend using this seed type for most websites.
- Standard Plus External Links (Standard+) - Seed will be archived according to our default crawl scope, but will also include content otherwise deemed "out of scope" if it is accessible by a direct link on an "in scope" page. Note: In previous web archiving programs and services, this configuration was also known as "One Hop Off."
- One Page - Only the first page of your seed will be archived. Links to other pages will not be crawled. We recommend using this seed type for things like newspaper articles, blog posts, Wikipedia entries, and any other pages that you wish to archive without archiving their entire contexts.
- One Page Plus External Links (One Page+) - The first page of your seed, as well as the first page of any URLs directly linked from your seed, will be archived. We recommend using this seed type for things like news feeds or other pages that contain a list of links to pages on other domains that you would nevertheless like to capture. Note: In previous Archive-It releases, this configuration was also known as "News/RSS Feed."
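The differences between the four seed types can be sketched as a single scope check. This is only an illustrative model, not the crawler's actual implementation: the names `SeedType`, `same_host`, and `in_scope` are hypothetical, and the "default crawl scope" is approximated here as "same host as the seed," which is an assumption made for the sake of the example.

```python
from enum import Enum
from urllib.parse import urlsplit


class SeedType(Enum):
    STANDARD = "Standard"
    STANDARD_PLUS = "Standard+"
    ONE_PAGE = "One Page"
    ONE_PAGE_PLUS = "One Page+"


def same_host(url_a, url_b):
    """Assumed stand-in for the default crawl scope: both URLs share a host."""
    return urlsplit(url_a).hostname == urlsplit(url_b).hostname


def in_scope(seed_type, seed_url, url, found_on):
    """Decide whether `url` would be archived for a seed of the given type.

    `found_on` is the page that linked to `url` (None for the seed itself).
    """
    if url == seed_url:
        return True  # the seed's own first page is always archived
    if seed_type is SeedType.STANDARD:
        return same_host(url, seed_url)
    if seed_type is SeedType.STANDARD_PLUS:
        # In-scope pages, plus anything linked directly from an in-scope page.
        linked_from_in_scope = found_on is not None and same_host(found_on, seed_url)
        return same_host(url, seed_url) or linked_from_in_scope
    if seed_type is SeedType.ONE_PAGE:
        return False  # links from the seed page are not followed
    if seed_type is SeedType.ONE_PAGE_PLUS:
        return found_on == seed_url  # only pages linked directly from the seed
    return False
```

Under these assumptions, a link from `https://example.org/about` to `https://other.net/x` is archived for a Standard+ seed at `https://example.org/`, but not for a Standard or One Page seed.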
How to assign seed types
You can assign each new seed a type as you add it to a collection. If you do not choose any specific seed type, the Standard setting will be applied by default.
- To change the seed type of an individual seed, click on the seed itself to open its management interface. To change the seed type of a group of seeds, check the box next to each seed that you would like to change and click the "Edit Settings" button.
- In the "Edit Seed Settings" dialog box, choose the seed type from the drop-down menu and click the "Save" button to apply your change.