What exactly is a seed?

A seed is an item with a unique numerical identifier in the Archive-It backend. Some information about a seed does not change, like the date it was added or updated and its crawl history. Seeds also have data you can edit, like seed-level metadata, notes, and even the seed URL.

Seeds can perform two important tasks in Archive-It. They tell the crawler where to go on the live web, and provide it with information on what to collect. They also point users to the archived version of their URL in Wayback. You can use seeds for one or both of these tasks. A seed does not need to appear in a public collection page for archived pages collected from it to be accessible. Similarly, you don't need to include a seed in a crawl for it to point to a page in Wayback if the URL is already archived.

How to format your seed URLs

Seed URLs can point the crawler to...

an entire website: http://www.whitehouse.gov/
a specific directory of a website: http://www.whitehouse.gov/issues/foreign-policy/
a specific document or file: http://www.whitehouse.gov/sites/default/files/rss_viewer/national_security_strategy.pdf

Generally, a URL copied from a browser's address bar will have correct formatting. There are, however, important principles to remember before adding these URLs as seeds:

Do you need a / (slash) at the end of the URL? Archive-It's crawling technologies (Standard and Brozzler) handle the / at the end of URLs differently. Refer to the default scope article to determine whether a / is necessary for your use case.
Does the URL redirect to something else in your browser? Generally, we recommend using only as much of the URL as you need to end up on your target website. For example, if the site http://myexamplewebsite.com automatically redirects to http://myexamplewebsite.com/home, you should use http://myexamplewebsite.com as your seed URL.
Does your URL have a # (hashtag)? Anything that comes after a # (hashtag) in a seed URL is ignored by crawlers, which could significantly change the scope of your crawl.
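
The point about # (hashtag) can be checked before you add a seed. The following is a minimal Python sketch, not an Archive-It tool: it uses the standard library's urllib.parse.urldefrag to show what is left of a URL once everything after the # is dropped. The helper name strip_fragment and the example URL are hypothetical, and the sketch only illustrates the principle, not how the crawlers are implemented.

```python
from urllib.parse import urldefrag


def strip_fragment(seed_url: str) -> tuple[str, bool]:
    """Return the URL without its # fragment, and whether a fragment was present."""
    url, fragment = urldefrag(seed_url)
    return url, bool(fragment)


# Hypothetical seed URL that contains a # fragment
url, had_fragment = strip_fragment("http://myexamplewebsite.com/app#/news/2024")
print(url)           # http://myexamplewebsite.com/app
print(had_fragment)  # True
```

If the second value is True, compare the shortened URL with the page you intended to capture, since the part after the # will not guide the crawl.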

How to add new seeds to a collection

To add one or more new seeds to your collection, navigate to that collection's Seeds tab and click Add Seeds.

In the dialog box, enter your seed URL(s), one seed per line. You can add up to 1,000 lines. Then select your preferred visibility, crawl schedule, and seed type.

To save, click Add Seeds. The new seed(s) will be listed in the Seeds tab of your collection.
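
If you are assembling a long list outside the web application, a short script can tidy it before you paste it into the dialog box. The following Python sketch is only an illustration under stated assumptions: the function name prepare_seed_lines is hypothetical, the 1,000-line limit is the one given above, and removing blank lines and duplicates is a convenience added here, not a documented Archive-It requirement.

```python
MAX_LINES = 1000  # limit stated in this article for the Add Seeds dialog box


def prepare_seed_lines(raw_text: str) -> list[str]:
    """Return non-empty, de-duplicated seed URLs in their original order."""
    seen = set()
    seeds = []
    for line in raw_text.splitlines():
        url = line.strip()
        # Skip blank lines and URLs already collected (assumed convenience)
        if url and url not in seen:
            seen.add(url)
            seeds.append(url)
    if len(seeds) > MAX_LINES:
        raise ValueError(f"{len(seeds)} seeds exceeds the {MAX_LINES}-line limit")
    return seeds


raw = """
http://www.whitehouse.gov/
http://www.whitehouse.gov/issues/foreign-policy/

http://www.whitehouse.gov/
"""
# Print one seed per line, ready to paste into the dialog box
print("\n".join(prepare_seed_lines(raw)))
```

A list longer than the limit raises an error, so you can split it into several batches and add each batch separately.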

Deleting seeds

You can delete seeds from a collection by selecting them from the Seeds list and clicking the Delete button.

Deleted seeds are removed from your collection's Seeds tab immediately. They are removed as access points from your Archive-It.org collection page within 24 hours.

Deleting a seed does not delete any Wayback captures.
