Once you have added seeds to your collection, you may manage their settings, metadata, scope, and more, either individually or in bulk, from the collection's "Seeds" tab.
On this page:
- How to manage an individual seed
- How to manage seeds in bulk
- How to activate and deactivate seeds
- Editing existing seed URLs
How to manage an individual seed
Click the hyperlinked URL for any individual seed in the Seed URL column to open its management interface.
You can edit your seed URL by clicking on the pencil icon to the right of the seed URL, making any changes, and then clicking the check mark that appears below the URL.
Navigate among the tabs across the top of this space to manage the following features:
- Settings: Activate or deactivate the seed for regular crawling, or change its public access options, crawl frequency, seed type, group, and/or login username/password information.
- Metadata: Add metadata for this specific seed using the Dublin Core element set and/or custom fields.
- Crawling History: View a list of all crawls that you have run that include this seed.
- Notes: Record internal notes pertaining to this seed. (Use the search box at the top of the Seeds tab to filter seeds by these notes as well as by other settings.)
- Seed Scope: Refine the scope of your crawls by telling our crawler what to archive (and what not to archive) by setting custom rules at the seed level.
How to manage seeds in bulk
You may manage the settings and functions of several seeds at once by selecting the check box next to each one.
Once you have selected your seeds, you may use the buttons at the top of the list in order to:
- Run Crawl: Manually start a test or one-time crawl of your selected seeds. You will be prompted to define a document, data, and/or time limit for your new crawl.
- Edit Settings: Make bulk changes for the following settings: crawl frequency, active/inactive crawling status, public access, seed type and login username/password information.
- Add to Group: This feature groups seeds together for browsing on our public site, www.archive-it.org.
- Delete seeds: Remove seeds so that they are no longer listed on the public site or within the web application. Note that deleting a seed does not delete its archived data, which will still be accessible via full-text search.
How to activate and deactivate seeds
A seed is considered "active" when it is scheduled for crawling. If you no longer want a seed or collection to be scheduled for crawling, you can designate it as "inactive."
By default, all seeds are designated active upon being added to your collection. To deactivate a seed in your collection, click on its hyperlinked URL under the Seeds tab and un-check the check box next to the phrase "Activation status."
To deactivate multiple seeds in your collection, select the check box next to each one under the Seeds tab, click the "Edit Settings" button, and toggle the radio button from "Active" to "Inactive."
Click the Save button and your seeds will become inactive, meaning that they will not be crawled on a regularly scheduled basis.
Editing existing seed URLs
Seed URL edits that do not affect Wayback calendar access:
- Changing the URL protocol between http and https
- Removing or adding the subdomain www
- Removing or adding the trailing slash (/) at the end of a URL string
Making any or all of the above edits to your seed URL won't change the access point to the Wayback calendar page. You will continue to see all of the same archives, accessible by date, on its calendar page as you did before. However, adding any other subdomain or a subdirectory, or otherwise changing the URL, would change the calendar page that the seed points to.
If a site's URL changes significantly, we recommend deactivating the existing seed and adding a new one, rather than editing the seed URL directly, so that the access point to the old version is retained.
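To check a planned edit before making it, the three "safe" edits listed above can be sketched as a small comparison routine. This is only an illustration of the rules as described on this page, not the way Archive-It itself resolves calendar pages. The function names `calendar_key` and `same_calendar_page` are hypothetical, and the sketch assumes that each seed URL includes its protocol (http:// or https://).

```python
from urllib.parse import urlsplit


def calendar_key(seed_url: str) -> str:
    """Reduce a seed URL to a comparison key (hypothetical helper).

    The key ignores only the three differences described above:
    the http/https protocol, a leading "www." subdomain, and a
    single trailing slash. Everything else is kept.
    """
    parts = urlsplit(seed_url.strip())

    # Dropping the scheme makes http and https compare as equal.
    host = parts.netloc.lower()

    # Ignore the "www" subdomain, but no other subdomain.
    if host.startswith("www."):
        host = host[len("www."):]

    # Ignore one trailing slash at the end of the path.
    path = parts.path
    if path.endswith("/"):
        path = path[:-1]

    key = host + path
    if parts.query:
        key += "?" + parts.query
    return key


def same_calendar_page(old_url: str, new_url: str) -> bool:
    """Return True if, under the rules above, the edit keeps the same access point."""
    return calendar_key(old_url) == calendar_key(new_url)


if __name__ == "__main__":
    # Protocol, www, and trailing-slash changes only: treated as the same page.
    print(same_calendar_page("http://www.example.org/", "https://example.org"))
    # A different subdomain or an added subdirectory: treated as a different page.
    print(same_calendar_page("http://example.org/", "http://blog.example.org/"))
    print(same_calendar_page("http://example.org/", "http://example.org/news/"))
```

If `same_calendar_page` returns False for a planned edit, the recommendation above applies: deactivate the existing seed and add the new URL as a separate seed.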