Web archives are organized into collections. This article describes how to create a new web archive collection within your Archive-It account and add seeds to it, and introduces the collection-level settings that you can manage for any collection.
Create a new collection
Click the "Create a Collection" button, either on your account's home page or under the Collections tab in the black navigation bar at the top.
You will be prompted to give your new collection a name. Enter a name and click the "Create" button, and you will be taken to a dedicated collection management space in your account.
A seed is both a starting point for Archive-It's capture technology and an access point to archived pages for end users. Each seed in an Archive-It collection has a URL that tells the capture technology where to find the desired content on the live web, and that shows end users where to find the same content in the web archive collection.
Add your first seeds to a new collection via the button under the default Overview tab. You can add additional seeds to your collection at any time thereafter under the collection's Seeds tab.
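Because each seed is a URL, it can help to tidy a list of candidate URLs before adding them. The following is a minimal sketch that runs locally and does not call Archive-It. The function name `clean_seed_list`, the sample URLs, and the rule of keeping only absolute `http`/`https` URLs are assumptions made for illustration, not documented Archive-It requirements.

```python
from urllib.parse import urlsplit


def clean_seed_list(raw_lines):
    """Return unique, absolute http(s) URLs in their original order."""
    seen = set()
    seeds = []
    for line in raw_lines:
        url = line.strip()
        if not url:
            continue  # skip blank lines
        parts = urlsplit(url)
        # This sketch's own rule: keep only URLs with an http(s) scheme and a host.
        if parts.scheme not in ("http", "https") or not parts.netloc:
            continue
        if url not in seen:
            seen.add(url)
            seeds.append(url)
    return seeds


# Hypothetical candidate seeds, e.g. copied from a spreadsheet.
candidates = [
    "https://example.org/",
    "  https://example.org/news/  ",
    "",
    "example.org/about",     # no scheme, so this sketch drops it
    "https://example.org/",  # exact duplicate
]

for seed in clean_seed_list(candidates):
    print(seed)
```

The cleaned list can then be pasted into the seed form. The check is deliberately strict about schemes; relax it if your own list omits them.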
Manage collection settings
Navigate among the tabs across the top of the collection management space in order to manage your collection's settings, scope, and functions:
Overview - Review summary data about your collection, including documents and data archived. Edit the collection name by clicking the pencil icon beside the name. Use the settings to adjust the visibility (public/private) and regular crawling status (active/inactive) of the collection.
Seeds - Manage your seeds' settings, types, crawl frequencies, and metadata, either individually or in bulk. Access archived captures of your seeds using the Wayback links.
Crawls - Review reports for current and completed crawls performed in this collection.
Collection Scope - Modify your collection's crawl scope in order to refine how our crawler decides what does and does not belong in it.
Metadata - Add, edit, or upload collection-level metadata.
Wayback QA - Manage documents and data discovered during the Quality Assurance (QA) process for this collection.
Once you have created your Archive-It collection and added your first seeds, you may run crawls, automate a crawl schedule, add descriptive metadata, and/or manage public access permissions for end users of the collection.
Creating and managing a collection includes both the mechanical steps of creating a collection and adding content in Archive-It and the decision-making process: what the collection should be about, what should go in it, how frequently it should be collected, and how it fits into the overall data budget. Developing a collection's content is up to the individual or institution, and often involves connecting a mission with any relevant policies, such as records retention or collection development policies, to set web archiving goals.
How this is accomplished, and what results, varies, but resources and examples are available for guidance: https://communitywebs.archive-it.org/collection-development.html. Our white paper on the Web Archiving Life Cycle also delves into collection development, exploring how the technical capabilities of crawling and the realities of balancing a data budget intersect with web archiving goals.
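When weighing crawl frequency against a data budget, a rough back-of-the-envelope estimate can be useful. The sketch below multiplies an estimated size per crawl by the number of crawls per year for each seed and sums the results. Every figure in it, including the budget, is hypothetical, and the calculation is a simplification that ignores factors such as deduplication and site growth.

```python
def yearly_data_gb(seeds):
    """Sum estimated yearly data: GB per crawl times crawls per year, per seed."""
    return sum(gb_per_crawl * crawls_per_year
               for _url, gb_per_crawl, crawls_per_year in seeds)


# (seed URL, estimated GB per crawl, crawls per year); all values hypothetical.
planned = [
    ("https://example.org/", 2.0, 12),         # monthly
    ("https://example.org/news/", 0.5, 52),    # weekly
    ("https://example.org/reports/", 4.0, 1),  # once a year
]

budget_gb = 100.0  # hypothetical annual data budget
estimate = yearly_data_gb(planned)

print(f"Estimated use: {estimate:.1f} GB of {budget_gb:.1f} GB")
print("Within budget" if estimate <= budget_gb else "Over budget")
```

Adjusting the frequency column shows how quickly a frequently crawled seed can dominate the total, which is one way to frame the trade-offs that a collection development policy has to settle.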