Creating and managing a collection involves both the mechanical steps of creating a collection and adding content in Archive-It and the decisions about what the collection should cover, what should go in it, and how frequently it should be collected. Developing a collection's content is up to the individual or institution, and often involves connecting a mission with any relevant policies, such as a records retention policy or a collection development policy, to set web archiving goals.
How this is accomplished and what results it produces vary, but there are resources and examples for guidance: https://communitywebs.archive-it.org/collection-development.html. Our white paper on the Web Archiving Life Cycle also explores collection development in more depth, including how the technical capabilities of crawling content and the realities of balancing a data budget intersect with web archiving goals.
Create a new collection
Click the Create a Collection button on either your account's home page (shown below) or the homepage of the Collections tab, which you can select from the top black navigation bar in the web application. You will be prompted to give your new collection a name.
Once you create a collection, it will have its own dedicated management space in your account:
Add seeds to a new collection via the button on the Overview tab. You can add more seeds to your collection at any time using the Seeds tab.
Seeds are the starting-point URLs where your crawls begin, and they also serve as access points to archived pages. If any of the seeds you add are malformed, you will have the chance to check them for spelling or other errors before the collection is crawled. You will also see a warning if a seed already appears in another collection in your account or has scoping rules automatically applied to it.
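If you prepare a long seed list outside of Archive-It, it can help to catch obvious typos and repeats before pasting the list in. The following is a minimal local sketch in Python, not Archive-It's own validation: the function name check_seeds, the rules for what counts as malformed, and the way duplicates are compared are all assumptions made for illustration.

```python
from urllib.parse import urlsplit


def _with_scheme(url):
    # Assumption: a seed typed without a scheme is meant as http://
    return url if "://" in url else "http://" + url


def _key(url):
    # Simplified comparison key: ignores letter case and a trailing slash.
    # Real URL paths can be case-sensitive, so treat matches as hints only.
    return _with_scheme(url.strip()).lower().rstrip("/")


def check_seeds(candidates, existing=()):
    """Sort candidate seeds into (ok, malformed, duplicate) lists.

    `existing` is a hypothetical list of seeds you already have elsewhere.
    """
    seen = {_key(s) for s in existing}
    ok, malformed, duplicate = [], [], []
    for raw in candidates:
        url = raw.strip()
        if not url:
            continue  # skip blank lines
        if " " in url:
            malformed.append(url)
            continue
        try:
            parts = urlsplit(_with_scheme(url))
            host = parts.hostname or ""
        except ValueError:
            malformed.append(url)
            continue
        # Assumed rule: only http/https seeds whose host contains a dot.
        if parts.scheme not in ("http", "https") or "." not in host:
            malformed.append(url)
            continue
        key = _key(url)
        if key in seen:
            duplicate.append(url)
        else:
            seen.add(key)
            ok.append(url)
    return ok, malformed, duplicate
```

For example, calling check_seeds(["https://example.org/", "htp://example.org"]) places the first URL in the ok list and the second, with its misspelled scheme, in the malformed list. Any seed that passes this pre-check is still subject to the checks and warnings the web application performs when you add it.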
Manage collection settings
Navigate among the tabs across the top of the collection management space in order to manage your collection's settings, scope, and functions:
- Overview - Review summary data about your collection, including documents and data archived. Edit the collection name by clicking the pencil icon beside the name. Use the settings to adjust the visibility (public/private) and regular crawling status (active/inactive) of the collection.
- Seeds - Manage your seeds' settings, types, crawl frequencies, and metadata, either individually or in bulk. Access archived captures of your seeds using the Wayback links.
- Crawls - Review reports for current and completed crawls performed in this collection.
- Collection Scope - Modify your collection's crawl scope in order to refine how our crawler decides what does and does not belong in it.
- Metadata - Add, edit, or upload collection level metadata.
- Wayback QA - Manage documents and data discovered during the Quality Assurance (QA) process for this collection.