The Internet Archive's General Archive is a complimentary resource for the Archive-It community. Archive-It is a paid subscription service offered by the Internet Archive.
Archive-It enables our partner organizations to curate, scope, and manage their own focused or topical collections. Partners control how deeply and how often a site is crawled. They can also exclude content from crawls, override robots.txt exclusions, and catalog with metadata at the collection, seed, and document level, among other options. Each archived web page in Archive-It is attributed to a specific collection and to the organization that captured it.
Full-text search (basic and advanced) is available for Archive-It collections. There are currently no plans to provide it for the General Archive.
The General Archive's crawls do not use Umbra, so many social media sites (such as Flickr, Twitter, Instagram, Vimeo, and Facebook) are not captured.
Archive-It provides technical support throughout the process to help our partners with scoping and other issues.
Archive-It partners may retrieve a backup copy of their data at any time. This option is not available for content collected as part of the General Archive.
By default, content that is captured through the Archive-It service and public on the Archive-It website also appears in the General Archive within 24 hours. However, all trial/training content and any content restricted specifically by our partners remains inaccessible in the General Archive.