You can download a full list of documents captured in a given crawl's File Types report using the DownloadThemAll! Firefox Plugin.
To install the DownloadThemAll! Plugin, follow the steps here.
Downloading lists of Files
- The File Types report lists 100 documents per page. To download more than 100 files, repeat these instructions once for each page of 100 files.
- It is currently not possible to differentiate between files that have been archived before and those that are brand new. We hope to include this functionality in a future release.
1. Access https://partner.archive-it.org/archiveit/login.html using Firefox.
2. Enter your Archive-It partner username and password and click Login.
3. Click the Crawls link at the top of the screen.
4. Click the Crawl ID # or "View" link associated with a crawl that contains captured documents.
5. Click the "File Types" tab of the report, then click the link to all files of the type you want to download.
6. Open DownloadThemAll! by clicking on Tools->DownloadThemAll! Tools->DownloadThemAll!...
7. The main DownloadThemAll! window will appear. The window contains a list of all Web resources from the report page that are available for download via DownloadThemAll! These include the files associated with the chosen crawl.
8. To select the archived files using Fast Filtering, click on Fast Filtering at the bottom of the window.
9. Enter https://wayback.archive-it.org to limit your download to archived documents only. If you are downloading PDF files, the following string will work as well: /^
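
The effect of the Fast Filtering text in step 9, and the "once per 100 files" note above, can be illustrated with a minimal Python sketch. This is not how DownloadThemAll! is implemented. It assumes the filter text is matched anywhere in each link, and the function names and sample links are hypothetical, made up for illustration only.

```python
import math

# Filter text from step 9 and page size from the notes above.
WAYBACK_FILTER = "https://wayback.archive-it.org"
PAGE_SIZE = 100


def filter_archived(links, text=WAYBACK_FILTER):
    """Keep only links containing the filter text (assumed substring match)."""
    return [link for link in links if text in link]


def passes_needed(total_files, page_size=PAGE_SIZE):
    """Number of times the steps must be repeated for a given file count."""
    return math.ceil(total_files / page_size)


# Hypothetical links such as might appear on one report page.
links = [
    "https://wayback.archive-it.org/1234/20200101000000/http://example.org/report.pdf",
    "https://partner.archive-it.org/archiveit/login.html",
    "https://wayback.archive-it.org/1234/20200101000000/http://example.org/notes.doc",
]

archived = filter_archived(links)
for link in archived:
    print(link)

# A file type with 250 captured documents spans three report pages.
print(passes_needed(250))
```

Only the two wayback.archive-it.org links survive the filter; the partner login link is dropped, which is the same narrowing the Fast Filtering box is meant to perform in the DownloadThemAll! window.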