Overview
This page provides an overview of your crawl's file types report, including where to find it and a detailed description of what's inside.
Where to find your file types report
Each crawl report includes a "File Types" tab, located to the right of the Hosts tab, which contains data specific to each type of file archived during the crawl.
What's inside the report
The file types report organizes all crawled URLs by file type and provides access to them. Graphics at the top of the report show the most common file types encountered during your crawl and the number of documents and volume of data that each represents. The table beneath these graphics lists every file type crawled, along with its document count and data volume.
Files are distinguished specifically by MIME type, meaning that you may use the search bar on this report to find them by either their general type (image, application, video, etc.) or their file extension (PDF, HTML, MP4, etc.). You can read more about MIME types in our glossary.
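To illustrate why both kinds of search term can find the same file, here is a minimal Python sketch of how a MIME type breaks down into a general type and a more specific subtype. It is an illustration only, not the report's actual search logic: the sample rows, the counts, and the function names (split_mime, matches, search, docs_by_top_level) are all hypothetical.

```python
from collections import Counter

# Hypothetical sample rows: (MIME type, document count). Not real crawl data.
SAMPLE_ROWS = [
    ("text/html", 120),
    ("image/jpeg", 45),
    ("image/png", 30),
    ("application/pdf", 12),
    ("video/mp4", 3),
]


def split_mime(mime_type):
    """Split 'type/subtype' into its two parts, lowercased."""
    top_level, _, subtype = mime_type.lower().partition("/")
    return top_level, subtype


def matches(mime_type, term):
    """Return True if the term appears in either part of the MIME type.

    Assumption: a simple case-insensitive substring match, chosen only
    for illustration.
    """
    term = term.lower()
    top_level, subtype = split_mime(mime_type)
    return term in top_level or term in subtype


def search(rows, term):
    """Keep only the rows whose MIME type matches the search term."""
    return [row for row in rows if matches(row[0], term)]


def docs_by_top_level(rows):
    """Total the document counts for each general type (image, video, ...)."""
    totals = Counter()
    for mime_type, count in rows:
        totals[split_mime(mime_type)[0]] += count
    return dict(totals)
```

In this sketch, search(SAMPLE_ROWS, "image") keeps both the JPEG and PNG rows, while search(SAMPLE_ROWS, "PDF") keeps only the application/pdf row. Note that a MIME subtype does not always match the file extension exactly (for example, .jpg files are typically image/jpeg), so a real search may behave differently from this simplified match.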
Clicking the hyperlinked MIME type or document count in this table leads to a detailed listing of every URL of that file type within your crawl.
From this list, you may open each URL individually in Wayback mode by clicking either the document URL itself or "View." You may also assign document-level metadata to a URL by clicking "Metadata."
Related content
Reading your crawl's seeds report
Reading your crawl's hosts reports