Where to find your crawl reports
You can access all of your crawl reports at any time from the Crawls link in the top navigation bar of our web application. The landing page lists the reports for all completed crawls under the "Crawl Reports" tab.
For a detailed walkthrough of your post-crawl reports, check out the videos in our Post Crawl Analysis series.
By default, these reports are listed by Crawl ID, a unique identifier displayed to the left of each crawl. You can sort them by any of the other column headers in the table, and you can filter the listing by any column field using the search bar. You can also add searchable notes to an individual crawl report's Overview page so that you can easily filter down to that report later.
What's inside the report
Crawl Overview
By clicking on the Crawl ID or "View >>" link associated with any crawl in your list, you can access a high-level summary of how that crawl was conducted. The "Crawl Overview" tab of the report shows the crawl's status (finished) and tells you whether it ended because it reached a limit on the number of documents or the amount of data. It provides summary data on how much total content was crawled and how much new data, if any, was added to your collection as a result (to understand why crawled data might not be archived, see our explanation of de-duplication). It also records any rules that were in place for the crawl, such as scope expansions and document limits, which help explain why some new materials may have been archived while others were not.
Seeds, Hosts, and File Types
Additionally, each of your crawl reports includes more information in the form of the following specialized reports on seeds, hosts, and file types:
How to read your crawl's seeds report
How to read your crawl's hosts report
How to read your crawl's file types report