Public access to PDF-only crawls

Comments


  • Official comment
    Linda at Archive-It

    Hi, Abigail. PDF-only crawls do not collect the HTML of the seed page, but the PDFs collected in saved test or production crawls are indexed and returned in search results. So if the seed from your PDF-only crawl is set to 'visible to public', the seed itself won't have any Wayback captures, but a page-text search for a phrase that appears in the PDF will return that PDF. You can click the result to view it.

    If you want to limit your search results to PDFs only, go to Advanced Search > File format > PDF. In the sample search below, the PDF results were all collected from a one-time PDF-only crawl on July 30, 2024.

