A website is often composed of elements served from many different locations. Archive-It's crawlers collect all of a page's embedded elements (images, video players, stylesheets, analytics scripts, etc.), even when an element's host domain differs from the seed's, so these additional hosts appear in your crawl's Hosts report. If any of the listed hosts look particularly unusual, feel free to contact us, and we will investigate whether they present any problems.
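To see why a single seed can bring in many hosts, the following is a minimal Python sketch that lists the hosts of the elements a page embeds. It is an illustration only and does not reflect how Archive-It's crawlers are implemented. The class `EmbeddedHostCollector`, the function `other_hosts`, the set of tags checked, and the `example.*` URLs are all assumptions made for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class EmbeddedHostCollector(HTMLParser):
    """Hypothetical helper: records the hosts of resources a page embeds."""

    # Tags whose src attribute points at an embedded resource (assumed list).
    SRC_TAGS = {"img", "script", "iframe", "video", "audio", "source", "embed"}

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.hosts = set()

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        url = None
        if tag in self.SRC_TAGS:
            url = attrs.get("src")
        elif tag == "link" and "stylesheet" in (attrs.get("rel") or "").lower():
            url = attrs.get("href")
        if url:
            # Resolve relative URLs against the page before reading the host.
            host = urlparse(urljoin(self.page_url, url)).hostname
            if host:
                self.hosts.add(host)


def other_hosts(page_url, html):
    """Return embedded-resource hosts that differ from the page's own host."""
    collector = EmbeddedHostCollector(page_url)
    collector.feed(html)
    return sorted(collector.hosts - {urlparse(page_url).hostname})


# Made-up page markup: two embedded elements live on other hosts.
sample = """
<link rel="stylesheet" href="https://cdn.example.net/site.css">
<img src="/images/logo.png">
<script src="https://analytics.example.org/track.js"></script>
<a href="https://elsewhere.example.com/">a plain link, not an embedded element</a>
"""

print(other_hosts("https://www.example.com/", sample))
# → ['analytics.example.org', 'cdn.example.net']
```

In this sketch the stylesheet and the analytics script come from hosts other than the page's own, so both are reported, while the plain hyperlink is ignored because it is not an embedded element.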
Articles in this section
- What are these screenshot:, thumbnail:, and youtube-dl: hosts in my crawl report?
- Why doesn’t my Flash content work?
- Can I run Wayback QA or a patch crawl on a test capture?
- How can I block individual hosts within a domain from archiving?
- What are all these other hosts listed in my crawl's Hosts report?
- What is the difference between a seed and a host?
- Why does my crawl report tell me that URLs were blocked?
- What is the difference between all and new documents/data?
- What do all the messages in the Status column of my Seeds report mean?
- Why didn't some pages get archived?