Typically, the "Status" column in these reports will indicate that your seed was successfully "Crawled." Other possible messages and their meanings include:
- Error 403: The site owner has forbidden access to our crawler (this is different from being blocked by robots.txt). In this case, you must ask the site owner to specifically allow our crawler to access their website. They will recognize the crawler by its user agent name: archive.org_bot.
- Error 404: The seed URL wasn't found. This could be due to a typo in the seed URL or a web server misconfiguration, or the page may simply not exist.
- Blocked (robots.txt): The site owner has blocked our crawler (user agent: archive.org_bot). You may ignore this block if you wish to archive the site nonetheless.
- Redirected: The seed URL has redirected the crawler to a different web address. When this happens, the new address is considered the seed URL and it appears in your seed status report. In the vast majority of cases, this leads to a subsequent and successful "Crawled" status.
- Unknown: There are many numbered derivations of this error generated directly by our crawler. To learn more about these and other, less common, status messages, see Understanding seed status.
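To make the difference between a robots.txt block and a 403 error more concrete, here is a minimal sketch of how you might check a site's robots.txt rules against the archive.org_bot user agent yourself. It uses Python's standard urllib.robotparser module, which is an assumption for illustration only: our crawler's own interpretation of robots.txt may differ in edge cases. The function name is_blocked_by_robots and the sample rules and URLs are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# User agent name of the crawler, as given in the status descriptions above.
USER_AGENT = "archive.org_bot"


def is_blocked_by_robots(robots_txt: str, url: str, user_agent: str = USER_AGENT) -> bool:
    """Return True if the given robots.txt text disallows `url` for `user_agent`.

    Hypothetical helper: it relies on Python's standard-library parser,
    not on the crawler's actual robots.txt logic.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# Hypothetical robots.txt that blocks the crawler from the whole site.
blocking_rules = """\
User-agent: archive.org_bot
Disallow: /
"""

# Hypothetical robots.txt that only restricts one directory for all crawlers.
partial_rules = """\
User-agent: *
Disallow: /private/
"""

print(is_blocked_by_robots(blocking_rules, "https://example.org/page"))
print(is_blocked_by_robots(partial_rules, "https://example.org/public"))
```

A 403 error, by contrast, comes from the web server itself, so it cannot be predicted from the robots.txt file and has to be resolved by the site owner.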