What is an HTTP 999 error?

Comments

  • Official comment
    Mary Haberle (Edited)

    Hi Gabrielle,

    A 999 error is a form of user agent block that we’ve noticed linkedin.com seeds return periodically. Sometimes the seed status in the crawl report will read “Crawled (HTTP error 999)” and the crawl will still return a valid capture; other times the seed will have to be re-crawled. Because the issue is only occasional, we haven’t developed a site-specific linkedin.com page for our help center.

    Our best advice when crawling linkedin.com pages is to do the following:

    1. Add a seed-level ignore robots.txt rule.
    2. Add login credentials to your seed’s settings if the content you wish to capture isn’t publicly available on the live web.
    3. Run a test crawl, since you may not get a successful capture on the first try.
    4. Review your crawl results in Wayback, and run a new test crawl if you did not get a successful capture.
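
    As a rough illustration of steps 3 and 4, here is a minimal Python sketch that flags seeds whose crawl report status mentions a 999 error, so that their captures can be checked in Wayback before deciding whether to re-crawl. The function names are hypothetical, and the sketch assumes that status text follows the “Crawled (HTTP error 999)” wording quoted above; it is not part of any crawler tooling.

```python
import re
from typing import Optional

# Assumption: seed statuses embed the code as "HTTP error NNN", as in the
# example "Crawled (HTTP error 999)". Other status formats are not handled.
STATUS_PATTERN = re.compile(r"HTTP error (\d{3})")


def http_error_code(seed_status: str) -> Optional[int]:
    """Return the HTTP error code found in a seed status string, or None."""
    match = STATUS_PATTERN.search(seed_status)
    return int(match.group(1)) if match else None


def needs_capture_review(seed_status: str) -> bool:
    """True when the status reports a 999 error.

    A 999 status does not always mean the capture failed, so the seed
    should be reviewed in Wayback rather than re-crawled automatically.
    """
    return http_error_code(seed_status) == 999
```

    For example, needs_capture_review("Crawled (HTTP error 999)") is true, while a status with no error code is left alone.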

    Thanks for the great question!

    Mary

  • Julia Welby

    Have there been any updates for capturing LinkedIn pages?

  • Karl Blumenthal

    Hi, Julia. Mary's advice above is still current. I'd recommend starting with her steps 1-4, if you have not already. If that fails to archive a specific LinkedIn seed, please feel free to share it with us directly and we can take a closer look with you.
