You may encounter difficulty archiving sites protected by Cloudflare. Cloudflare security products mitigate automated threats to websites, and they can treat archival crawlers as one of those threats. Site owners who use Cloudflare can configure these products to allow Archive-It’s tools to collect their sites.
Detecting a Cloudflare block
To determine whether Cloudflare is blocking your crawls:
- Check your Seeds report. Blocked sites will report a "Crawled (HTTP error 403)" seed status.
- Review your results in Wayback mode for Cloudflare-branded error messages.
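For a quick check outside of Archive-It, the following minimal Python sketch tests whether an HTTP response looks like a 403 served by Cloudflare. It assumes that Cloudflare-served responses carry a "Server: cloudflare" header or a "CF-RAY" header. The function names looks_like_cloudflare_block and check_url are hypothetical helpers for illustration, not part of any Archive-It or Cloudflare tool. Cloudflare evaluates each request individually, so a request from your own machine may not be treated the same way as a request from the crawler. Treat the result as a hint, not a diagnosis.

```python
import urllib.error
import urllib.request


def looks_like_cloudflare_block(status, headers):
    """Return True if a response looks like a 403 served by Cloudflare.

    Assumption: Cloudflare-served responses include a "Server: cloudflare"
    header or a "CF-RAY" header. Header names are matched case-insensitively.
    """
    normalized = {name.lower(): value for name, value in headers.items()}
    served_by_cloudflare = (
        normalized.get("server", "").lower() == "cloudflare"
        or "cf-ray" in normalized
    )
    return status == 403 and served_by_cloudflare


def check_url(url, timeout=10.0):
    """Send a HEAD request to url and classify the response (hypothetical helper)."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return looks_like_cloudflare_block(
                response.status, dict(response.headers.items())
            )
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 4xx/5xx responses; the status code and
        # headers are still available on the exception object.
        return looks_like_cloudflare_block(err.code, dict(err.headers.items()))
```

Calling check_url with a seed URL returns True when the response is a 403 that appears to come from Cloudflare. A False result does not rule out a block during an actual crawl.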
Configuring Cloudflare to allow archiving
Cloudflare users must configure a custom rule to allow Archive-It's collecting tools:
- Instructions from Cloudflare: Configure a custom rule with the Skip action
- More about Cloudflare's firewall tools: Cloudflare Web Application Firewall
Contact Archive-It support for the latest recommendations to configure your rules with Cloudflare.
What to expect from archived Cloudflare sites
If Cloudflare blocked a crawl, you may see “403 (Forbidden)” or Cloudflare-branded error messages in replay. Contact Archive-It support to remove end users' access to these error pages.