Archiving Bluesky, Threads, and Mastodon
Is archiving Bluesky, Threads, or Mastodon a high priority in your collection development plan? Comment below to let us know!
Here’s what you can currently expect* when crawling Bluesky, Threads, and Mastodon:
Bluesky
✅ Collection
- Most posts, images, and audio/video by profile owners are collected.
- Reposts and comments in feeds are out of scope. Expanding your scope to include https://bsky.app/profile is not recommended because it collects extraneous documents.
- Run your crawl using Brozzler crawling technology.
❌ Replay
- Feeds and posts cannot be replayed in Wayback. Captures resolve to a ‘page not found’ or ‘XRPCNotSupported’ error.
- Fixing replay issues is not currently a priority.
Threads
⚠️ Collection
- Feeds for Threads, Replies, and Reposts that are visible to non-logged-in users are collected.
- Individual posts by the profile owner are collected.
- To collect Replies and Reposts feeds, add them as private helper seeds (e.g. https://www.threads.net/@[profilename]/replies/).
- Run your crawl using Brozzler crawling technology.
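If you manage many Threads profiles, the helper seed URLs described above can be generated rather than typed by hand. The following is only a minimal sketch: the function name `threads_helper_seeds` is hypothetical, the Replies path follows the pattern shown above, and the Reposts path (`/reposts/`) is an assumption modeled on it, so check it in a browser before adding it as a seed.

```python
def threads_helper_seeds(profile_name: str) -> list[str]:
    """Build candidate helper seed URLs for one Threads profile.

    Hypothetical helper, not part of Archive-It. The /replies/ path follows
    the pattern given in this post; the /reposts/ path is an assumption.
    """
    # Accept the name with or without a leading "@" and surrounding spaces.
    handle = profile_name.strip().lstrip("@")
    if not handle:
        raise ValueError("profile name must not be empty")

    base = f"https://www.threads.net/@{handle}"
    return [f"{base}/replies/", f"{base}/reposts/"]


if __name__ == "__main__":
    # "exampleprofile" is a placeholder, not a real account.
    for seed in threads_helper_seeds("@exampleprofile"):
        print(seed)
```

Each URL printed can then be added to your collection as a private helper seed alongside the main profile seed.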
⚠️ Replay
- Feeds for Threads, Replies, and Reposts that are visible to non-logged-in users sometimes replay correctly in Wayback and at other times load only a blank logo page.
- In Threads feeds, clicking a post by the profile owner may fail to open it or may return a “something went wrong” error; refreshing your browser may load the post and a limited set of comments.
Mastodon
⚠️ Collection
- Partial or full Posts, Posts and replies, and Media feeds are collected.
- Some individual posts by profile owners are collected.
- Run your crawl using Brozzler crawling technology.
⚠️ Replay
Update: Newer Mastodon versions show a higher rate of replay failures. Earlier captures may work; later ones frequently fail. Fixing these replay issues is not currently a priority.
- Partial or full Posts, Posts and replies, and Media feeds can be replayed in Wayback (example).
- Some individual posts by profile owners and comments can be replayed.
- Infinite scroll works for some feeds.
- Some media can be replayed via the banner (example).
*Social media platforms continuously evolve. We’ll update this post with any changes.
Hi, Katherine Crowe. We're investigating whether a global Wayback Machine solution can be integrated into Archive-It's custom Wayback software. This work has not yet been prioritized. If preservation is important, we recommend using the Wayback Machine's Save Page Now feature at https://web.archive.org/save.