Archiving Instagram feeds

Overview

Instagram is a photo and video-sharing application and social networking service. This guide provides an overview of how to properly format, scope, and crawl Instagram seeds.

Known issues

Social media platforms like Instagram can be challenging to archive. Currently, Instagram has the following issues that we continue to actively monitor:

⚠️ Wayback replay of recent Instagram pages resolves to a blank logo page. For captures prior to ~May 2025:
- Instagram is blocking collection and Wayback replay beyond the 12-post default load page for most organizational and personal profile pages. Right-click to open individual posts in a new tab.
- If a page appears blank, wait 10-20 seconds for the page to load. Click outside any blank prompts to dismiss them. To play back media, use the Wayback banner's media link.
⚠️ If you are redirected to a 429 error or login page, content has not been collected and cannot be replayed. Try running a new crawl.

For a full list of known issues for archiving various platforms, see Status of monitored platforms.

On this page:

How to select and format your Instagram seeds
Scoping Instagram seeds
Running your crawl
What to expect from your archived Instagram seeds

How to select and format your Instagram seeds

Be specific. Always include a specific user, followed by a / at the end. For example: https://www.instagram.com/internetarchive/
Use the Standard seed type for Instagram seeds.
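
The formatting rule above can be checked before seeds are added. The following is a minimal sketch in Python, not part of any Archive-It tool: the function name normalize_instagram_seed is hypothetical, and it assumes that a seed is a profile URL with exactly one path segment (the user) and that the https://www.instagram.com/ form shown in the example is the preferred one.

```python
from urllib.parse import urlsplit, urlunsplit

# Hosts accepted as Instagram profile hosts (an assumption for this sketch).
INSTAGRAM_HOSTS = {"instagram.com", "www.instagram.com"}


def normalize_instagram_seed(url: str) -> str:
    """Return the seed as https://www.instagram.com/<user>/ or raise ValueError.

    Hypothetical helper: it only checks the URL's shape and does not
    contact Instagram or verify that the user exists.
    """
    parts = urlsplit(url.strip())
    if parts.scheme not in ("http", "https"):
        raise ValueError(f"not an http(s) URL: {url!r}")
    if parts.hostname not in INSTAGRAM_HOSTS:
        raise ValueError(f"not an Instagram URL: {url!r}")

    # Drop empty segments so "/user" and "/user/" are treated the same.
    segments = [s for s in parts.path.split("/") if s]
    if len(segments) != 1:
        raise ValueError(f"seed must name exactly one user: {url!r}")

    # Rebuild the URL with the trailing slash; query and fragment are dropped.
    return urlunsplit(("https", "www.instagram.com", f"/{segments[0]}/", "", ""))


if __name__ == "__main__":
    print(normalize_instagram_seed("https://www.instagram.com/internetarchive"))
```

A seed without a user, such as https://www.instagram.com/, raises ValueError in this sketch, which mirrors the "be specific" guidance.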

Scoping Instagram seeds

Default scoping for Instagram seeds

New Instagram seeds will have the default scoping rules automatically applied at the seed level when they are added to a collection. To learn more, including how you can add default scoping rules to existing seeds, visit Sites with automated scoping rules.

At the seed level, add an Ignore robots.txt scoping rule. Note: Ignore robots.txt is automatically applied to all new Instagram seeds.

-OR-
At the collection level, add a scoping rule to ignore robots.txt for the hosts www.instagram.com and fbcdn.net.

Running your crawl

Once you have finished selecting your seeds and adding recommended scoping rules, we highly recommend that you crawl your seeds using Brozzler.

What to expect from your archived Instagram seeds

Wayback replay of recent Instagram pages resolves to a blank logo page. For captures prior to ~May 2025:

When collected, Wayback captures replay the default load (up to 12 posts) for most organizational and personal profile pages. Right-click to open individual posts in a new tab.
If a page appears blank, wait 10-20 seconds for the page to load. Click outside any blank prompts to dismiss them.

To play back media, use the Wayback banner's media link.

For posts, reels, and tagged feeds on organizational and personal profile pages, if you are redirected to a 429 error or login page, content has not been collected and cannot be replayed. Try running a new crawl.
