Social media platforms update frequently. For current information on any known issues archiving Instagram content, please see our System Status page.
New Instagram seeds will have the default scoping rules automatically applied at the seed level when they are added to a collection. To learn more, including how you can add default scoping rules to existing seeds, please visit Sites with automated scoping rules.
To Successfully Crawl Instagram Seeds:
- Be specific. Always use the URL of a specific user's profile, ending with a trailing slash (/). For example: https://www.instagram.com/internetarchive/
- Use the Standard seed type for Instagram seeds
- Ignore robots.txt at the seed level, or add a collection-level scoping rule to ignore robots.txt for the hosts www.instagram.com and fbcdn.net. The Ignore robots.txt rule is added automatically to all new Instagram seeds.
- Crawl with Brozzler
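The seed URL format described above can be checked before seeds are added. The following is a minimal sketch in Python, not part of Archive-It itself: the function name normalize_instagram_seed is hypothetical, and normalizing the host to www.instagram.com is an assumption based on the example URL above. It only checks the shape of the URL (one username segment plus a trailing slash); it does not confirm that the account exists.

```python
from urllib.parse import urlparse


def normalize_instagram_seed(url: str) -> str:
    """Return a seed URL shaped like https://www.instagram.com/<user>/.

    Hypothetical helper for illustration only. Raises ValueError when the
    URL is not on an Instagram host or does not name exactly one user.
    """
    parsed = urlparse(url.strip())
    host = (parsed.hostname or "").lower()
    if host not in ("instagram.com", "www.instagram.com"):
        raise ValueError(f"not an Instagram URL: {url!r}")

    # Keep only non-empty path segments; a profile seed has exactly one,
    # the username. Zero segments is the bare home page, and two or more
    # is something else (a single post, for example).
    segments = [s for s in parsed.path.split("/") if s]
    if len(segments) != 1:
        raise ValueError(f"seed must name one specific user: {url!r}")

    # Assumption: use the www host and always end with a trailing slash.
    return f"https://www.instagram.com/{segments[0]}/"


if __name__ == "__main__":
    print(normalize_instagram_seed("https://www.instagram.com/internetarchive"))
```

A URL with no username, such as https://www.instagram.com/, raises ValueError rather than being accepted as a seed.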
What to expect from your archived Instagram seeds:
- Captures made by standard Archive-It crawling technology replay only the default load of an Instagram feed (up to 12 images). Use Brozzler to create captures that scroll beyond the default load in Wayback replay.