To Successfully Crawl Instagram Seeds:
- Be specific. Always point the seed at a specific user's page and end the URL with a trailing slash (/). For example: https://www.instagram.com/internetarchive/
- Add a collection-level scoping rule to ignore robots.txt for the hosts www.instagram.com and fbcdn.net -OR- ignore robots.txt at the seed level.
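
As a rough illustration of the seed format described above, the following Python sketch checks a candidate seed URL and rewrites it into the recommended form. It is not part of any crawler or web application; the function name normalize_instagram_seed is hypothetical, and treating "exactly one path segment" as "a specific user" is a simplifying assumption.

```python
from urllib.parse import urlsplit


def normalize_instagram_seed(url: str) -> str:
    """Return the seed as https://www.instagram.com/<user>/ or raise ValueError.

    Hypothetical helper: assumes a user page is a URL whose path has
    exactly one non-empty segment (the username).
    """
    parts = urlsplit(url.strip())
    host = (parts.hostname or "").lower()
    if host not in ("instagram.com", "www.instagram.com"):
        raise ValueError(f"not an Instagram URL: {url!r}")

    # Drop empty segments so a missing or doubled slash does not matter.
    segments = [s for s in parts.path.split("/") if s]
    if len(segments) != 1:
        raise ValueError(f"seed must name exactly one user: {url!r}")

    # Rebuild the URL with the www host and the trailing slash.
    return f"https://www.instagram.com/{segments[0]}/"


if __name__ == "__main__":
    print(normalize_instagram_seed("https://www.instagram.com/internetarchive"))
```

A seed without a username, such as https://www.instagram.com/, is rejected by this sketch rather than silently accepted.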
Your archived Instagram pages should play back accurately in Wayback with the following exceptions:
- At present, we can replay only the first two scrolls of the dynamically loading content on Instagram pages.
- You may not be able to enlarge images or see their comments and likes until you have detected and patch crawled them through Wayback QA.