Help archiving Instagram posts and comments

Comments

9 comments

  • Skip Kendall

    It appears that Instagram recently changed their code, which caused this problem. For months our Instagram crawls were great; then, a month or so ago, we started having problems. I know that Archive-It is aware of the problem and is working on it, but it isn't resolved yet (as far as I know).

  • Aimee Everrett

    Has anyone gotten a successful Instagram crawl recently?

  • Skip Kendall

    Every once in a while one works, but generally they're not working. I have a ticket in on it, and it appears that we're generally capturing the content but can't currently replay it properly. My check of the crawl report backs that up, as I'm seeing plenty of images being captured. They are working on the problem, but I haven't heard a time estimate for a fix.

  • Karl-Rainer Blumenthal

    Instagram captures should all replay as expected again with our latest tweak to Wayback this week (the fix should apply retroactively). Let us know here or get in touch directly with your example seeds if you encounter anything to the contrary. And thanks as always for your patience!

  • Ana Rogers

    Hi,

    I seem to be having the same problem discussed here: no images are loading on my Instagram seed. Any advice? Thanks!

    https://wayback.archive-it.org/14021-test/20200503155008/https://www.instagram.com/fucknomtl/?hl=en

  • Julianna Barrera-Gomez

    I'm having a similar issue. My seed (crawled with Brozzler; new seeds with all the recommended scoping already set up) loads the Instagram feed and images, but it won't load any of the posts. If I open a post in a new tab, I get the Not In Archive message. The QA View Missing Documents dashboard for the seed then lists the URL of each uncrawled post, but only after I've tried to view it. Even when I run a patch crawl on a few of those missing URLs, I still get nothing (or the URL loads only an Insta camera icon).

    Seed: https://wayback.archive-it.org/1924/20200515214923/https://www.instagram.com/pharmtable/

    Latest post patch crawled: https://wayback.archive-it.org/1924/20200616182446/https://www.instagram.com/p/CAN1PjMHeK_/

    Does anyone have tips to share? And please, someone be up front: am I just not understanding how archiving Instagram works? Should I not expect individual posts and comments to be captured? Is Brozzler only supposed to record images from the feed? Thanks!!

  • Skip Kendall

    This is a new problem. They got it working briefly and then Instagram started requiring login to view the posts and comments. That's blocking the crawler right now. I put in a ticket last month and asked about entering username/password in seeds. Archive-It advised against that because of security tech on the Instagram side that might flag an account if it's being used from different locations (like Massachusetts and California). I was told that they are aware of the problem and are working on a solution.

  • Sylvie Rollason-Cass

    Hi everyone, thanks for letting us know what you're seeing! A new issue affecting capture and replay of Instagram popped up after our April update and has since been fixed. When convenient, please take another look at your captures to check whether they now replay as expected. If you notice any Instagram captures made between 5/10/2020 - 6/13/2020 that still replay blank or load only the Instagram logo, please re-run your crawl. Remember to use Brozzler when capturing Instagram, and make sure your seed URL is formatted without any language parameters at the end (for example, no trailing ?hl=en).
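
    For anyone checking a long list of seeds and captures by hand, the two checks above can be sketched in Python. This is only an illustration, not an Archive-It tool: the function names are made up, `hl` is assumed to be the only language parameter (it is the one that appears in this thread), the dates are read as month/day/year with both endpoints included, and the capture timestamp is assumed to be the 14-digit path segment seen in the replay URLs posted here.

```python
import re
from datetime import date, datetime
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: "hl" is the language parameter to drop (as in "?hl=en").
LANGUAGE_PARAMS = {"hl"}

# Assumption: 5/10/2020 - 6/13/2020 is month/day/year, endpoints included.
WINDOW_START = date(2020, 5, 10)
WINDOW_END = date(2020, 6, 13)

# Replay URLs in this thread carry a 14-digit timestamp (YYYYMMDDhhmmss)
# as its own path segment.
TIMESTAMP_RE = re.compile(r"/(\d{14})/")


def strip_language_params(seed_url):
    """Return the seed URL with any language query parameters removed."""
    parts = urlsplit(seed_url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in LANGUAGE_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


def capture_date(replay_url):
    """Extract the capture date from a replay URL's 14-digit timestamp."""
    match = TIMESTAMP_RE.search(replay_url)
    if match is None:
        raise ValueError("no 14-digit timestamp found in " + replay_url)
    return datetime.strptime(match.group(1), "%Y%m%d%H%M%S").date()


def capture_in_affected_window(replay_url):
    """True if the capture falls in the window where a re-crawl is suggested."""
    return WINDOW_START <= capture_date(replay_url) <= WINDOW_END


print(strip_language_params("https://www.instagram.com/pharmtable/?hl=en"))
print(capture_in_affected_window(
    "https://wayback.archive-it.org/1924/20200515214923/"
    "https://www.instagram.com/pharmtable/"
))
```

    A capture inside the window is only a candidate for re-crawling; it still needs to be checked in replay to see whether it is blank or shows only the logo.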

    Skip is correct: Instagram has started requiring users to be logged in to view detail images/posts. We are working on a solution, but for the time being detail images are not being captured.

  • Julianna Barrera-Gomez

    Thanks for the explanations!

    Sylvie Rollason-Cass, is Skip's comment that we should not attempt to crawl Instagram using login credentials still AIT's recommendation?

