YouTube not capturing video content

Comments

  • Nicole Greenhouse

    Did the videos capture in the crawl report? You can check that in the file types tab.

  • Sarah Newhouse (Edited)

    Aha, thank you. Ok, so no, it looks like we didn't get any video/mp4 files -- but I don't know why or what I should do differently with the next crawl. 

    For what it's worth, I also recently ran a test crawl of a YouTube seed that we've been crawling for years, with the same rules and settings as this crawl, and that one captured all the new-since-the-last-crawl videos successfully. 

  • Sarah Weeks

    Hi Sarah - I do think your crawl parameters are correct. Brozzler and the seed scope rules that get added automatically are both essential. The other seed that you've been crawling for years has had years of crawls to capture all of its videos. Are the two seeds in question trying to capture all of their videos for the first time? If so, I think the crawl might just need more time. You can give it 3 days or even a week.
    I once had a YouTube channel that couldn't be captured because it was incompatible with youtube-dl. In the reports, youtube-dl: (the colon is part of the name) shows up as a host. Your seeds both have youtube-dl: as a host, which is essential, and good. You might comb through the Archive-It YouTube help article - I do that OFTEN to catch things I've forgotten!
    My computer/internet is slow right now so I haven't been able to load your full reports, but comment here again if you try a longer crawl and it doesn't work, and I am happy to look further. I've QA'd a lot of YouTube crawls! :)

  • Sarah Newhouse

    Thanks! I set it to run for a full day and it ended 6 hours after it started, so I'm not sure that's the issue. Also, the report just said "Finished" instead of "Finished: Time Limit."

    Does that still sound like it just needs more time? I'm happy to just run another test crawl, but I'm not totally convinced that's the issue given that the first one didn't run for a full 24 hours.

  • Karl Blumenthal

    Thanks for flagging this, Sarah! You did indeed configure your crawls correctly. It looks like our A/V collecting utility needs an update to help it collect those missing video files again. I will update this thread again once we are reliably seeing more complete results from YouTube. You can track our progress to that end on our status page for social media and other platforms.

  • Amy Blau

    I've been having the same problem collecting any YouTube videos whatsoever in the last month or so, which I assume is due to the same issue. Thanks for the question and the answer!

  • Janice Banser

    I just found this thread after trying to figure out why YouTube videos weren't working as expected. I am glad it is being investigated and hope it is resolved soon. Is there an update or expected timeline for a fix?

  • Karl Blumenthal

    Thank you all for your patience on this one -- I think we're back in business! Tests since our latest upgrades are archiving these seeds and videos successfully again, so I recommend re-crawling any that gave you trouble in the last month or so, with the usual recommended scoping and the Brozzler option enabled.

    Sarah, I also recommend changing your seed URL slightly so that it appears exactly like this (note that there is no trailing slash at the end): https://www.youtube.com/@lsfoundation/videos

    And please let us know here or directly if you encounter any further obstacles from YouTube.

  • Sarah Newhouse

    Hi Karl -- I'm still having the same (or at least very similar) issues on new crawls that I ran on February 5th with the same scoping and limitations, plus your suggestion about removing the end slash from the URLs. Still no mp4s showing up in the file type list.

    https://wayback.archive-it.org/10030-test/20240205171434/https://www.youtube.com/@lsfoundation/videos

    https://wayback.archive-it.org/10030-test/20240205171128/https://www.youtube.com/@lsfoundation

    Open to any and all suggestions!
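
For anyone comparing seeds and test captures like the ones in this thread, here is a rough Python sketch of the two URL details that came up: dropping the trailing slash from a seed, and reading the collection and capture time out of a capture link. The helper names `normalize_seed` and `parse_capture_url` are made up for illustration and are not part of any Archive-It tool. The sketch also assumes that capture links follow the host/collection/14-digit-timestamp/original-URL pattern seen in the two links above.

```python
from datetime import datetime
from urllib.parse import urlsplit, urlunsplit


def normalize_seed(url: str) -> str:
    """Return the seed URL with any trailing slash removed from its path.

    Hypothetical helper: it only mirrors the "no trailing slash" advice
    given in this thread.
    """
    parts = urlsplit(url.strip())
    return urlunsplit(parts._replace(path=parts.path.rstrip("/")))


def parse_capture_url(url: str) -> tuple[str, datetime, str]:
    """Split a capture link into (collection, capture time, original URL).

    Assumes the layout seen in this thread:
    https://<host>/<collection>/<yyyymmddhhmmss>/<original URL>
    """
    # Splitting at most 5 times leaves the original URL intact as the last item.
    _scheme, _empty, _host, collection, timestamp, original = url.strip().split("/", 5)
    captured = datetime.strptime(timestamp, "%Y%m%d%H%M%S")
    return collection, captured, original


if __name__ == "__main__":
    print(normalize_seed("https://www.youtube.com/@lsfoundation/videos/"))
    print(parse_capture_url(
        "https://wayback.archive-it.org/10030-test/20240205171434/"
        "https://www.youtube.com/@lsfoundation/videos"
    ))
```

This only tidies and inspects URLs. It says nothing about whether a crawl actually captured video files, which still has to be checked in the crawl report's file types tab.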
