Issues capturing YouTube user's videos
Hello,
We've been trying to capture various institutional YouTube channels since March or so, following the instructions for YouTube crawls here. That said, we keep running into the same issue: the crawl seems to capture all the content (indicated by a high data count and by testing that the captured videos are as expected), but when we click on a video from the channel page, we get a "Not in Archive" error. The URL is always formatted like: https://accounts.google.com/watch?v=vAMYVdjVp4M, and everything after "watch?" matches the original YouTube URL. (Very occasionally, it doesn't result in a "Not in Archive" error, but takes us to a totally blank page instead.)
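To illustrate the URL pattern described above, here is a minimal Python sketch that maps one of the broken links back to the watch URL it appears to stand in for. The function name `restore_youtube_watch_url` is hypothetical, and the sketch assumes the original host was www.youtube.com; it only rewrites the link for checking by hand and is not a description of how Wayback replay works.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: the original videos live on this host.
YOUTUBE_HOST = "www.youtube.com"


def restore_youtube_watch_url(url: str) -> str:
    """Swap accounts.google.com back to the assumed YouTube host.

    Only URLs of the form https://accounts.google.com/watch?... are
    rewritten; anything else is returned unchanged.
    """
    parts = urlsplit(url)
    if parts.netloc.lower() == "accounts.google.com" and parts.path == "/watch":
        # Keep scheme, path, query, and fragment; replace only the host.
        return urlunsplit(
            (parts.scheme, YOUTUBE_HOST, parts.path, parts.query, parts.fragment)
        )
    return url


if __name__ == "__main__":
    print(restore_youtube_watch_url("https://accounts.google.com/watch?v=vAMYVdjVp4M"))
```

The rewritten URL can then be looked up in Wayback directly to confirm whether the watch page itself was captured.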
We tried scoping in the accounts.google.com page, and while this resulted in a sort of wire-frame version of the YouTube watch page being captured, the video is still not being played back. We also have the seed formatted automatically as a YouTube video, so it should have the recommended settings already.
Does anyone know what might be happening here? Our hospital published a lot of COVID-19 information on YouTube, so this is a critical piece of our collection this year.
Thanks!
Stefana Breitwieser
Arthur Aufses, Jr., MD Archives
Icahn School of Medicine at Mount Sinai
-
Hi, Stefana. Apologies for the inconvenience here, but the good news is that you are doing everything correctly to collect and preserve the intended videos in the meantime. We are working actively here on our side to reconfigure the Wayback replay software to link to the proper and archived URL upon browsing, like this one for your example above, and to restore the "watch" page contents that accompany each video. Those videos are still accessible by way of the lightbox player linked in the Wayback banner message, so please do let us know if you see anything to the contrary there.
I'm updating Archive-It's system status page to track progress on this front, but I will report back here directly when we have an update or better yet a fix to review.
-
Curiouser and curiouser! :-) I got my link above from what I believe to be your latest test crawl, so if you have a specific example of another "watch page" where you do not see the lightbox video, then please submit a ticket to us and we'll look into it directly.
-
We have been having the same issues with YouTube: the link changes from the Wayback page twice before showing the "accounts.google" URL Stefana mentioned. The videos do not load correctly, and in general we are seeing the same issues mentioned above.
Tatiana M. Swann
Archives Technician
Smithsonian Institution Archives