"Site requires cookies to be enabled" message
I received a request to web-archive some pages that are hosted by an outside vendor. My organization will soon take over these pages, but it sounds like they are being redone and will be deleted from the vendor's site. I was sent about 10 sites.
A few example links are:
When you actually click on those links, they become much longer, for example: https://getinfo.cps.gwu.edu/cyber-security-nova/WG02.html?SessionGuid=ec7a637d-fb0f-4538-b7b5-50241f7d151f
I set up the crawls to capture the sites, and at first the robots.txt file stopped the crawls. I got that figured out with a scope rule and ran the crawls again. Now, when I click to view the completed crawls, I get the message: "This site requires cookies to be enabled in your browser. Please enable and try again."
Is there anything to be done about this? Do I have to run the crawls again, or has the data actually been captured and I just need to change some computer settings to view it? I have never received this message before.
Thank you,
Brigette
-
Hi Brigette!
Have you tried collecting these seeds with Brozzler yet? I gave one of them a quick one-page test today and it appeared to capture the site without all that extra session-specific redirection and confusion: https://wayback.archive-it.org/7313/20210706162531/https://getinfo.cps.gwu.edu/ps-a

Let us know if that doesn't do the trick for you too though, and we'd be happy to take a closer look.
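
In case it helps when checking a capture against your seed list, here is a minimal sketch (not an Archive-It feature, just a local helper) that strips the session-specific query parameter from a redirected URL so it can be compared with the shorter link you were originally sent. The function name `strip_session_param` is made up for illustration, and it assumes the only session-specific part is the `SessionGuid` parameter seen in the example above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_session_param(url, param="SessionGuid"):
    """Return url with the given query parameter removed (case-insensitive).

    Hypothetical helper: assumes the session token lives only in the
    query string, as in the SessionGuid example from the question.
    """
    parts = urlsplit(url)
    # Keep every query pair except the session parameter.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() != param.lower()
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


if __name__ == "__main__":
    redirected = (
        "https://getinfo.cps.gwu.edu/cyber-security-nova/WG02.html"
        "?SessionGuid=ec7a637d-fb0f-4538-b7b5-50241f7d151f"
    )
    print(strip_session_param(redirected))
```

Any other query parameters are left in place, so two redirected URLs that differ only in their session token should reduce to the same string.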