The 2022 Archive-It Virtual Partner Meeting was held on November 2, 2022 [agenda]
Archiving and Data Services Update, Jefferson Bailey and Thomas Padilla, Internet Archive
Archives Unleashed Cohort Program, Samantha Fritz, Archives Unleashed
Using Web Archives to Assess Municipal Responses to the COVID 19 Pandemic, Cal Murgu, Brock University
State Elections Web Archive, Miranda Siler, Ivy Plus Library Confederation, and Ryan Denniston, Duke University Libraries
Web Archiving Workflows with GitLab, Robert Manley, Washington University in St. Louis
Partners Valerie Collins (University of Minnesota), Dan Johnson (University of Iowa), and Dan Noonan (The Ohio State University) discuss web archiving workflows at their institutions with Web Archivist Karl-Rainer Blumenthal.
Resources Shared during the Archive-It Virtual Partner Meeting 2022 [PDF]
Notes from the Jamboards used in the Breakout Session Rooms during the final hour of the event are summarized here, where available:
Features that would help enhance access included:
- Ways to see changes in a URL over time
- Dashboards to organize collections in different ways
- Auto-citation within public seeds
- Ability to see who else is collecting the URL and share notes
At some institutions, users can currently access collections through finding aids.
Approaches to QA varied: some attendees worked on teams, while others worked alone. In general, attendees raised how time-consuming QA can be, especially on special projects for stakeholders. Knowing when enough is enough on frequent crawls can be challenging.
Most institutions used spreadsheets to help coordinate their QA workflows. Additional tools mentioned included GitLab's Project Management feature, FileMaker Pro, Conifer, and Access databases.
Advice to New Users conducting QA included:
- Be patient, don't get frustrated
- It gets easier as you spot common issues
- Use the crawl report to scope seeds (especially Regular Expressions for crawler traps found in Hosts Reports)
- Use data limits on crawls to control data budgets
- Use the Help Center and Video Tutorials
- Submit Support Tickets and use Live Chat Support
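The advice above about regular expressions for crawler traps can be illustrated with a small local check. This is only a sketch for trying out a pattern before pasting it into a scoping rule: the pattern, the function name `split_by_trap`, and the sample URLs are all hypothetical, and the crawler's regular expression dialect may differ from Python's.

```python
import re

# Hypothetical pattern: calendar pages whose URLs carry changing date
# parameters, a common source of near-endless URL generation.
CALENDAR_TRAP = re.compile(
    r"^https?://[^/]+/calendar/.*[?&](month|year|date)=\d+",
    re.IGNORECASE,
)

def split_by_trap(urls):
    """Return (trapped, kept): URLs that match the pattern and those that do not."""
    trapped = [u for u in urls if CALENDAR_TRAP.search(u)]
    kept = [u for u in urls if not CALENDAR_TRAP.search(u)]
    return trapped, kept

# Made-up URLs of the kind that might appear in a host's URL list.
sample_urls = [
    "https://example.org/calendar/view?month=11&year=2022",
    "https://example.org/calendar/view?year=2035&month=3",
    "https://example.org/about",
    "https://example.org/news/2022/meeting-recap",
]

trapped, kept = split_by_trap(sample_urls)
print("Would be blocked:", trapped)
print("Would be kept:", kept)
```

Checking a pattern against a handful of known-good URLs first helps avoid scoping out content that should be captured.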
Attendees wanted to collect content on social media platforms because that content wasn't present anywhere else on the web, especially content from younger people or smaller organizations that are only on Facebook or Instagram.
They were also interested in how other institutions were approaching copyright issues on these platforms and in finding workarounds for the blocks currently in place on some of them (Gramhir was mentioned for Instagram).
Attendees mentioned currently using LIBNOVA and Preservica (as a pilot, though with concerns about sustainability). Blockers to their current digital preservation work included immediate management and budget (with respect to the business models of other vendors).