The 2022 Archive-It Virtual Partner Meeting was held on November 2, 2022 [agenda]

Presentations

Archiving and Data Services Update, Jefferson Bailey and Thomas Padilla, Internet Archive

Archives Unleashed Cohort Program, Samantha Fritz, Archives Unleashed

Using Web Archives to Assess Municipal Responses to the COVID-19 Pandemic, Cal Murgu, Brock University

State Elections Web Archive, Miranda Siler, Ivy Plus Library Confederation, and Ryan Denniston, Duke University Libraries

[slides], [recording]

Web Archiving Workflows with GitLab, Robert Manley, Washington University in St. Louis

[slides], [recording]

Panel

Partners Valerie Collins (University of Minnesota), Dan Johnson (University of Iowa), and Dan Noonan (The Ohio State University) discuss web archiving workflows at their institutions with Web Archivist Karl-Rainer Blumenthal.
[recording]

Resources Shared during the Archive-It Virtual Partner Meeting 2022 [PDF]

Breakout Discussions

Notes from the Jamboards used in the breakout session rooms during the final hour of the event are summarized below, where available:

Archives Access

Features that would help enhance access included:

Ways to see changes in a URL over time
Dashboards to organize collections in different ways
Auto-citation within public seeds
Ability to see who else is collecting the URL and share notes

Currently, users can access collections with finding aids at some institutions.

Quality Assurance

Approaches to QA varied: some attendees worked in teams, while others worked alone. The time-consuming nature of QA came up repeatedly, especially for special projects for stakeholders. Knowing when enough is enough on frequent crawls can be challenging.

Most institutions used spreadsheets to help coordinate their QA workflows. Additional tools mentioned included GitLab's Project Management feature, FileMaker Pro, Conifer, and Access databases.

Advice for new users conducting QA included:

Be patient, don't get frustrated
It gets easier as you spot common issues
Use the crawl report to scope seeds (especially Regular Expressions for crawler traps found in Hosts Reports)
Use data limits on crawls to control data budgets
Use the Help Center and Video Tutorials
Submit Support Tickets and use Live Chat Support
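
The advice above about regular expressions for crawler traps can be made concrete with a small sketch. The patterns and URLs here are hypothetical illustrations, not drawn from the meeting notes or from any actual Hosts report; the idea is simply to test a candidate pattern locally against sample URLs before adding it as a scoping rule.

```python
import re

# Hypothetical patterns for two common kinds of crawler trap
# (assumptions for illustration only):
#   1. the same path segment repeated three or more times in a row
#   2. calendar pages that generate an endless series of dated URLs
TRAP_PATTERNS = [
    re.compile(r"^.*?(/[^/]+)\1{2,}(/.*)?$"),
    re.compile(r"^.*/calendar/\d{4}-\d{2}(-\d{2})?.*$"),
]


def is_probable_trap(url: str) -> bool:
    """Return True if the URL matches any of the candidate trap patterns."""
    return any(pattern.match(url) for pattern in TRAP_PATTERNS)


def filter_traps(urls):
    """Return only the URLs that look like crawler traps."""
    return [url for url in urls if is_probable_trap(url)]


if __name__ == "__main__":
    # Made-up sample URLs standing in for entries copied from a crawl report.
    samples = [
        "https://example.org/a/a/a/page",
        "https://example.org/calendar/2022-11",
        "https://example.org/news/2022/article",
    ]
    for url in filter_traps(samples):
        print(url)
```

Checking a pattern against a handful of known-good and known-bad URLs first helps avoid a rule that accidentally blocks content the collection should capture.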

Social Media

Attendees wanted to collect content on these platforms because that content wasn't present anywhere else on the web, especially content from younger people or smaller organizations only on Facebook or Instagram.

They were also interested in how other institutions were approaching copyright issues on these platforms and in finding workarounds for the blocks currently in place on some of them (Grahmir was mentioned for Instagram).

Vault

Attendees mentioned currently using Lib Nova and Preservica (as a pilot, but concerned about sustainability). Blockers to their current digital preservation work included immediate management and their budget (with respect to the business models of other vendors).
