Move content between collections
I notice that moving a capture between collections is on the roadmap, though still 'Planned' for Q1 2019. Is there any update on this effort?
Thanks – Walker
-
Thanks for tagging that one, Walker! Our apologies for the mixed message, but it should read "2020," of course. We've fixed the typo. This week's scheduled downtime will in fact enable us to lay the foundation for this coming feature, so we're still looking forward to sharing more updates in Q1 2020.
-
Hi,
[Jefferson (department Director) responding, though posting this using our generic shared/unattributed user account].
Thanks for the inquiries. The original Archive-It 2020 development roadmap has, like most everyone's original 2020 plans I presume, been impacted by recent events, in our case both the obvious one and an internal one.
1. COVID mitigation (the obvious externality). The importance of documenting the pandemic via web archiving plus the ability of Archive-It users to continue (and in most cases drastically increase) use of the web application while working at home both led to an exponential increase across most Archive-It usage metrics in Q1-Q2. As an example, in early-pandemic months, there were 4 to 5 times as many crawls running per day in Archive-It overall as the average volume in Jan/Feb or 2019. Similar immediate, exponential growth could be seen across nearly all other stats such as data harvested, new collections/seeds, support tickets, et cetera. (Even now, Archive-It use is running at 2-3x pre-pandemic levels.) This is all a good thing! Go go go Archive-It users! And efforts such as our COVID Special Campaign (https://archive-it.org/blog/covid-19/) were meant to encourage and facilitate more institutions archiving more content and data. That said, while our infrastructure is somewhat elastic to account for bursts of heightened activity (and a reminder that we own/operate our own data centers), an overnight 4-5x growth in harvesting/processing is not without some downstream impact on infrastructure and staff. As such, we needed to purchase new hardware and add it to the cluster far ahead of schedule, and this -- plus keeping existing systems running under much heavier use/load -- consumed the time of engineering teams that, in normal times, would have focused instead on the new development and features originally planned for 2020. Thus, near-overnight scaling of infrastructure had to be prioritized over the original feature development roadmap.
2. Engineering hiring (the not-obvious internality). A few engineering vacancies also limited the availability of staff resources to focus on feature dev at the same time as work on operational growth and service resiliency. We are in the late or final stages of hiring three engineering roles. Some of these new hires will be working on the "move seeds between collections" feature, so we expect work to commence soon after onboarding and training.
tl;dr -- the original 2020 roadmap was disrupted by COVID-related growth and some team transitions that caused us to prioritize scaling infrastructure and services over originally-planned feature development. We expect work on "move seeds" to commence in Q3-Q4 with a hope for a Q1 2021 release. We will update the overall roadmap to account for the new reality. In the meantime, stay safe and keep on crawling!