Release Date: October 11, 2011
We have improved the Scope-It tool to include the ability to add or edit rules in bulk. The checkboxes on the left-hand side allow you to select one or more hosts to add rules to, which is especially useful if you would like to add the same rule to a number of hosts at once. After selecting the checkboxes of the hosts you would like to add a rule for, click the 'Edit Constraints' button to input the desired rule.
Ability to Activate or Deactivate "expand scope" rules
We have added the ability to activate or deactivate "expand scope" rules. Since expand scope rules apply at the collection level, activating and deactivating them will allow you to more easily control which crawls each rule applies to. Click the Activate or Deactivate link next to each Expand Scope rule. Deactivated rules will not be applied to any crawls run in that collection unless they are later re-activated.
Display Seed Metadata in the Wayback banner 'Metadata' link
Seed level metadata will now display when you click on the "Metadata" link in the Wayback banner for your seed URL. If you have both document metadata and seed metadata for the same URL, the seed metadata will appear. Otherwise, there is no change to the display of document level metadata.
Add an Image for each of your Collections
In addition to being able to upload a logo or image to represent your organization, you can now upload an image for each of your collections, which will display on the collection page on the new archive-it.org public site. In order to upload this image, click on the "Edit Collection Metadata" link on the Collection Management screen and then click on the "Images" tab. You can upload any image you like, but it will be resized to fit the space provided.
The image will be displayed on the collection listing page for the redesign of Archive-It.org launching on Oct 24. If you do not upload an image, then the image you have uploaded in the Admin --> Logo area for your account will be displayed. If no account level logo has been added, it will default to showing the Archive-It logo. If your organization or collection does not have a logo associated with it, we'd recommend taking a screenshot of one of your archived sites and using that as your image.
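The fallback order described above can be sketched in a few lines of Python. This is only an illustration of the rule, not the site's actual code: the function name, parameter names, and the placeholder file name for the Archive-It logo are all assumptions made for the example.

```python
from typing import Optional

# Hypothetical placeholder for the default Archive-It logo (assumption).
ARCHIVE_IT_LOGO = "archive-it-logo.png"


def listing_image(collection_image: Optional[str], account_logo: Optional[str]) -> str:
    """Pick the image shown on the collection listing page.

    Order of preference, as described in the release notes:
    1. the image uploaded for the collection,
    2. the account-level logo from the Admin --> Logo area,
    3. the default Archive-It logo.
    """
    if collection_image:
        return collection_image
    if account_logo:
        return account_logo
    return ARCHIVE_IT_LOGO


# Example: no collection image uploaded, so the account logo is used.
print(listing_image(None, "our-library-logo.png"))
```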
QA Report Update - Now with Patch Crawls
We have made improvements to our QA report which will allow you to easily run a 'patch' crawl that will capture any embedded content that was not initially captured from your seed URLs. When you view a QA report for one of your crawls you will see a button that says "Run Patch Crawl". This will allow you to capture each of the embedded URLs that were not captured for the seed URLs in your crawl. If you have the "Ignore Robots.txt" feature enabled for your account, you will also see a checkbox that will allow you to capture embedded content that was not captured due to robots.txt blocks.
The patch crawl will run just like a regular crawl, and can be monitored from the "Current Crawls" area of your account. Once the patch crawl has finished, a full set of reports will be generated for it. However, you will not be able to run a QA Report on your patch crawl. Instead, the QA Report area for your patch crawl will link to the original QA Report.
With this update, we've also graduated the QA Report out of BETA status.
Downloadable Report for the "Reports" page
When viewing the "Reports" area within your account, you will now be able to download a spreadsheet that will aggregate the information on the "Summary" reports for each of the crawls in your account, allowing you to easily sort document and data totals by collection, date of capture, etc. Just click on the 'Download Report Summary' link in the upper right corner of your Reports page.
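If you would rather total the figures with a script than in a spreadsheet program, a minimal Python sketch follows. It assumes the report has been saved as CSV, and the column names used here (collection, crawl_date, documents, data_bytes) and the sample rows are hypothetical; check the headers in your own download and adjust accordingly.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows with hypothetical column names (assumption).
SAMPLE_CSV = """collection,crawl_date,documents,data_bytes
Elections,2011-09-01,1200,52428800
Elections,2011-10-01,800,31457280
Campus News,2011-09-15,300,10485760
"""


def totals_by_collection(csv_text):
    """Sum document and data totals for each collection in the report."""
    totals = defaultdict(lambda: {"documents": 0, "data_bytes": 0})
    for row in csv.DictReader(io.StringIO(csv_text)):
        entry = totals[row["collection"]]
        entry["documents"] += int(row["documents"])
        entry["data_bytes"] += int(row["data_bytes"])
    return dict(totals)


# Print collections from largest to smallest document count.
summary = totals_by_collection(SAMPLE_CSV)
for name, entry in sorted(summary.items(), key=lambda item: -item[1]["documents"]):
    print(name, entry["documents"], entry["data_bytes"])
```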
Redesign of Archive-It.org - improved access and browsing functionality
We are releasing a Beta of our new archive-it.org public site, which has new and improved functionality and public access to your collections! The Beta site will be available for partner feedback from October 12-17. After incorporating feedback, the new public site will be live at www.archive-it.org starting Monday, October 24.
We will be sending out information on how to view and provide feedback for this site at the end of the day on October 12.
Improvements to our OAI-PMH feed
We are improving our OAI-PMH feed to make it easier for your collection level metadata to be harvested into WorldCat quarterly. Since we want to give you the choice of which specific collections' metadata is harvested, you will need to opt in if you would like your collection metadata included. You can opt in by going to your Collection Metadata page and checking the OAI Settings checkbox.
The new OAI-PMH feed will also make it easier for you to transform your collection level metadata from Dublin Core to other formats, such as MARC.
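As a rough illustration of what such a transformation could look like, the Python sketch below builds a standard OAI-PMH ListRecords request for unqualified Dublin Core (oai_dc) and maps a few Dublin Core elements to MARC fields. The endpoint address and the sample record are hypothetical, and the small mapping table is an assumption that loosely follows the Library of Congress Dublin Core to MARC crosswalk; verify the tags against your own cataloging practice before relying on them.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Hypothetical endpoint, NOT the real feed address (assumption).
BASE_URL = "https://example.org/oai"

# Namespace of unqualified Dublin Core elements.
DC_NS = "{http://purl.org/dc/elements/1.1/}"

# Partial, illustrative mapping from Dublin Core element to MARC field (assumption).
DC_TO_MARC = {
    "title": "245$a",
    "creator": "720$a",
    "subject": "653$a",
    "description": "520$a",
}


def list_records_url(base_url=BASE_URL):
    """Build an OAI-PMH ListRecords request for Dublin Core records."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": "oai_dc"})
    return base_url + "?" + query


def dc_to_marc_fields(record_xml):
    """Return (MARC field, value) pairs for the mapped Dublin Core elements."""
    root = ET.fromstring(record_xml)
    fields = []
    for element in root.iter():
        if element.tag.startswith(DC_NS) and element.text:
            name = element.tag[len(DC_NS):]
            if name in DC_TO_MARC:
                fields.append((DC_TO_MARC[name], element.text.strip()))
    return fields


# Made-up Dublin Core record for demonstration only.
SAMPLE_RECORD = """<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Example Collection</dc:title>
  <dc:creator>Example Library</dc:creator>
  <dc:subject>Web archiving</dc:subject>
</oai_dc:dc>"""

print(list_records_url())
for field, value in dc_to_marc_fields(SAMPLE_RECORD):
    print(field, value)
```

The sketch only produces field and value pairs. Writing real MARC records would need a dedicated library and a fuller mapping.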
Instructions for how to deploy an installation of the Wayback software locally at your institution
We have developed fairly straightforward, step-by-step instructions for installing Wayback software locally at your institution, so that those of you who receive a copy of your data can browse that content locally if you wish. Please note that some technical expertise is still required to install and run Wayback locally. To access this documentation, please see: https://webarchive.jira.com/wiki/display/wayback/Wayback+Installation+and+Configuration+Guide
Remove Wayback Access to Archived Content
We now have much more fine-tuned control when removing access to archived pages in Wayback. Previously this could only be done for a specific URL (or URLs). We can now easily block all captures of all pages of a site, or just block specific capture dates. If you have specific crawls or URLs that you would prefer not to have visible, you can inform the partner specialists (archive-itsupport at archive dot org) and we can remove access to this content for you.
Note: While the blocked URLs will not appear when browsing in Wayback, they may sometimes appear in the snippets of full text search results. However, when a user clicks to view a search result for a blocked URL, the archived page will not be available for viewing.
Other Updates You May Notice:
Help Documentation Reorganization
We've spent some time reorganizing our Help Documentation area, and we hope you find it more useful and easier to navigate. In particular, we've organized a large portion of the documentation into an Archive-It Guide section, which is also available as a large PDF file (ask the Partner Specialists if you are interested).