Overview
Partners subscribe to the Archive-It service at a level determined by their annual data budget. This budget determines how much total data you can archive from the web (for example, 256 GB, 1 TB, or 2 TB) in a single subscription year; your usage resets to zero upon renewal. Regularly monitoring your data usage during the subscription year is an important part of planning your crawls and saving only as much data as your service agreement with Archive-It allows.
On this page:
- How is data usage calculated
- How to review subscription data usage
- How to review scheduled crawls
- Budget alert banners
How is data usage calculated
The total data in your account is the sum of the New Data (as listed in the New Data column of your crawl report) captured in saved test crawls, One-Time crawls, scheduled crawls, and patch crawls, as well as any uploaded WARCs. Crawl data will apply to the budget of the subscription period in which a crawl finished. Data from unsaved, deleted, or expired Test crawls does not count against your data budget.
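As a rough illustration of the rule above, the following Python sketch totals New Data for one subscription period. It is not Archive-It code or an Archive-It API: the `Crawl` record, its field names, and the `data_used_gb` function are hypothetical, and the sketch assumes that the period end date is inclusive and that the list of uploaded WARC sizes has already been limited to the period in question.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Crawl:
    """Hypothetical record of one crawl, for illustration only."""
    kind: str            # "test", "one-time", "scheduled", or "patch"
    new_data_gb: float   # value from the New Data column of the crawl report
    finished: date       # date the crawl finished
    saved: bool = True   # False for unsaved, deleted, or expired test crawls


def data_used_gb(crawls, uploaded_warcs_gb, period_start, period_end):
    """Sum New Data for crawls that finished within the subscription period.

    Assumption: period_end is inclusive, and uploaded_warcs_gb contains only
    uploads that belong to this period.
    """
    total = 0.0
    for crawl in crawls:
        # Crawl data applies to the period in which the crawl finished.
        if not (period_start <= crawl.finished <= period_end):
            continue
        # Unsaved, deleted, or expired test crawls do not count.
        if crawl.kind == "test" and not crawl.saved:
            continue
        total += crawl.new_data_gb
    return total + sum(uploaded_warcs_gb)
```

The figures shown on your account's home page remain the authoritative totals; a calculation like this is only useful for cross-checking your own records.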
How to review subscription data usage
Your account's home page displays the most complete and up-to-date information on the total data archived by your account.
- The Current Subscription chart displays the total of used and remaining data in your crawl budget for your current subscription period.
- Expanding Current Subscription Details lists your data budget, new document and data amounts, and subscription start and end dates.
- Expanding Past Subscription Totals shows total data and total documents for each past subscription year as well as all-time totals.
Note: For partners with multi-year service agreements, the figures and graphic above still reflect the data budget for a single year of that agreement.
How to review scheduled crawls
For a complete picture of your data budget, it is important to identify any crawls that are scheduled to run on a regular basis, and review their crawling history to get a sense of how much data they are likely to use in your current subscription year.
- To review scheduled crawls across all of your collections, select Crawls in the black navigation bar and then select the Scheduled Crawls tab.
- In the Collection column, click each collection that has a scheduled crawl.
- In each collection, select Crawls.
- The Crawl Frequency sub-tab displays when each crawl is scheduled to run next.
- The Crawl Reports sub-tab displays how much New Data each crawl has accumulated over time. Select the Frequency gear icon to view the crawling history of each scheduled interval. If a crawl shows a recent, significant increase in its New Data amount, review the crawl report, adjust scoping, and run a test crawl before its next scheduled run.
- Check scheduled crawls periodically to ensure that they continue to be scoped appropriately.
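Once you have noted the New Data amounts from a scheduled crawl's history, a back-of-the-envelope projection can help with planning. The Python sketch below is one possible approach, not an Archive-It feature: the function names are hypothetical, the projection assumes that each remaining run adds roughly the average New Data of past runs, and the `factor` of 2.0 used to flag a "significant" increase is an arbitrary assumption that you should adjust to your own collections.

```python
from statistics import mean


def projected_usage_gb(used_gb, history_gb, runs_remaining):
    """Estimate year-end usage for one scheduled crawl.

    Assumption: each remaining run adds about the mean New Data of past runs.
    history_gb lists New Data per past run, oldest first.
    """
    if not history_gb:
        return used_gb
    return used_gb + mean(history_gb) * runs_remaining


def has_recent_spike(history_gb, factor=2.0):
    """Return True if the latest run's New Data exceeds `factor` times
    the mean of the earlier runs. The default factor is an assumption."""
    if len(history_gb) < 2:
        return False
    *earlier, latest = history_gb
    return latest > factor * mean(earlier)
```

For example, with 100 GB already used, past runs of 10, 20, and 30 GB, and four runs remaining, the projection comes to 180 GB. A crawl flagged by `has_recent_spike` is a candidate for the scoping review and test crawl described above.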
Budget alert banners
You are nearing your data limit
When your account uses 80% or more of its current subscription year's data budget, an alert banner appears on your account landing page. If you see an alert banner:
- Follow the steps above to ensure that you do not exceed your annual allowance.
- Communicate with us about the steps you have taken by submitting a support ticket. Include any questions you have about your data budget.
You are over your data limit
When your account exceeds its annual data budget, an over-budget alert banner appears on your account landing page. If you see this banner, submit a support ticket.
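The two banner conditions described in this section can be expressed as a simple check. This is a minimal Python sketch for your own monitoring, assuming usage and budget are measured in the same unit; the `budget_status` function and its return labels are hypothetical and do not reflect how Archive-It implements its banners.

```python
def budget_status(used_gb, budget_gb):
    """Classify usage against the annual data budget.

    Returns "over" when usage exceeds the budget, "nearing" when usage is
    80% or more of the budget, and "ok" otherwise.
    """
    if budget_gb <= 0:
        raise ValueError("budget_gb must be positive")
    if used_gb > budget_gb:
        return "over"
    if used_gb / budget_gb >= 0.8:
        return "nearing"
    return "ok"
```

Under this reading, usage of exactly 100% of the budget still counts as "nearing", because the over-limit banner is described as appearing only when the budget is exceeded.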
If you have any difficulty or questions about efficiently managing your data budget to support your collecting scope, contact Archive-It's Web Archivists for assistance.