Archive-It partners may quickly launch test or one-time crawls of new seeds directly from the home page of the web application by using the “InstaCrawl” feature. InstaCrawl enables partners to immediately add and crawl new seeds in new or existing collections without going through existing collection creation and management workflows. As soon as the crawl begins, these new seeds or collections will appear in the web application like any others for later management.
To use the InstaCrawl feature, begin by clicking its button on the Home screen.
In the pop-up dialog, select an existing collection, or name a new one, to receive your seeds. Then enter the seed URLs in the box as you would for any other collection.
Optional: At this point you may also set “Advanced” options: the seeds’ public or private accessibility, the frequency to which they will be assigned going forward, and their seed type. If you leave these unchanged, new seeds will be publicly visible, set to the One-Time frequency, and of the “standard” type. Seeds added to a frequency that already recurs in an existing collection will crawl automatically with the other seeds at that frequency. A recurring frequency that is new to the collection must be scheduled separately, in addition to this one-time crawl.
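For partners who keep their own notes or scripts alongside the web application, the defaults and the scheduling rule described above can be summarized in a short sketch. This is only an illustration: `SeedOptions`, `needs_new_schedule`, and the example frequency names "Weekly" and "Monthly" are hypothetical and are not part of any Archive-It API.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class SeedOptions:
    """Hypothetical record of the "Advanced" options; not an Archive-It API."""
    public: bool = True            # default: publicly visible
    frequency: str = "One-Time"    # default: One-Time frequency
    seed_type: str = "standard"    # default: "standard" seed type


def needs_new_schedule(options: SeedOptions, recurring_in_collection: set[str]) -> bool:
    """Return True when the chosen recurring frequency is not yet used in the collection.

    One-Time seeds never need a recurring schedule. Seeds assigned to a
    frequency that already recurs in the collection crawl with the others.
    """
    if options.frequency == "One-Time":
        return False
    return options.frequency not in recurring_in_collection


# Illustrative values only: a collection that already has weekly crawls.
existing = {"Weekly"}
print(needs_new_schedule(SeedOptions(), existing))
print(needs_new_schedule(SeedOptions(frequency="Weekly"), existing))
print(needs_new_schedule(SeedOptions(frequency="Monthly"), existing))
```

In this sketch only the last case, a recurring frequency that the collection does not already use, would call for scheduling a new recurring crawl.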
When you have added all of your seeds to the list, click the “Set Limits” button to advance to the crawl configuration. As with any manually launched crawl, select the test or one-time production type, apply any desired limits on time and on the data or documents to be archived, and launch by clicking the “Crawl” button.
Once initiated, your crawl will be added to the “Current Crawls” lists in your account and your new or existing collection. You may monitor and review this crawl’s reports as you would any other Archive-It crawl.