On this page
- What are Crawl Schedules
- Create a schedule
- Add or remove seeds from a schedule
- Run a crawl schedule and set up future crawls
- Edit a schedule
- Deactivate or delete a schedule
What are Crawl Schedules
Crawl Schedules control how often your seeds are crawled. Each schedule has its own frequency, limits, and crawling technology. Any seed attached to a schedule gets crawled at that schedule's frequency.
Create a schedule
- Go to a collection's Overview page, its Scheduled Crawls page, or the account-level Scheduled Crawls page.
- Click Create Schedule.
- In the setup modal, fill in:
  - Name - something unique to that schedule (this will appear in your Crawl Reports list)
  - Frequency - how often to crawl
  - Limits - document count, data size, and time
  - PDF only - toggle on if you only want PDFs
  - Crawling technology - select Brozzler or Standard
- Click Create. The schedule will appear in your Crawl Schedule list.
Add or remove seeds from a schedule
Before you begin, make sure the schedule you want to use has already been created.
To add seeds to a schedule, do one of the following:
- Select seeds from the Seeds list → click Edit Settings → choose a schedule from the Schedule dropdown, or
- Click a seed URL → go to its Settings page → choose a schedule from the Schedule dropdown.
To remove or change a seed's schedule:
- Use the Schedule dropdown on the seed's Settings page or the bulk Edit Seed Settings modal.
- Choose a different schedule to reassign, or choose No Schedule to remove it from all schedules.
Run a crawl schedule and set up future crawls
- Click Set Date in the Scheduled Crawls table (on the collection's Overview page or the Scheduled Crawls tab).
- Choose either:
  - Crawl Now to start a crawl immediately and set up recurring crawls
  - Select a Start Time to pick a future date/time for the first crawl; recurring crawls follow from there
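To make "recurring crawls follow from there" concrete, here is a minimal Python sketch of how the crawl dates line up once a first crawl time is set. It is only an illustration: the function name `upcoming_crawls` and the idea of expressing the frequency as a number of days are assumptions made for this example, not part of the product or any API it exposes.

```python
from datetime import datetime, timedelta


def upcoming_crawls(first_crawl, interval_days, count):
    """Return `count` crawl times: the first crawl, then one every `interval_days` days.

    Hypothetical helper for illustration only; the real schedule frequency
    is chosen in the Create Schedule modal.
    """
    step = timedelta(days=interval_days)
    return [first_crawl + i * step for i in range(count)]


# Example: first crawl on 6 January 2025 at 09:00, assumed weekly frequency.
start = datetime(2025, 1, 6, 9, 0)
for when in upcoming_crawls(start, interval_days=7, count=3):
    print(when.isoformat())
# 2025-01-06T09:00:00
# 2025-01-13T09:00:00
# 2025-01-20T09:00:00
```

With Crawl Now, the first entry would simply be the current time; with a chosen Start Time, it is the date and time you pick.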
Edit a schedule
Click the schedule's name to update its name, frequency, limits, or crawling technology.
Deactivate or delete a schedule
You can deactivate or delete schedules from the Schedules table.
Deactivate
Click the blue Active toggle. This cancels all future crawls for that schedule (seeds remain attached).
Delete
Only schedules with no seeds attached can be deleted. If the trash icon appears next to the Active toggle, click it to delete.
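The two rules above (deactivating keeps seeds attached; deleting requires that no seeds are attached) can be summarized in a small Python model. This is a sketch for reasoning about the behavior, not the product's implementation: the `Schedule` class and its `deactivate` and `can_delete` methods are hypothetical names invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Schedule:
    """Hypothetical model of a crawl schedule, for illustration only."""
    name: str
    active: bool = True
    seeds: set = field(default_factory=set)

    def deactivate(self):
        # Deactivating cancels future crawls; attached seeds are left untouched.
        self.active = False

    def can_delete(self):
        # Only a schedule with no seeds attached can be deleted.
        return not self.seeds


weekly = Schedule("Weekly news crawl")
weekly.seeds.add("https://example.org/")

weekly.deactivate()
print(weekly.active)        # False
print(weekly.can_delete())  # False: a seed is still attached

weekly.seeds.clear()        # reassign the seed or set it to No Schedule first
print(weekly.can_delete())  # True
```

In practice this means that before deleting a schedule, you move its seeds to another schedule or to No Schedule; deactivation alone does not make the trash icon appear.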