Regular expressions (regex) are patterns used to match strings of characters. In Archive-It, you can use regular expressions to tell our crawler which links within a given host to block from your crawls or to specifically include in them. For example, you can block all calendar pages on a specified host from being crawled while still crawling the rest of the site.
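To illustrate the calendar example, here is a minimal sketch in Python of how a regular expression separates calendar URLs from the rest of a site. The host `example.org`, the pattern itself, and the `is_blocked` helper are assumptions made for this illustration only; they are not part of Archive-It, and the crawler's own regex syntax and matching rules may differ from Python's.

```python
import re

# Hypothetical pattern (an assumption for illustration): matches any URL on
# example.org whose path or query contains the word "calendar".
CALENDAR_PATTERN = re.compile(r"^https?://(www\.)?example\.org/.*calendar.*$")


def is_blocked(url: str) -> bool:
    """Return True if the URL matches the calendar pattern."""
    return CALENDAR_PATTERN.match(url) is not None


# Sample URLs, invented for this sketch.
urls = [
    "http://example.org/calendar/2024/05",
    "https://www.example.org/events/calendar?month=5",
    "http://example.org/about",
    "http://other.org/calendar/",
]

for url in urls:
    status = "blocked" if is_blocked(url) else "crawled"
    print(f"{url} -> {status}")
```

In this sketch, only the two calendar URLs on `example.org` match the pattern. The "about" page and the calendar on a different host do not match, so the rest of the site would still be crawled.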
Learn more about regular expressions: How to modify your crawl scope with a Regular Expression