Archived content is indexed for full-text search 7 days after a production crawl completes or a test crawl is saved. Indexing is handled by Elasticsearch, which indexes every word on every archived page.
Elasticsearch calculates a relevance score for each archived document based on three factors: how often the search term appears in the document, how often the search term appears throughout the entire index, and the length of the field that contains the search term. When the advanced search tool is used to create multiple query clauses, the scores of the individual clauses are combined to calculate an overall score for the document.
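The three factors can be illustrated with a small Python sketch. It is a simplified, TF/IDF-style approximation for intuition only, not the scoring code Elasticsearch actually runs; the exact formula depends on the Elasticsearch version and configuration. The function names, the sample documents, and the choice to sum clause scores are all assumptions made for this example.

```python
import math

def term_score(term, doc_tokens, doc_freq, num_docs):
    """Score one term against one document (simplified TF/IDF-style sketch).

    doc_freq is the number of documents in the index containing the term;
    num_docs is the total number of documents in the index.
    """
    freq = doc_tokens.count(term)
    if freq == 0:
        return 0.0
    tf = math.sqrt(freq)                             # more occurrences in the document -> higher
    idf = 1.0 + math.log(num_docs / (doc_freq + 1))  # rarer across the whole index -> higher
    norm = 1.0 / math.sqrt(len(doc_tokens))          # shorter field -> higher
    return tf * idf * norm

def document_score(terms, doc_tokens, doc_freqs, num_docs):
    """Combine per-clause scores. Summing is an assumption of this sketch."""
    return sum(
        term_score(t, doc_tokens, doc_freqs.get(t, 0), num_docs) for t in terms
    )

# Hypothetical sample data: the same term in a short and a long field.
short_doc = "web archive search".split()
long_doc = "web archive search is one of many features of the service".split()
print(term_score("archive", short_doc, doc_freq=10, num_docs=1000))
print(term_score("archive", long_doc, doc_freq=10, num_docs=1000))
```

In this sketch the short document scores higher than the long one for the same term, and a term that is rare in the index scores higher than a common one.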
Result listings display the best-matched page per host, with the ability to drill down to view more results from each host. A count of the total number of results matching a query is shown at the top of the search results. However, to improve performance, the default number of viewable results is limited to 100 hosts, displayed across five pages.
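The grouping described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the service's implementation: the function names and the (url, score) input format are hypothetical, and the page size of 20 is inferred from 100 hosts being spread across five pages.

```python
from urllib.parse import urlparse

MAX_HOSTS = 100      # default cap on viewable hosts, as stated in the text
HOSTS_PER_PAGE = 20  # inferred: 100 hosts displayed across five pages

def best_hit_per_host(hits):
    """Keep the highest-scoring (url, score) hit for each host, best first."""
    best = {}
    for url, score in hits:
        host = urlparse(url).netloc
        if host not in best or score > best[host][1]:
            best[host] = (url, score)
    ranked = sorted(best.values(), key=lambda hit: hit[1], reverse=True)
    return ranked[:MAX_HOSTS]

def page_of(results, page):
    """Return one page of results; pages are numbered from 1."""
    start = (page - 1) * HOSTS_PER_PAGE
    return results[start:start + HOSTS_PER_PAGE]

# Hypothetical hits: two pages from one host, one page from another.
hits = [
    ("http://a.example/1", 1.0),
    ("http://a.example/2", 3.0),
    ("http://b.example/x", 2.0),
]
print(best_hit_per_host(hits))
```

Drilling down into a host would then amount to listing the remaining hits whose host matches the selected result.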
To learn more about Elasticsearch, visit: https://www.elastic.co/guide/en/elasticsearch/guide/current/index.html