CH11 Reranker
The Reranker is a key component of modern two-stage retrieval systems. Such systems are designed to search large datasets both efficiently and accurately, and the Reranker's main job is to re-rank the documents returned by the first stage, the Retriever.
Summary
The Reranker operates in the second stage of the search system and aims to improve the accuracy of the initial search results. After the Retriever quickly extracts relevant candidate documents from a large collection, the Reranker analyzes those candidates in more detail to determine the final ranking.
How it works
Receive the initial search results from the Retriever.
Pair the query with each candidate document.
Evaluate the relevance of each query-document pair using a complex model (mainly transformer-based).
Reorder the documents according to the evaluation results.
Output the final re-ranked results.
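The steps above can be sketched as a small pipeline. This is a minimal illustration, not a specific library's API: `tokenize`, `retrieve`, `score_pair`, and `rerank` are hypothetical names, and the word-overlap scores are toy stand-ins for a real Retriever and a real transformer-based relevance model.

```python
from typing import Callable, List, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase whitespace tokenizer (toy stand-in for a real tokenizer)."""
    return text.lower().split()


def retrieve(query: str, corpus: List[str], top_k: int) -> List[str]:
    """Stage 1: cheap candidate selection by counting shared words."""
    q = set(tokenize(query))
    ranked = sorted(corpus, key=lambda d: len(q & set(tokenize(d))), reverse=True)
    return ranked[:top_k]


def score_pair(query: str, document: str) -> float:
    """Stage 2 scorer (assumption: Jaccard overlap instead of a transformer).

    A real reranker would run a model over the query-document pair here.
    """
    q, d = set(tokenize(query)), set(tokenize(document))
    union = q | d
    return len(q & d) / len(union) if union else 0.0


def rerank(
    query: str,
    candidates: List[str],
    scorer: Callable[[str, str], float] = score_pair,
) -> List[Tuple[str, float]]:
    """Pair the query with each candidate, score each pair, sort by score."""
    scored = [(doc, scorer(query, doc)) for doc in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


corpus = [
    "reranker models reorder retrieved documents",
    "a long note that mentions documents and many other unrelated cooking topics",
    "weather forecast for tomorrow",
]
query = "reranker documents"

candidates = retrieve(query, corpus, top_k=2)  # step 1: initial results
results = rerank(query, candidates)            # steps 2-5: pair, score, reorder, output

for doc, score in results:
    print(f"{score:.3f}  {doc}")
```

Swapping `score_pair` for a call into an actual cross-encoder model would leave the rest of the pipeline unchanged, which is why the scorer is passed in as a parameter.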
Technical features
Architecture
Mainly uses transformer-based models such as BERT and RoBERTa
Adopts a cross-encoder structure, in which the query and the document are encoded together
Input format
Generally of the form [CLS] Query [SEP] Document [SEP]
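As a rough illustration of this input format, the sketch below assembles a BERT-style pair sequence by hand. The function name `build_pair_input` and the whitespace tokenization are assumptions for illustration only. In practice a model's own tokenizer normally inserts the special tokens and splits text into subwords itself.

```python
from typing import List, Tuple


def build_pair_input(query: str, document: str) -> Tuple[List[str], List[int]]:
    """Build [CLS] query [SEP] document [SEP] plus BERT-style segment ids."""
    q_tokens = query.split()      # toy whitespace tokenization (assumption)
    d_tokens = document.split()
    tokens = ["[CLS]"] + q_tokens + ["[SEP]"] + d_tokens + ["[SEP]"]
    # Segment 0 covers [CLS], the query, and the first [SEP];
    # segment 1 covers the document and the final [SEP].
    segment_ids = [0] * (len(q_tokens) + 2) + [1] * (len(d_tokens) + 1)
    return tokens, segment_ids


tokens, segment_ids = build_pair_input(
    "what is reranking", "reranking reorders candidates"
)
print(" ".join(tokens))
print(segment_ids)
```

Because both texts sit in one sequence, the model's attention layers can relate query words to document words directly, which is what distinguishes this input from encoding the two texts separately.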
Learning method
Pointwise: Predict the relevance score of an individual query-document pair
Pairwise: Compare the relative relevance of two documents
Listwise: Optimize the entire ranked list at once
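The three learning methods differ mainly in what the loss function looks at. The sketch below shows one common loss for each: binary cross-entropy for pointwise, a RankNet-style logistic loss for pairwise, and a ListNet-style softmax cross-entropy for listwise. These are illustrative choices rather than the only options, and the function names are hypothetical.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def softmax(xs: List[float]) -> List[float]:
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def pointwise_loss(score: float, label: int) -> float:
    """Binary cross-entropy on one query-document pair (label 1 = relevant)."""
    p = sigmoid(score)
    return -(label * math.log(p) + (1 - label) * math.log(1 - p))


def pairwise_loss(score_pos: float, score_neg: float) -> float:
    """Logistic loss on the score difference between two documents.

    Small when the relevant document scores higher than the irrelevant one.
    """
    return math.log(1.0 + math.exp(-(score_pos - score_neg)))


def listwise_loss(scores: List[float], labels: List[float]) -> float:
    """Cross-entropy between the label distribution and the score distribution
    over the whole candidate list."""
    target = softmax(labels)
    predicted = softmax(scores)
    return -sum(t * math.log(p) for t, p in zip(target, predicted))


# Same labels, good vs. bad model scores: the loss should be lower for the
# scores that agree with the labels.
labels = [2.0, 1.0, 0.0]
print(pointwise_loss(3.0, 1), pointwise_loss(3.0, 0))
print(pairwise_loss(2.0, 0.0), pairwise_loss(0.0, 2.0))
print(listwise_loss([3.0, 1.0, 0.0], labels), listwise_loss([0.0, 1.0, 3.0], labels))
```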
Difference from Retriever
Characteristic           | Retriever                               | Reranker
Purpose                  | Quickly find relevant documents         | Accurate ranking
Processing method        | Simple similarity calculation           | Complex semantic analysis
Model structure          | Single encoder                          | Cross encoder
Computational complexity | Low                                     | High
Priority                 | Speed                                   | Accuracy
Input form               | Query and document processed separately | Query-document pair processed together
Output                   | Large set of candidate documents        | Precise ranking and scores
Scalability              | High                                    | Limited
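The difference in input form and scalability can be illustrated by counting model calls. This is a sketch under stated assumptions: `embed` and `cross_score` are hypothetical toy functions (bag-of-words counts and word overlap), not real models, and the three-document corpus is made up. Only the call pattern matters: document vectors are computed once ahead of time and reused for every query, while pair scoring has to run at query time for each query-document pair.

```python
from collections import Counter
from typing import List

calls = {"embed": 0, "cross_score": 0}


def embed(text: str) -> Counter:
    """Retriever side: encode one text on its own (toy bag-of-words vector)."""
    calls["embed"] += 1
    return Counter(text.lower().split())


def dot(a: Counter, b: Counter) -> int:
    """Simple similarity between two independently computed vectors."""
    return sum(a[w] * b[w] for w in a)


def cross_score(query: str, document: str) -> float:
    """Reranker side: score a query-document pair jointly (toy word overlap)."""
    calls["cross_score"] += 1
    q, d = set(query.lower().split()), set(document.lower().split())
    return len(q & d) / max(len(q), 1)


docs = ["rerankers score pairs", "retrievers embed documents", "cooking with garlic"]
queries = ["how rerankers score", "embed documents fast"]

# Offline indexing: each document is embedded once, independent of any query.
doc_vectors = [embed(d) for d in docs]
offline_embed_calls = calls["embed"]

# Online: one embedding per query, then pair scoring only for the top_k candidates.
calls["embed"] = 0
top_k = 2
for q in queries:
    q_vec = embed(q)
    order = sorted(range(len(docs)), key=lambda i: dot(q_vec, doc_vectors[i]), reverse=True)
    candidates = [docs[i] for i in order[:top_k]]
    reranked = sorted(candidates, key=lambda d: cross_score(q, d), reverse=True)

online_embed_calls = calls["embed"]        # one per query
online_cross_calls = calls["cross_score"]  # top_k per query

print(offline_embed_calls, online_embed_calls, online_cross_calls)
```

Without the first stage, the pair scorer would need `len(queries) * len(docs)` calls at query time, which is why the Reranker is applied only to a small candidate set.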
Pros and cons
Advantages
Significantly improves search accuracy
Can model complex semantic relationships
Compensates for the limitations of first-stage retrieval
Disadvantages
Higher computational cost
Longer processing time
Difficult to apply directly to large datasets