CH11 Reranker

The Reranker is a key component of modern two-stage retrieval systems. Such systems are designed to search large datasets both efficiently and accurately, and the Reranker's main job is to re-rank the documents returned by the first stage, the Retriever.

Summary

The Reranker operates in the second stage of the search system and aims to improve the accuracy of the initial search results. After the Retriever quickly extracts relevant candidate documents from a large document collection, the Reranker analyzes these candidates in more detail to determine the final ranking.

How it works

Receive the initial search results from the Retriever.
Pair the query with each candidate document.
Score the relevance of each query-document pair using a complex model (typically transformer-based).
Reorder the documents according to the scores.
Output the final reranked results.
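
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production pipeline: `retrieve`, `toy_pair_score`, and `rerank` are hypothetical names, and the Jaccard-based scorer is only a stand-in for the transformer model a real Reranker would use.

```python
from typing import Callable, List, Tuple


def retrieve(query: str, corpus: List[str], top_k: int) -> List[str]:
    """First stage (toy Retriever): rank by shared-word count, keep top_k."""
    q_terms = set(query.lower().split())
    scored = [(len(q_terms & set(doc.lower().split())), doc) for doc in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def toy_pair_score(query: str, document: str) -> float:
    """Hypothetical stand-in for a cross-encoder: Jaccard similarity
    computed from the query and the document together."""
    q_terms = set(query.lower().split())
    d_terms = set(document.lower().split())
    return len(q_terms & d_terms) / len(q_terms | d_terms)


def rerank(
    query: str,
    candidates: List[str],
    score_fn: Callable[[str, str], float],
) -> List[Tuple[str, float]]:
    """Second stage: pair, score, reorder, and return the candidates."""
    pairs = [(query, doc) for doc in candidates]      # pair query with each document
    scores = [score_fn(q, d) for q, d in pairs]       # score each pair
    ranked = sorted(zip(candidates, scores), key=lambda item: item[1], reverse=True)
    return ranked                                     # final reranked result


corpus = [
    "a reranker reorders candidate documents for a query",
    "the retriever finds candidate documents quickly and the index is large",
    "cats sleep most of the day",
]
query = "reranker candidate documents"

candidates = retrieve(query, corpus, top_k=2)         # initial search results
ranked = rerank(query, candidates, toy_pair_score)
for doc, score in ranked:
    print(f"{score:.3f}  {doc}")
```

In practice only `toy_pair_score` would change: it would be replaced by a call to a trained model, while the pair-score-sort structure stays the same.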

Technical features

Architecture

Mainly transformer-based models such as BERT and RoBERTa
Cross-encoder structure

Input format

Typically [CLS] Query [SEP] Document [SEP]
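
As a rough illustration of this input format, the sketch below assembles a BERT-style token sequence and its segment ids. The function name `build_cross_encoder_input` is hypothetical, and whitespace splitting is an assumption made for brevity; real models use subword tokenizers, and other model families (RoBERTa, for example) use different special tokens.

```python
from typing import List, Tuple


def build_cross_encoder_input(query: str, document: str) -> Tuple[List[str], List[int]]:
    """Build a BERT-style paired input (whitespace tokenization is an assumption)."""
    q_tokens = query.split()
    d_tokens = document.split()
    tokens = ["[CLS]"] + q_tokens + ["[SEP]"] + d_tokens + ["[SEP]"]
    # Segment 0 covers [CLS], the query, and the first [SEP];
    # segment 1 covers the document and the final [SEP].
    segment_ids = [0] * (len(q_tokens) + 2) + [1] * (len(d_tokens) + 1)
    return tokens, segment_ids


tokens, segment_ids = build_cross_encoder_input(
    "what is reranking", "reranking reorders results"
)
print(" ".join(tokens))
# [CLS] what is reranking [SEP] reranking reorders results [SEP]
print(segment_ids)
# [0, 0, 0, 0, 0, 1, 1, 1, 1]
```

Because the query and the document share one sequence, the model's attention layers can relate every query token to every document token, which is what distinguishes a cross-encoder from encoding the two texts separately.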

Training method

Pointwise: predicts the relevance score of each query-document pair independently
Pairwise: compares the relative relevance of two documents for the same query
Listwise: optimizes the entire ranked list at once
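
To make the three approaches concrete, here is one possible loss function for each, written in plain Python. The specific choices are assumptions for illustration: binary cross-entropy for pointwise, a RankNet-style logistic loss for pairwise, and a ListNet-style top-one loss for listwise. Other formulations exist, and the function names are hypothetical.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def softmax(xs: List[float]) -> List[float]:
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def pointwise_loss(score: float, label: int) -> float:
    """Binary cross-entropy on one query-document pair (label is 0 or 1)."""
    p = sigmoid(score)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def pairwise_loss(score_pos: float, score_neg: float) -> float:
    """RankNet-style loss: small when the relevant document outscores the other."""
    return math.log(1.0 + math.exp(-(score_pos - score_neg)))


def listwise_loss(scores: List[float], labels: List[float]) -> float:
    """ListNet-style top-one loss: cross-entropy between the softmax of the
    labels and the softmax of the predicted scores over the whole list."""
    target = softmax(labels)
    predicted = softmax(scores)
    return -sum(t * math.log(p) for t, p in zip(target, predicted))


# Each loss is lower when the model agrees with the relevance labels.
print(pointwise_loss(3.0, 1), pointwise_loss(-3.0, 1))
print(pairwise_loss(2.0, 0.0), pairwise_loss(0.0, 2.0))
print(listwise_loss([3.0, 1.0, 0.0], [2.0, 1.0, 0.0]),
      listwise_loss([0.0, 1.0, 3.0], [2.0, 1.0, 0.0]))
```

The difference lies in what one training example contains: a single pair, two documents for the same query, or the full candidate list.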

Difference from Retriever

| Characteristic | Retriever | Reranker |
| --- | --- | --- |
| Purpose | Quick search for related documents | Accurate ranking |
| Processing method | Simple similarity calculation | Complex semantic analysis |
| Model structure | Single encoder | Cross-encoder |
| Computational complexity | Low | High |
| Priority | Speed | Accuracy |
| Input form | Query and document processed separately | Query-document pair processed together |
| Output | Large set of candidate documents | Exact rank and score |
| Scalability | High | Limited |
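
The difference in input form can be illustrated with a small sketch. Everything here is a toy assumption: `embed` is a bag-of-words counter rather than a neural encoder, and `reranker_score` is a hypothetical joint scorer that weights query terms by their position in the document. The point is only the structure: the Retriever can encode documents ahead of time, while the Reranker needs the query and the document together.

```python
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    """Toy encoder: a bag-of-words count vector (not a real model)."""
    return Counter(text.lower().split())


def dot(a: Counter, b: Counter) -> int:
    # Counter returns 0 for missing terms, so unmatched words contribute nothing.
    return sum(count * b[term] for term, count in a.items())


documents = [
    "reranker improves ranking accuracy",
    "retriever returns candidates quickly",
]

# Retriever style: documents are encoded once, independently of any query.
doc_vectors = [embed(doc) for doc in documents]


def retriever_scores(query: str) -> List[int]:
    q_vec = embed(query)  # the query is encoded on its own
    return [dot(q_vec, d_vec) for d_vec in doc_vectors]


# Reranker style: each score depends on the query and the document jointly,
# so nothing can be precomputed before the query arrives.
def reranker_score(query: str, document: str) -> float:
    q_terms = set(query.lower().split())
    d_tokens = document.lower().split()
    return sum(1.0 / (pos + 1) for pos, tok in enumerate(d_tokens) if tok in q_terms)


print(retriever_scores("reranker accuracy"))              # [2, 0]
print(reranker_score("reranker accuracy", documents[0]))  # 1.25
```

This is why the Retriever scales to large collections (one query encoding plus cheap vector comparisons) while the Reranker is limited to a short candidate list.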

Pros and cons

Advantages

Significantly improves search accuracy
Can model complex semantic relationships
Compensates for the limitations of first-stage retrieval

Disadvantages

Higher computational cost
Longer processing time
Difficult to apply directly to large datasets
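
A back-of-the-envelope calculation shows why the Reranker is applied only to a short candidate list. The numbers below (corpus size, batch size, time per batch) are hypothetical and chosen purely for illustration; actual figures depend on the model and hardware.

```python
def rerank_forward_passes(num_queries: int, candidates_per_query: int) -> int:
    """A cross-encoder needs one forward pass per query-document pair."""
    return num_queries * candidates_per_query


def estimated_latency_ms(candidates: int, batch_size: int, ms_per_batch: float) -> float:
    """Rough per-query latency: number of batches times an assumed cost per batch."""
    num_batches = -(-candidates // batch_size)  # ceiling division
    return num_batches * ms_per_batch


corpus_size = 1_000_000  # hypothetical collection size
top_k = 100              # hypothetical number of candidates from the Retriever

# Reranking the whole corpus for a single query versus only the top candidates.
print(rerank_forward_passes(1, corpus_size))  # 1000000
print(rerank_forward_passes(1, top_k))        # 100

# With an assumed 20 ms per batch of 32 pairs:
print(estimated_latency_ms(top_k, batch_size=32, ms_per_batch=20.0))        # 80.0
print(estimated_latency_ms(corpus_size, batch_size=32, ms_per_batch=20.0))  # 625000.0
```

Under these assumptions, reranking 100 candidates costs well under a second per query, while scoring the full collection would take minutes, which is the trade-off the two-stage design is meant to resolve.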

