How two-stage retrieval works
First-stage retrieval pulls a candidate set, larger than what the answer model will eventually see, from vector search, keyword search, or a hybrid of the two. The reranker then scores each candidate against the query and keeps only the top few chunks for the answer model.
- Retrieve more candidates than you plan to show the model.
- Rerank down to the smallest evidence set that preserves answer quality.
- Log pre-rerank and post-rerank document IDs for debugging.
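
The flow above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the corpus, the token-overlap first stage, and the `toy_rerank_score` function are all hypothetical stand-ins. In practice the first stage would be a real vector, keyword, or hybrid index, and the scorer would be a real reranking model.

```python
from __future__ import annotations

import logging
from typing import Callable

logger = logging.getLogger("two_stage_retrieval")

# Hypothetical toy corpus; the IDs and texts are made up for illustration.
CORPUS = {
    "d1": "rerankers score each candidate chunk against the query",
    "d2": "vector search retrieves candidates by embedding similarity",
    "d3": "keyword search matches exact query terms",
    "d4": "bananas are rich in potassium",
}


def tokens(text: str) -> set[str]:
    """Lowercase whitespace tokenization, enough for this sketch."""
    return set(text.lower().split())


def first_stage(query: str, k: int) -> list[str]:
    """Cheap, recall-oriented stage: rank documents by shared-token count.

    Stand-in for vector, keyword, or hybrid search. Ties break on document ID
    so the output is deterministic.
    """
    q = tokens(query)
    ranked = sorted(CORPUS, key=lambda d: (-len(q & tokens(CORPUS[d])), d))
    return ranked[:k]


def toy_rerank_score(query: str, text: str) -> float:
    """Assumed stand-in for a reranker: Jaccard overlap of query and text."""
    q, t = tokens(query), tokens(text)
    union = q | t
    return len(q & t) / len(union) if union else 0.0


def retrieve(
    query: str,
    candidates_k: int = 3,
    final_k: int = 1,
    score: Callable[[str, str], float] = toy_rerank_score,
) -> tuple[list[str], list[str]]:
    """Return (pre-rerank IDs, post-rerank IDs) and log both lists."""
    if final_k > candidates_k:
        raise ValueError("final_k should not exceed candidates_k")

    # Stage 1: retrieve more candidates than the answer model will see.
    candidates = first_stage(query, candidates_k)
    logger.info("pre-rerank ids=%s", candidates)

    # Stage 2: score each candidate against the query and keep the top few.
    reranked = sorted(candidates, key=lambda d: score(query, CORPUS[d]), reverse=True)
    selected = reranked[:final_k]
    logger.info("post-rerank ids=%s", selected)

    return candidates, selected


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    retrieve("keyword search query terms", candidates_k=3, final_k=2)
```

Returning both ID lists, in addition to logging them, makes it easy to check whether a bad answer came from the first stage missing a document or from the reranker demoting it. In this toy setup, the query `"keyword search query terms"` with `candidates_k=3` and `final_k=2` retrieves `d3`, `d1`, `d2` and the reranker keeps `d3` and `d2`, so the two stages visibly disagree on ordering.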