Neural Search Talks  — ColBERT + ColBERTv2: late interaction at a reasonable inference cost
In this episode of Neural Search Talks, Andrew Yates (Assistant Professor at the University of Amsterdam) and Sergi Castella (Analyst at Zeta Alpha) discuss the two influential papers that introduced ColBERT (2020) and ColBERTv2 (2022). Their main proposal is a fast late interaction operation with no learned parameters, which achieves performance close to that of full cross-encoders at a much lower computational cost at inference. The papers also introduce many other optimizations.
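For readers who want a concrete picture of the late interaction operation before listening, here is a minimal sketch of the MaxSim scoring used by ColBERT: each query token embedding is matched to its most similar document token embedding, and those maxima are summed. The function names and the toy 2-dimensional embeddings are made up for illustration; real ColBERT embeddings come from BERT and are much larger, and this is not the authors' implementation.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so dot products are cosine similarities.

    Assumes the vector is non-zero (true for the toy data below).
    """
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def maxsim_score(query_embs, doc_embs):
    """Sum, over query tokens, of the best cosine match among document tokens.

    No learned parameters are involved: only dot products, a max, and a sum.
    """
    q = [normalize(v) for v in query_embs]
    d = [normalize(v) for v in doc_embs]
    return sum(
        max(sum(a * b for a, b in zip(qv, dv)) for dv in d)
        for qv in q
    )


# Hypothetical token embeddings: one vector per token.
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]  # has a close match for each query token
doc_b = [[1.0, 1.0]]                          # only a partial match for each query token

score_a = maxsim_score(query, doc_a)
score_b = maxsim_score(query, doc_b)
print(score_a > score_b)
```

Because document token embeddings do not depend on the query, they can be computed and indexed offline, which is where the inference-time savings over a full cross-encoder come from.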
📄 ColBERT: "ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT" by Omar Khattab and Matei Zaharia.
📄 ColBERTv2: "ColBERTv2: Effective and Efficient Retrieval via Lightweight Late Interaction" by Keshav Santhanam, Omar Khattab, Jon Saad-Falcon, Christopher Potts, and Matei Zaharia.
📄 PLAID: "PLAID: An Efficient Engine for Late Interaction Retrieval" by Keshav Santhanam, Omar Khattab, Christopher Potts, and Matei Zaharia.
📄 CEDR: "CEDR: Contextualized Embeddings for Document Ranking" by Sean MacAvaney, Andrew Yates, Arman Cohan, and Nazli Goharian.
00:38 Why ColBERT?
03:30 Retrieval paradigms recap
08:00 ColBERT query formulation and architecture
09:00 Using ColBERT as a reranker or as an end-to-end retriever
11:24 Space Footprint vs. MRR on MS MARCO
12:20 Methodology: datasets and negative sampling
14:33 Terminology for cross-encoders, interaction-based models, etc.
16:08 Results (ColBERT v1) on MS MARCO
18:37 Ablations on model components
20:30 Max pooling vs. mean pooling
22:50 Why did ColBERT have a big impact?
26:27 ColBERTv2: knowledge distillation
29:30 ColBERTv2: indexing improvements
33:55 Effects of clustering compression on performance
35:15 Results (ColBERTv2): MS MARCO
38:50 Results (ColBERTv2): BEIR
41:23 Takeaway: especially strong in out-of-domain evaluation
43:58 Qualitatively, what do ColBERT scores look like?
46:17 What's the most promising of all current neural IR paradigms?
49:30 How come there's still so much interest in dense retrieval?
51:05 Many-to-many similarity at different granularities
53:40 What would ColBERT v3 include?
56:35 PLAID: An Efficient Engine for Late Interaction Retrieval