Retrieval vs Ranking in Vector Search

Source interview: AI engineer at Meta on evaluating Turbopuffer vs. Pinecone vs. Weaviate

Turbopuffer and a lot of its competitors are great for raw retrieval, but feeding a ranking layer is a much harder problem.

This distinction marks the line between a vector database that finds likely candidates and a search stack that decides what actually deserves to be shown. Raw retrieval mostly returns nearest neighbors by vector or keyword score, while ranking needs many extra signals, such as freshness, field matches, permissions, personalization, and downstream task fit. In the interviews, Turbopuffer is consistently described as strong at candidate generation and cost-efficient storage, while teams that need custom ranking switch to proprietary systems or to Vespa.
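To make the retrieval side concrete, here is a minimal sketch of candidate generation as brute-force nearest-neighbor search. It is not Turbopuffer's API or any vendor's implementation; the `retrieve` function, the toy `index`, and the two-dimensional vectors are all assumptions made for illustration. The point is what comes back: a document ID and one similarity score, nothing more.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query_vec, index, k=3):
    """Hypothetical retrieval stage: return the k nearest (doc_id, similarity) pairs."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Toy in-memory index (illustrative data, not from the interviews).
index = {
    "doc_a": [1.0, 0.0],
    "doc_b": [0.8, 0.6],
    "doc_c": [0.0, 1.0],
}

candidates = retrieve([1.0, 0.0], index, k=2)
for doc_id, score in candidates:
    print(doc_id, round(score, 3))
```

Freshness, permissions, and personalization do not appear anywhere in this output, which is why a separate ranking stage has to supply them.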

  • The missing piece is feature generation at query time. A ranking layer needs more than a single similarity score: it needs per-document signals that a reranker can combine. The Meta engineer says retrieval scores are easy to get, but richer ranking features and custom model-based feature emission are much harder without a proprietary workflow.
  • This is why Turbopuffer and Vespa end up serving different jobs. One large-scale team uses Turbopuffer for generic customer answer agents because it is cheaper and simpler to operate, but uses Vespa when the product needs custom ranking, hybrid retrieval, and heavy personalization. That is the practical boundary between retrieval infrastructure and full search infrastructure.
  • The transparency problem also changes once ranking is added. Teams can usually inspect retrieval scores, but they still rely on their own tracing stack to understand full pipeline behavior, because backend tools tend to expose only aggregate latency and limited ranking explanation. That makes managed retrieval easier to buy, but harder to tune for high-value ranking workloads.
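The points above can be sketched as a small reranking step that emits per-document features at query time, combines them, and keeps the per-feature contributions for tracing. This is an illustration under stated assumptions, not the method any interviewed team uses: the `Candidate` fields, the hand-set `WEIGHTS`, and the 30-day freshness scale are all hypothetical, and a production system would more likely learn the combination with a model.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    doc_id: str
    similarity: float   # score handed over by the retrieval stage
    age_days: float     # input for a freshness feature
    title_match: bool   # field-match signal
    allowed: bool       # result of a permission check for the current user

# Hypothetical hand-set weights; a real reranker would typically learn these.
WEIGHTS = {"similarity": 1.0, "freshness": 0.5, "title_match": 0.3}

def emit_features(c):
    """Turn one candidate into named ranking features."""
    return {
        "similarity": c.similarity,
        "freshness": 1.0 / (1.0 + c.age_days / 30.0),  # decays with age
        "title_match": 1.0 if c.title_match else 0.0,
    }

def rerank(candidates):
    """Drop disallowed documents, score the rest, and keep per-feature contributions."""
    ranked = []
    for c in candidates:
        if not c.allowed:
            continue  # permissions act as a hard filter, not a weighted score
        features = emit_features(c)
        contributions = {name: WEIGHTS[name] * value for name, value in features.items()}
        ranked.append((c.doc_id, sum(contributions.values()), contributions))
    ranked.sort(key=lambda row: row[1], reverse=True)
    return ranked

# Illustrative candidates, as if returned by a retrieval stage.
candidates = [
    Candidate("doc_a", 0.90, age_days=300, title_match=False, allowed=True),
    Candidate("doc_b", 0.80, age_days=0, title_match=True, allowed=True),
    Candidate("doc_c", 0.95, age_days=10, title_match=True, allowed=False),
]

ranked = rerank(candidates)
for doc_id, score, contributions in ranked:
    print(doc_id, round(score, 3), contributions)
```

In this toy data the document with the highest similarity is filtered out by permissions, and a fresher document with a title match overtakes one with a higher retrieval score. The stored `contributions` dictionary is the kind of per-decision explanation that teams currently rebuild in their own tracing stack.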

The category is moving toward a split market. Managed vector systems will keep winning broad RAG and agent use cases where cost and operational simplicity matter most. The next layer of value will shift toward systems that can emit richer features, support custom ranking logic, and make ranking decisions easier to inspect, because that is where product differentiation and monetization usually live.