Postgres combines vectors and JSONB

AI engineer at Meta on evaluating Turbopuffer vs. Pinecone vs. Weaviate

Postgres has JSONB integration natively, which means it can handle dense and sparse pretty well on its own.

The real advantage of Postgres here is that vectors can live next to the application’s normal data instead of in a separate retrieval system. A team can keep an embedding column for dense search, store tags and nested fields in JSONB, and then run one query that mixes vector similarity with filters like customer_id, document_type, or deeply nested metadata. That makes pgvector especially attractive for prototypes, internal tools, and products where the hard part is filtering business data, not serving massive search traffic.
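
As a rough illustration of the single-query pattern described above, the sketch below builds one parameterized SQL statement that filters on ordinary columns and JSONB metadata, then orders by pgvector cosine distance. The `documents` table, its column names, and the `build_hybrid_query` helper are hypothetical and chosen only for this example. `<=>` is pgvector's cosine distance operator and `@>` is the JSONB containment operator.

```python
import json
from typing import Any, Dict, List, Sequence, Tuple


def build_hybrid_query(
    query_embedding: Sequence[float],
    customer_id: int,
    document_type: str,
    metadata_filter: Dict[str, Any],
    limit: int = 10,
) -> Tuple[str, List[Any]]:
    """Return (sql, params) for a filtered nearest-neighbor search.

    Assumes a hypothetical table documents(id, title, customer_id,
    document_type, metadata jsonb, embedding vector).
    """
    # pgvector accepts a text literal such as '[0.1,0.2,0.3]' cast to vector.
    vector_literal = "[" + ",".join(str(float(x)) for x in query_embedding) + "]"

    sql = (
        "SELECT id, title "
        "FROM documents "
        "WHERE customer_id = %s "            # ordinary relational filter
        "AND document_type = %s "            # ordinary relational filter
        "AND metadata @> %s::jsonb "         # nested metadata must contain this JSON
        "ORDER BY embedding <=> %s::vector " # cosine distance to the query vector
        "LIMIT %s"
    )
    params: List[Any] = [
        customer_id,
        document_type,
        json.dumps(metadata_filter),
        vector_literal,
        limit,
    ]
    return sql, params


sql, params = build_hybrid_query(
    [0.1, 0.2, 0.3], 42, "invoice", {"region": {"country": "DE"}}, limit=5
)
```

With a driver that uses `%s` placeholders, such as psycopg, the pair could be passed to `cursor.execute(sql, params)`. Keeping the values as parameters rather than formatting them into the string avoids SQL injection.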

JSONB is useful because Postgres can store nested JSON documents directly, test whether one JSON object contains another, and index those checks. In practice that means a document can carry flexible metadata without forcing every attribute into a fixed table schema.
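
A minimal sketch of what storing, indexing, and querying such metadata might look like, assuming a hypothetical `documents` table and index name. The statements are kept as plain strings and run through any DB-API style cursor. `jsonb_path_ops` is a GIN operator class that supports the `@>` containment operator. The default `jsonb_ops` class would also work and additionally supports the key-existence operators.

```python
from typing import Any, List

# Hypothetical schema: fixed columns for stable attributes,
# one jsonb column for flexible, nested metadata.
SCHEMA_STATEMENTS: List[str] = [
    """
    CREATE TABLE documents (
        id       bigserial PRIMARY KEY,
        title    text,
        metadata jsonb NOT NULL DEFAULT '{}'::jsonb
    )
    """,
    # GIN index so containment checks (metadata @> ...) can use an index.
    "CREATE INDEX documents_metadata_idx "
    "ON documents USING gin (metadata jsonb_path_ops)",
]

# Matches rows whose metadata contains the given JSON object, for example
# {"tags": ["contract"]} matches metadata with a tags array that includes
# "contract", whatever other keys are present.
CONTAINMENT_QUERY = "SELECT id, title FROM documents WHERE metadata @> %s::jsonb"


def apply_schema(cursor: Any) -> None:
    """Run each schema statement on a DB-API style cursor."""
    for statement in SCHEMA_STATEMENTS:
        cursor.execute(statement)
```

New metadata keys can then be added to individual rows without an `ALTER TABLE`, while the attributes that every row shares stay in ordinary typed columns.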

Dense and sparse here means semantic vectors plus ordinary fields and keyword-like metadata. Postgres handles the dense part through pgvector and the sparse part through normal SQL, JSONB operators, and indexes. That is often enough when retrieval sits close to transactional data.

The tradeoff is scale and specialization. The interview frames Postgres as the baseline for quick prototyping, while also noting that it scales less naturally across many machines. That is where systems like Turbopuffer or Elasticsearch become more attractive: when query volume, shard count, or search-specific tuning starts to dominate.

Going forward, Postgres should keep winning early- and middle-stage AI retrieval workloads because it lets teams add semantic search without changing their whole data model. Purpose-built retrieval systems will pull ahead only when the workload becomes large enough that search infrastructure itself becomes the product’s main bottleneck.
