Revenue
Sacra estimates turbopuffer hit $100M in annualized revenue in March 2026, up from $75M at the end of 2025 and up approximately 2,400% year-over-year.
turbopuffer's revenue is usage-based, scaling with the volume of documents stored, writes processed, and queries served across customer namespaces. Because pricing aligns with AI workload growth, every new user codebase, workspace, or document corpus added by a customer translates into more storage and query volume, and revenue expands as customers index more of their data.
The customer base skews toward AI-native software companies. Cursor, which uses turbopuffer to index over a trillion chunks of code across more than 80 million namespaces, is the largest known customer and the original proof-of-concept for the architecture. Notion, which migrated its 10 billion-plus vector workload in October 2024 and subsequently removed per-user AI charges from its product as a direct result of the cost reduction, represents the enterprise tier.
Valuation & Funding
turbopuffer has raised less than $1M in total primary capital as of May 2026, per the CEO's statement.
In December 2025, turbopuffer closed a seed VC round with Thrive Capital joining as a new investor alongside Lachy Groom, who doubled down.
Product
turbopuffer is a serverless vector and full-text search database built on object storage. It is designed to make documents, codebases, workspaces, and records searchable for AI applications at a fraction of the cost of traditional in-memory search infrastructure.
The core architectural idea is to avoid keeping all search indexes loaded in RAM at all times, as first-generation vector databases like Pinecone did. turbopuffer stores data durably in object storage (S3, GCS, or Azure Blob) and pulls it into NVMe SSD cache or RAM only when queried. A namespace that has not been accessed in days costs almost nothing to store. When a user opens their Notion workspace or Cursor codebase, that namespace can inflate into cache within seconds and serve queries at single-digit to low-double-digit millisecond latency.
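The read path described above behaves like a read-through cache in front of object storage. The sketch below is a toy model rather than turbopuffer's implementation: the `ObjectStore` and `NamespaceCache` classes and the least-recently-used eviction policy are assumptions made purely for illustration.

```python
from collections import OrderedDict


class ObjectStore:
    """Hypothetical stand-in for durable object storage (S3, GCS, Azure Blob)."""

    def __init__(self):
        self._blobs = {}
        self.reads = 0  # counts slow-path fetches

    def put(self, namespace, data):
        self._blobs[namespace] = data

    def get(self, namespace):
        self.reads += 1
        return self._blobs[namespace]


class NamespaceCache:
    """Read-through LRU cache: a namespace is loaded only when it is queried."""

    def __init__(self, store, capacity):
        self.store = store
        self.capacity = capacity
        self._hot = OrderedDict()

    def load(self, namespace):
        if namespace in self._hot:
            self._hot.move_to_end(namespace)  # mark as most recently used
            return self._hot[namespace]
        data = self.store.get(namespace)  # cold path: fetch from object storage
        self._hot[namespace] = data
        if len(self._hot) > self.capacity:
            self._hot.popitem(last=False)  # evict the least recently used namespace
        return data

    def is_hot(self, namespace):
        return namespace in self._hot


store = ObjectStore()
for name in ("workspace-a", "workspace-b", "workspace-c"):
    store.put(name, [f"{name}-doc"])

cache = NamespaceCache(store, capacity=2)
cache.load("workspace-a")  # cold: fetched from the object store
cache.load("workspace-a")  # warm: served from cache, no new fetch
cache.load("workspace-b")  # cold
cache.load("workspace-c")  # cold, and pushes workspace-a out of the cache
```

In this model a dormant namespace occupies only object storage, and the cost of warming it is paid the first time it is queried.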
Developers integrate turbopuffer by writing documents into a namespace via API. Each document has an ID, optional vector embeddings, text fields, and metadata. The namespace is the unit of isolation: one per user codebase at Cursor, one per workspace at Notion, and one per org or table at Linear. There is no practical limit on namespace count; turbopuffer has observed over 250 million namespaces in production.
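The data model described above can be pictured with plain Python structures. This is a minimal sketch and not the turbopuffer client API: the `Document` class, the `namespace_for` naming scheme, and the in-memory `namespaces` dictionary are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Document:
    id: str
    vector: Optional[List[float]] = None                      # optional embedding
    text: Dict[str, str] = field(default_factory=dict)        # text fields
    metadata: Dict[str, object] = field(default_factory=dict)  # filterable attributes


def namespace_for(tenant_kind: str, tenant_id: str) -> str:
    """Hypothetical naming scheme: one namespace per tenant."""
    return f"{tenant_kind}-{tenant_id}"


# In-memory stand-in for the service: namespace name -> {document id -> Document}
namespaces: Dict[str, Dict[str, Document]] = {}


def upsert(namespace: str, doc: Document) -> None:
    """Write a document into its namespace, replacing any document with the same ID."""
    namespaces.setdefault(namespace, {})[doc.id] = doc


upsert(namespace_for("codebase", "alice"),
       Document(id="1", vector=[0.1, 0.2], text={"body": "def main(): ..."}))
upsert(namespace_for("codebase", "bob"),
       Document(id="1", text={"body": "class Parser: ..."}, metadata={"lang": "python"}))
```

The same document ID can exist in both namespaces without collision, which is the isolation property the per-tenant layout relies on.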
Applications can then query data with dense vector ANN search for semantic similarity, BM25 full-text search for exact keyword matching, sparse vector search (added April 2026), attribute filtering with nested boolean logic, or combinations of these methods in a single request. Hybrid search, which runs both a vector search and a BM25 search and fuses the results, is the dominant production pattern for AI retrieval because pure semantic search misses exact tokens like class names, error codes, and customer IDs.
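To make the fusion step concrete, here is a minimal sketch using reciprocal rank fusion (RRF). The report does not say which fusion method customers use, so RRF, the constant `k=60`, and the document IDs below are assumptions for illustration.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document scores 1 / (k + rank) in every list it appears in; the
    scores are summed, so documents found by both searches rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by descending score, breaking ties by ID for deterministic output
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from the two searches, best match first
vector_hits = ["doc_semantic", "doc_both", "doc_related"]
bm25_hits = ["doc_exact_token", "doc_both"]

fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
```

Here `doc_both` ranks first because it appears in both lists, while `doc_exact_token`, which the vector search missed entirely, still survives into the fused results.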
The vector index is based on SPFresh rather than the HNSW graphs used by most first-generation vector databases. HNSW requires ten to twenty sequential round trips to traverse its graph structure, which makes it poorly suited to object storage latency. SPFresh needs only two to four round trips (download centroids, identify the closest clusters, fetch cluster data), which is why turbopuffer can serve competitive query latency from NVMe or even cold object storage instead of requiring everything to live in RAM.
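The round-trip argument can be illustrated with a toy centroid-based index. This is a simplified sketch of cluster-based search in general, not SPFresh itself or turbopuffer's code: the `ClusteredIndex` class, the sample vectors, and the per-trip latency passed to `sequential_latency_ms` are assumptions.

```python
import math


class ClusteredIndex:
    """Toy centroid-based index living in a pretend object store."""

    def __init__(self, centroids, clusters):
        self.centroids = centroids  # list of centroid vectors
        self.clusters = clusters    # cluster index -> list of (doc_id, vector)
        self.round_trips = 0

    def fetch_centroids(self):
        self.round_trips += 1       # one request for the centroid list
        return self.centroids

    def fetch_clusters(self, cluster_ids):
        self.round_trips += 1       # clusters fetched in parallel count as one trip
        return [self.clusters[c] for c in cluster_ids]

    def query(self, vector, top_k=2, nprobe=1):
        centroids = self.fetch_centroids()
        # Identify the nprobe clusters whose centroids are closest to the query
        nearest = sorted(range(len(centroids)),
                         key=lambda i: math.dist(vector, centroids[i]))[:nprobe]
        candidates = [item for cluster in self.fetch_clusters(nearest) for item in cluster]
        candidates.sort(key=lambda item: math.dist(vector, item[1]))
        return [doc_id for doc_id, _ in candidates[:top_k]]


def sequential_latency_ms(round_trips, per_trip_ms):
    """Latency of dependent requests that must run one after another."""
    return round_trips * per_trip_ms


index = ClusteredIndex(
    centroids=[(0.0, 0.0), (10.0, 10.0)],
    clusters={
        0: [("a", (0.0, 1.0)), ("b", (1.0, 0.0))],
        1: [("c", (9.0, 10.0)), ("d", (10.0, 11.0))],
    },
)
hits = index.query((9.5, 10.0))

# With a placeholder per-trip latency, compare the trip counts quoted above
graph_walk_ms = sequential_latency_ms(round_trips=10, per_trip_ms=50)
cluster_fetch_ms = sequential_latency_ms(round_trips=index.round_trips, per_trip_ms=50)
```

Because each graph hop depends on the previous one, a graph walk pays storage latency once per hop, whereas the cluster layout pays it a small, fixed number of times regardless of index size.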
Business Model
turbopuffer sells B2B infrastructure SaaS with usage-based pricing tied to the three core cost drivers of its architecture: storage, writes, and queries. Minimum monthly spend starts at $64 for the Launch tier; $256 for Scale, which adds HIPAA BAA, SSO, and audit logs; and $4,096-plus for Enterprise, which adds single-tenancy, BYOC, CMEK, and private networking. There is no free tier and no open-source version.
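One way to read the pricing structure is as metered usage with a per-tier floor. The sketch below assumes the minimum spend acts as a simple floor on the monthly invoice, which the report does not spell out; the function name and the usage amounts are hypothetical, and only the three minimums come from the text above.

```python
# Minimum monthly spend per tier, in USD, as quoted above
MINIMUM_MONTHLY_SPEND = {"launch": 64.0, "scale": 256.0, "enterprise": 4096.0}


def monthly_invoice(tier, storage_usd, write_usd, query_usd):
    """Bill the larger of metered usage and the tier minimum (assumed behavior)."""
    usage = storage_usd + write_usd + query_usd
    return max(usage, MINIMUM_MONTHLY_SPEND[tier])


# A small Launch customer whose usage falls below the floor
small = monthly_invoice("launch", storage_usd=10.0, write_usd=5.0, query_usd=5.0)

# A Scale customer whose usage exceeds the floor pays for usage
growing = monthly_invoice("scale", storage_usd=200.0, write_usd=100.0, query_usd=50.0)
```

Under this reading the minimums matter only for small accounts; once usage passes the floor, the invoice tracks storage, writes, and queries directly.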
The pricing model maps to how AI retrieval workloads behave. Most namespaces are cold most of the time: a user's codebase or workspace sits dormant until they open it, so turbopuffer charges near-zero for inactive data sitting in object storage and charges only when data is queried. That differs from in-memory vector databases, which charge for RAM residency whether or not data is being accessed. turbopuffer's effective storage cost is roughly $0.02 per GB per month (about $20 per TB) versus $3,600 per TB per month for legacy RAM-plus-SSD architectures, a difference of roughly 180x on those figures that underpins customer cost reduction.
Namespace pinning, introduced in April 2026, added a second pricing dimension. Customers with namespaces that receive sustained high query volume, above roughly ten queries per second, can pin those namespaces to reserved hot capacity and pay in GB-hours rather than per-query bytes. This allows turbopuffer to serve both bursty long-tail workloads (serverless and pay-per-query) and high-throughput steady-state workloads (pinned and reserved capacity) within the same product, without requiring customers to switch vendors as usage changes.
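The trade-off between the two pricing dimensions comes down to a break-even query rate. The sketch below shows the shape of that calculation only: every rate in `PLACEHOLDER` is an arbitrary value chosen for illustration, not turbopuffer's actual pricing, and the function names are hypothetical.

```python
def serverless_cost_per_hour(qps, gb_scanned_per_query, usd_per_gb_queried):
    """Pay-per-query: cost grows linearly with query volume."""
    return qps * 3600 * gb_scanned_per_query * usd_per_gb_queried


def pinned_cost_per_hour(namespace_gb, usd_per_gb_hour):
    """Pinned capacity: cost depends on namespace size, not on query volume."""
    return namespace_gb * usd_per_gb_hour


def break_even_qps(namespace_gb, usd_per_gb_hour, gb_scanned_per_query, usd_per_gb_queried):
    """Sustained query rate above which pinning is cheaper than pay-per-query."""
    cost_per_query = gb_scanned_per_query * usd_per_gb_queried
    return pinned_cost_per_hour(namespace_gb, usd_per_gb_hour) / (3600 * cost_per_query)


# Arbitrary placeholder inputs, NOT real prices
PLACEHOLDER = dict(
    namespace_gb=50.0,
    usd_per_gb_hour=0.01,
    gb_scanned_per_query=0.001,
    usd_per_gb_queried=0.01,
)

threshold = break_even_qps(**PLACEHOLDER)
```

Below the threshold the per-query meter is cheaper because idle hours cost nothing; above it, the flat GB-hour charge wins because additional queries no longer add cost.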
Go-to-market has been primarily developer inbound through technical blog posts, podcast appearances, and a customer reference network where Cursor and Notion generate downstream inbound from other AI companies evaluating the same architecture. The first dedicated GTM and growth hires came in 2025, meaning turbopuffer reached $75M in ARR with little formal sales capacity. Enterprise deals, including Anthropic's BYOC deployment and Bridgewater's financial data workloads, are handled through a Slack-heavy, engineer-to-customer support model that the minimum spend tiers are designed to sustain.
The land-and-expand dynamic is built into the pricing model rather than driven by sales. Because pricing scales with usage, revenue rises as customers index more data, add more namespaces, or increase query volume. Notion's removal of per-user AI charges after switching to turbopuffer is the clearest example: cheaper retrieval changed the product decision, causing Notion to index and query more data, which expanded its turbopuffer spend.
Competition
The retrieval infrastructure market has converged rapidly from distinct categories (vector databases, full-text search engines, and relational databases with vector extensions) into a single contested layer where major players now offer hybrid search, object storage economics, and multi-tenancy. turbopuffer's differentiation has shifted from architectural novelty to execution quality, cold-data economics, and production validation at scale.
Managed vector databases
Pinecone is the most direct strategic peer and a useful test of turbopuffer's architectural thesis. Pinecone was built in 2019 for uniform-access, always-hot ML workloads: one index per model, with all vectors in RAM, optimized for the enterprise recommendation systems Pinecone's founder had run at Yahoo and AWS. When the ChatGPT wave created demand for per-user, per-workspace, and per-codebase indexes where most data is cold most of the time, that architecture became less economic for the new workload shape.
Pinecone launched a serverless product in January 2024 that adopted object storage separation, an acknowledgment of turbopuffer's thesis, but cold start latency on large datasets remains materially higher than turbopuffer's. Pinecone has also moved upmarket toward a knowledge platform model with managed RAG, assistant products, and a marketplace, creating a clearer split: Pinecone as an application-layer knowledge platform, turbopuffer as pure retrieval infrastructure. Pinecone's ARR declined roughly 47% year-over-year through 2025 as the AI-native segment migrated; Notion, which was publicly praising Pinecone in January 2024, had migrated to turbopuffer by October 2024.
Weaviate and Qdrant compete in similar managed vector database territory, but with different tradeoffs. Weaviate pushes a more bundled platform, with built-in vectorizers, Query Agent, and agentic tooling that reduce the application code customers need to write. Qdrant's open-source model and same-engine-everywhere deployment story (self-hosted, private cloud, managed cloud, and edge) make it more attractive in accounts that prioritize deployment flexibility over serverless simplicity.
Search incumbents
Elasticsearch is the incumbent turbopuffer most directly displaces for hybrid workloads. Linear's migration from Elasticsearch plus pgvector to turbopuffer in 2025, driven by zero-ops simplicity and a 70% cost reduction, is the clearest example of this pattern. Elasticsearch's advantage is distribution: it already sits inside enterprise search budgets, relevance teams, and procurement approvals, and it remains good enough for many workloads even where turbopuffer would be technically better.
Amazon OpenSearch Serverless is the most credible platform threat because it shares several of turbopuffer's architectural traits (decoupled ingest and search, S3-backed storage, and serverless collections) while benefiting from AWS distribution, IAM integration, and Bedrock Knowledge Bases bundling. For customers already standardizing on AWS, OpenSearch Serverless can be adopted through existing accounts and procurement channels without adding a new vendor. turbopuffer's counter is BYOC deployment inside any cloud VPC, which lets AWS customers run turbopuffer inside their own infrastructure rather than choosing between the two.
Embedded and open-source alternatives
pgvector keeps vectors colocated with transactional data inside Postgres, inheriting ACID semantics, joins, and PITR. turbopuffer's own documentation recommends pgvector for workloads under ten million vectors. The switching cost from Postgres to turbopuffer is material because it changes data layout, ingestion, and consistency workflows, so turbopuffer tends to win only when scale or retrieval quality breaks the economics of keeping everything in Postgres.
LanceDB approaches the market from a similar object-storage-first direction but with an open-source, lakehouse-adjacent positioning that appeals to teams that want file-format portability over managed simplicity.
TAM Expansion
turbopuffer's expansion logic is the Jevons paradox applied to retrieval: when search infrastructure becomes 10-30x cheaper, companies do not just replace existing search, they index data they previously could not afford to make searchable at all. Its expansion vectors follow from that shift in economics.
New products
turbopuffer launched as a vector-only database in October 2023 and has since added BM25 full-text search, hybrid search, sparse vector search, multiple vectors per document, fuzzy filtering, namespace branching, and attribute-aware ranking. Those additions expanded the product from vector retrieval into a broader search stack and widened the set of customer workloads it can serve.
The next product frontier is agentic retrieval. Agentic AI sessions fire dozens to hundreds of parallel search queries rather than a single retrieval call, which fits turbopuffer's stateless, horizontally scalable architecture. turbopuffer has already begun reducing per-query pricing for agentic workloads, betting that lower query costs increase volume. The SID-1 collaboration, where researchers used turbopuffer at over 1,000 queries per second for reinforcement learning training, points to a role in AI research workloads for model training as well as inference.
The CEO has also hinted at longer-term directions including OLAP-style aggregate queries, time series, and traces and logging. Those would expand turbopuffer from first-stage retrieval into a broader analytics layer for unstructured data, using the same object-storage-native stateless compute architecture.
Customer base expansion
turbopuffer's current customer base is concentrated in AI-native software companies, but the named customer list already spans multiple verticals: Bridgewater (financial services), TELUS (enterprise telecom), Harvey (legal AI), Ramp (fintech), and Atlassian (enterprise SaaS). Across those categories, the core value proposition (index everything and pay only for what you query) applies to large internal data corpora that were previously too expensive to make fully searchable.
The enterprise internal data opportunity is particularly large. Vercel uses turbopuffer for GTM memory across Gong, Slack, and Salesforce data, a pattern that points toward the broader enterprise search and copilot market historically served by Elasticsearch and Microsoft. PostHog offers a useful comparison: it used a developer-first brand and self-serve onboarding as a wedge against larger analytics platforms, then expanded upmarket as the product matured. turbopuffer appears to be following a similar path, with the developer-first brand as the initial wedge and BYOC plus enterprise compliance features as the upmarket expansion mechanism.
Geographic and deployment expansion
turbopuffer currently offers public regions across AWS and GCP in North America, Europe, Asia-Pacific, and Latin America, with BYOC support on AWS, GCP, and Azure. The absence of public Azure regions is a gap given how many enterprises are standardized on Azure for identity, compliance, and procurement.
BYOC is the most important deployment expansion vector for regulated industries. Anthropic's BYOC deployment, running turbopuffer inside Anthropic's own VPC, shows that the model can work for security-sensitive AI infrastructure buyers. Expanding BYOC coverage and adding native Azure public regions would open financial services, healthcare, and government-adjacent workloads where data residency and network isolation are procurement requirements. The compliance infrastructure is already largely in place: SOC 2 Type 2, HIPAA BAA, CMEK, private networking, and audit logs are all available today.
Risks
Customer concentration: Cursor, Notion, and Anthropic almost certainly represent a disproportionate share of turbopuffer's $100M ARR, so a single large customer churning, renegotiating pricing, or building in-house retrieval infrastructure could create a material revenue shock that the current headcount and cost structure would struggle to absorb quickly.
Architectural commoditization: Pinecone, Amazon OpenSearch Serverless, and LanceDB have all adopted object-storage-native architectures that narrow turbopuffer's core technical differentiation, shifting competitive selection toward distribution, procurement relationships, and platform bundling, where hyperscalers and incumbents have structural advantages turbopuffer cannot easily replicate.
Cold-latency ceiling: turbopuffer's cost advantage is inseparable from its object-storage-first design, which imposes a latency floor on cold namespace queries and makes tail latency less predictable under agentic workloads with aggressive parallel fan-out. That structural constraint becomes more exposed as the market shifts toward real-time agent sessions, which cannot tolerate the hundreds-of-milliseconds cold-start times that are acceptable in archival or batch retrieval contexts.
DISCLAIMERS
This report is for information purposes only and is not to be used or considered as an offer or the solicitation of an offer to sell or to buy or subscribe for securities or other financial instruments. Nothing in this report constitutes investment, legal, accounting or tax advice or a representation that any investment or strategy is suitable or appropriate to your individual circumstances or otherwise constitutes a personal trade recommendation to you.
This research report has been prepared solely by Sacra and should not be considered a product of any person or entity that makes such report available, if any.
Information and opinions presented in the sections of the report were obtained or derived from sources Sacra believes are reliable, but Sacra makes no representation as to their accuracy or completeness. Past performance should not be taken as an indication or guarantee of future performance, and no representation or warranty, express or implied, is made regarding future performance. Information, opinions and estimates contained in this report reflect a determination at its original date of publication by Sacra and are subject to change without notice.
Sacra accepts no liability for loss arising from the use of the material presented in this report, except that this exclusion of liability does not apply to the extent that liability arises under specific statutes or regulations applicable to Sacra. Sacra may have issued, and may in the future issue, other reports that are inconsistent with, and reach different conclusions from, the information presented in this report. Those reports reflect different assumptions, views and analytical methods of the analysts who prepared them and Sacra is under no obligation to ensure that such other reports are brought to the attention of any recipient of this report.
All rights reserved. All material presented in this report, unless specifically indicated otherwise is under copyright to Sacra. Sacra reserves any and all intellectual property rights in the report. All trademarks, service marks and logos used in this report are trademarks or service marks or registered trademarks or service marks of Sacra. Any modification, copying, displaying, distributing, transmitting, publishing, licensing, creating derivative works from, or selling any report is strictly prohibited. None of the material, nor its content, nor any copy of it, may be altered in any way, transmitted to, copied or distributed to any other party, without the prior express written permission of Sacra. Any unauthorized duplication, redistribution or disclosure of this report will result in prosecution.