Search infrastructure as standalone category
Ex-employee at Exa on building search infrastructure for AI data pipelines
This is becoming a real software layer because AI systems need raw, high-volume, fresh web retrieval as an input, not just a final answer. In practice that creates distinct products for different jobs: Exa is used when teams need tens of thousands of results and full page text for automated data pipelines, while Parallel is stronger when the job is to produce a synthesized research memo. Buyers are already spending anywhere from tens of thousands of dollars to about $300,000 per month on these workflows.
-
The demand pattern is different from human search. One Exa customer runs 5,000 prompts every day, pulls 50,000 to 100,000 results, and uses the output to update datasets automatically. That is not a search-box business; it is machine-to-machine infrastructure, with spend driven by query volume, recall, and extraction quality.
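A minimal sketch of what that machine-to-machine shape looks like in practice: a batch of prompts fans out to a search API, and the full-page results are merged into a dataset keyed by URL. The `search` function here is a hypothetical stub standing in for whatever vendor SDK a team actually uses; the point is the pipeline structure, not a specific API.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for a search-API client. A real pipeline would call
# a vendor SDK here; this stub just returns fake (url, page_text) pairs.
def search(query: str, num_results: int = 20) -> list[tuple[str, str]]:
    return [(f"https://example.com/{query}/{i}", f"page text {i}")
            for i in range(num_results)]

def refresh_dataset(prompts: list[str], per_query: int = 20) -> dict[str, str]:
    """Fan out one retrieval call per prompt and merge full-page text into
    a URL-keyed dataset, deduplicating results that recur across queries."""
    dataset: dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=8) as pool:
        for results in pool.map(lambda q: search(q, per_query), prompts):
            for url, text in results:
                dataset.setdefault(url, text)  # first fetch wins; skip re-crawl
    return dataset

# At 5,000 prompts a day and 10-20 results per prompt, a loop like this
# yields the 50,000-100,000 documents per day described above.
daily = refresh_dataset([f"topic {i}" for i in range(100)], per_query=10)
```

Spend in this model scales with the number of queries and the depth of each result list, which is why query volume, recall, and extraction quality drive the bill rather than per-seat pricing.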
-
The category is already splitting by workflow. Exa wins where customers want maximum result count, semantic retrieval, and full text they can feed into their own models. Parallel wins where the product itself does the planning, browsing, and summary writing for a long research task that can take 10 to 15 minutes.
-
The market can still become large even if the moat is thin. Ecosia routes about 500,000 queries per day to Exa and spends around $300,000 per month, yet keeps its architecture portable because result quality is converging across vendors. That points to a big category with real budgets, but one where service, pricing, and specialization matter as much as core search quality.
Over the next few years, this layer should expand in two directions at once: horizontal web search APIs will grow with agent traffic, while higher-value products add domain-specific corpora, better extraction, and deeper orchestration. The likely outcome is a standalone category with a few scaled infrastructure providers, plus specialized tools built for finance, legal, medical, and enterprise knowledge work.