Split LLM Gateway Control Planes
3T+ token/day Coinbase of the inference economy
This competition shows the LLM gateway layer is real infrastructure, but it is splitting into two very different products. OpenRouter sells a managed traffic layer, where developers swap one endpoint for another and get routing, failover, billing, and usage analytics in one place. LiteLLM and Helicone came from the self-hosted side, where teams want the same control plane inside their own stack, often to keep keys, logs, and prompts off a third-party system.
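The routing and failover behavior described above can be sketched in a few lines. This is a minimal illustration, not any vendor's implementation: the `Provider` type, `route_with_failover`, and the two stand-in provider functions are hypothetical names invented for the example, and real gateways add retries, health checks, and cost-aware routing on top.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# Hypothetical provider interface: takes a prompt, returns completion text.
ProviderFn = Callable[[str], str]


@dataclass
class Provider:
    name: str
    call: ProviderFn


class AllProvidersFailed(Exception):
    """Raised when every provider in the priority list has errored."""


def route_with_failover(prompt: str, providers: Sequence[Provider]) -> tuple[str, str]:
    """Try providers in priority order and return (provider_name, completion)."""
    errors = []
    for provider in providers:
        try:
            return provider.name, provider.call(prompt)
        except Exception as exc:  # any upstream error triggers failover
            errors.append(f"{provider.name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Stand-in providers for illustration only; no network calls are made.
def flaky_provider(prompt: str) -> str:
    raise TimeoutError("upstream timed out")


def healthy_provider(prompt: str) -> str:
    return f"echo: {prompt}"


providers = [
    Provider("primary", flaky_provider),
    Provider("backup", healthy_provider),
]
result = route_with_failover("hello", providers)
print(result)
```

The caller sees a single function and a single response shape regardless of which upstream answered, which is the property a managed gateway sells and a self-hosted proxy has to maintain itself.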
- OpenRouter competes on handling the full request path. It takes about a 5% cut of inference spend, normalizes provider responses, and gives teams one dashboard for spend, routing, and uptime across more than 400 models and 60 labs. That is a very different purchase from an open-source proxy that a team installs and operates itself.
- LiteLLM and Helicone show the open-source path, but also its limits. Open-source gateways are attractive because the core job is API proxying and load balancing, which teams can self-host. But the March 24, 2026 LiteLLM supply-chain compromise pushed security into the buying decision for any team routing sensitive prompts and keys.
- The field is broadening beyond pure gateway startups. Vercel made AI Gateway generally available on August 21, 2025, bundling model routing into a frontend developer platform. Merge extends the same universal API idea from SaaS integrations into AI middleware. Helicone joining Mintlify in March 2026 also shows how standalone observability and gateway tools can get absorbed into larger developer workflows.
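To make the roughly 5% take rate cited for OpenRouter concrete, here is a small worked calculation. It assumes a flat percentage of inference spend, which is a simplification of however the fee is actually applied; the `gateway_fee` helper and the $20,000 monthly spend are hypothetical and chosen only for illustration.

```python
from decimal import Decimal


def gateway_fee(inference_spend: Decimal, take_rate: Decimal = Decimal("0.05")) -> Decimal:
    """Fee a managed gateway keeps, assuming a flat percentage of spend."""
    return inference_spend * take_rate


# Hypothetical monthly inference spend in US dollars.
monthly_spend = Decimal("20000")
fee = gateway_fee(monthly_spend)
print(f"Gateway fee on ${monthly_spend} of spend: ${fee}")
```

Under these assumptions, a team spending $20,000 a month pays about $1,000 for the managed layer, which is the figure to weigh against the engineering time and security review needed to run a self-hosted proxy.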
The next step is consolidation around a few trusted control planes. Smaller teams will keep picking managed gateways because they remove operational work, and larger companies will demand the same routing logic with stronger security, policy controls, and enterprise workflows. The winners will look less like simple proxies and more like the system of record for how AI traffic gets routed, observed, and paid for.