Anthropic Leads Multi-Model Enterprise Stacks
Augusto Marietti, CEO of Kong, on the end of tokenmaxxing
Anthropic’s lead in Kong’s traffic means enterprise AI has already moved past benchmark shopping and into choosing a production default. In practice, teams are not picking one model for every task. They are standardizing on Claude for a large share of real work, especially code, long documents, and internal copilots, then using a gateway to send simpler prompts to cheaper models or open-source options when quality is less critical.
-
Kong sits in the control layer, where it sees enterprises' actual model calls rather than survey answers. The same interview says most enterprises run five or six LLMs, usually including one open-source model, which makes Anthropic’s lead notable: it is winning inside a multi-model stack rather than through exclusivity.
-
Anthropic’s product shape fits enterprise workloads unusually well. Claude has been built and positioned around long-context and coding workflows, while Claude Code and MCP support make it easier to plug Claude into internal developer and agent systems. That helps explain why usage share can outrun headline model rankings.
-
The economic consequence is more gateway routing, not winner-take-all lock-in. Kong describes enterprises using semantic routing, caching, prompt compression, and throttling so that expensive frontier models handle only the hard requests. OpenAI’s own prompt caching docs show why this matters: caching repeated context can materially cut latency and input cost.
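To make the routing idea concrete, here is a minimal sketch of a gateway that sends hard prompts to a premium model and everything else to a cheaper one, with a cache in front. It is an illustration under stated assumptions, not Kong's implementation: the model names, the `HARD_HINTS` keyword list, the `looks_hard` heuristic, and the `Gateway` class are all hypothetical. A real gateway would likely classify prompts with embeddings rather than keywords, and the exact-match cache here is a simpler stand-in for semantic caching. Prompt compression and throttling are left out.

```python
import hashlib
from dataclasses import dataclass, field

# Hypothetical tier names; placeholders, not real model identifiers.
PREMIUM = "premium-frontier-model"
CHEAP = "cheap-or-open-source-model"

# Assumed keyword hints standing in for a real semantic classifier.
HARD_HINTS = ("refactor", "stack trace", "contract", "prove", "debug")


def looks_hard(prompt: str, max_easy_chars: int = 400) -> bool:
    """Crude stand-in for semantic routing: long prompts, or prompts
    containing a 'hard' keyword, are treated as premium-tier work."""
    text = prompt.lower()
    return len(prompt) > max_easy_chars or any(h in text for h in HARD_HINTS)


@dataclass
class Gateway:
    # Exact-match response cache keyed by a hash of the prompt.
    cache: dict = field(default_factory=dict)
    # Count of upstream calls actually made, per tier.
    calls: dict = field(default_factory=lambda: {PREMIUM: 0, CHEAP: 0})

    def route(self, prompt: str) -> str:
        """Pick the tier for a prompt."""
        return PREMIUM if looks_hard(prompt) else CHEAP

    def complete(self, prompt: str, call_model) -> str:
        """Return a cached answer if one exists; otherwise route the
        prompt, call the chosen model, and cache the result."""
        key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
        if key in self.cache:
            return self.cache[key]
        model = self.route(prompt)
        self.calls[model] += 1
        answer = call_model(model, prompt)
        self.cache[key] = answer
        return answer
```

Here `call_model` is any function taking `(model, prompt)` and returning text, so the sketch can be exercised with a stub before being wired to a real provider. The `calls` counter shows the economic point: a repeated prompt costs nothing upstream, and only prompts flagged as hard reach the expensive tier.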
The next phase is a split market where Anthropic remains the premium default for high-value enterprise tasks, while gateways decide when a cheaper commercial or open-source model is good enough for a given request. That pushes competition away from raw benchmark bragging and toward owning the production workflow, the developer surface, and the control plane around model usage.