Together AI becomes hybrid GPU operator


This shift turns Together from a software markup on rented GPUs into a hybrid operator that can keep more of each dollar of compute revenue. Early on, Together won by wrapping CoreWeave and Lambda capacity in a simpler product for startups, offering per-token APIs and rented clusters instead of long reservations. Owning data center capacity in Maryland and Sweden, with a planned site in Memphis, lets it capture more of the infrastructure margin that pure resellers leave with suppliers, while keeping the developer workflow that made it useful in the first place.

  • Historically, the split was clear. CoreWeave and Lambda buy GPUs, lock customers into reserved clusters, and earn much higher infrastructure margins. Together rented that capacity, repackaged it with open model hosting and training tools, and ran at roughly 45% gross margin instead of the ~85% profile estimated for CoreWeave.
  • For customers, this means Together can look more like one vendor across both inference and training. A startup can hit an API for open models, then rent dedicated H100 or Blackwell clusters for fine-tuning or serving, without separately negotiating with a raw GPU cloud. That convenience is the product, not just the silicon.
  • The comparison to Lambda and CoreWeave is useful because they still win where buyers want bespoke clusters, the lowest per-GPU pricing, and direct control over hardware. Together is moving toward partial ownership because the closer it gets to the metal, the more of the margin and performance it can capture without giving up its software layer.
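The single-vendor workflow described above can be illustrated with a minimal sketch of a per-token inference call. This is not Together's official client; the endpoint and model name are assumptions based on Together's OpenAI-compatible chat completions API, and `TOGETHER_API_KEY` is a hypothetical environment variable you would set yourself.

```python
# Illustrative sketch of a per-token API call to a hosted open model.
# Endpoint and model name are assumptions; substitute values from your account.
import json
import os
import urllib.request

API_URL = "https://api.together.xyz/v1/chat/completions"  # assumed endpoint


def build_request(prompt: str,
                  model: str = "meta-llama/Llama-3.3-70B-Instruct-Turbo"
                  ) -> urllib.request.Request:
    """Build a pay-per-token inference request against an open model."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 64,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {os.environ.get('TOGETHER_API_KEY', '')}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_request("Summarize renting vs. owning GPU capacity.")
    with urllib.request.urlopen(req) as resp:  # network call; needs a valid key
        print(json.loads(resp.read())["choices"][0]["message"]["content"])
```

The point of the sketch is the contrast with a raw GPU cloud: the same vendor that answers this metered request can also lease the customer a dedicated cluster, with no second negotiation.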

The next phase is a convergence of GPU cloud and AI platform. As Together adds owned capacity and larger power footprints, it can serve bigger enterprise and sovereign workloads while defending API pricing with better latency, reliability, and data residency. That makes it less dependent on upstream suppliers and pushes it closer to being a full stack AI cloud, not just a reseller.