Together AI: the $44M/year Vercel of generative AI

About 90% of revenue comes from sales of bundled on-demand GPU compute and training.

This revenue mix shows that Together AI was still fundamentally a compute reseller with a smoother developer wrapper, not yet a software-heavy AI platform. Most of the money came from customers buying raw model runtime and training capacity through an easier interface, with usage billed in API-style units instead of reserved GPUs by the hour. That drove fast adoption among startups, but it also kept gross margins closer to infrastructure than to software.
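The billing contrast above can be sketched with toy numbers. All prices and throughput figures here are illustrative assumptions, not Together AI's actual rates; the point is only the shape of the two models:

```python
# Toy comparison of the two billing models described above.
# All rates are hypothetical, chosen for illustration only.

RESERVED_GPU_PER_HOUR = 2.50      # assumed hourly rate for a reserved GPU
TOKENS_PER_GPU_HOUR = 2_000_000   # assumed throughput at full utilization
API_PRICE_PER_1M_TOKENS = 2.00    # assumed usage-based API price

def reserved_cost(hours: float) -> float:
    """Customer pays for the GPU whether it is busy or idle."""
    return hours * RESERVED_GPU_PER_HOUR

def api_cost(tokens: int) -> float:
    """Customer pays only for tokens actually processed."""
    return tokens / 1_000_000 * API_PRICE_PER_1M_TOKENS

# A small customer that uses only 10% of a GPU-hour's capacity:
tokens_used = int(0.10 * TOKENS_PER_GPU_HOUR)     # 200,000 tokens
print(f"reserved hour: ${reserved_cost(1):.2f}")  # pays for the idle 90% too
print(f"api usage:     ${api_cost(tokens_used):.2f}")
```

Under these assumed rates, the low-utilization customer pays a fraction of the reserved-hour price, and the idle 90% of the GPU-hour becomes the platform's risk to pool across customers, which is exactly the packaging described below.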

  • Together AI sat one layer above GPU clouds like CoreWeave and Lambda Labs. Those companies sold long-lived GPU access and data center capacity, while Together packaged that capacity into on-demand inference and training endpoints that absorbed idle-time risk for smaller customers.
  • That packaging explains the margin gap. CoreWeave scaled into much higher revenue with infrastructure economics and production cloud tooling, while Together, at $44M annualized revenue, was still getting about 90% of sales from bundled compute and training, with gross margin around 45%.
  • The comparable path is toward higher-value software on top of compute. Later research shows Together expanding as an API and compute layer for training and deploying open-source models, while adjacent companies like Fireworks and Fal.ai also compete by turning GPU time into easier model-serving products.
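The figures cited in the bullets above imply the following split, using only the source's numbers ($44M annualized revenue, ~90% compute mix, ~45% gross margin):

```python
# Arithmetic on the figures cited above; no new data, just the implied split.
annualized_revenue = 44_000_000
compute_share = 0.90    # share of revenue from bundled compute and training
gross_margin = 0.45     # reported gross margin

compute_revenue = annualized_revenue * compute_share   # ~$39.6M from compute/training
gross_profit = annualized_revenue * gross_margin       # ~$19.8M gross profit
cost_of_revenue = annualized_revenue - gross_profit    # ~$24.2M, mostly GPU time

print(f"compute revenue:  ${compute_revenue:,.0f}")
print(f"gross profit:     ${gross_profit:,.0f}")
print(f"cost of revenue:  ${cost_of_revenue:,.0f}")
```

The takeaway is that more than half of every revenue dollar went back out as cost of revenue, which is infrastructure-shaped economics rather than the 70-80%+ gross margins typical of software.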

The next phase is moving more revenue from resold GPU time into higher-margin orchestration, routing, optimization, and managed model services. If Together keeps owning the developer workflow while compute prices keep falling underneath it, the business can compound like a software layer built on increasingly commoditized infrastructure.