Thinking Machines Needs Massive GPU Capacity
This is the core bottleneck separating AI labs with money from AI labs with staying power. Thinking Machines can hire researchers and raise billions, but turning that into competitive models requires reserved clusters, high-bandwidth networking, and long-term supply contracts that are increasingly controlled by infrastructure providers building their own AI businesses. The market has already shown that the labs with the strongest compute positions are the ones tied most tightly to major cloud and infrastructure partners.
-
The closest comps are GPU clouds like CoreWeave, Lambda, and Fluidstack, which grew fast by packaging scarce NVIDIA capacity into something AI teams could actually use. CoreWeave scaled to $1.9B in revenue in 2024 and signed an OpenAI infrastructure deal worth up to $11.9B, which shows how concentrated frontier compute access has become around a few specialist suppliers.
-
A middle layer has emerged on top of those suppliers. Together AI built its business on top of CoreWeave and Lambda, proving that startups can rent compute instead of owning data centers. But that also means their margins and roadmap are partly downstream of someone else controlling chip supply, pricing, and queue priority.
-
The strongest frontier labs now lock in compute through strategic cloud alliances, not spot purchases. Anthropic made AWS its primary cloud and training partner, with Amazon investing up to $8B, while also expanding TPU usage with Google Cloud. That is the pattern Thinking Machines will likely need to replicate at scale.
Going forward, winning AI labs will look more like infrastructure coalitions than standalone model companies. Thinking Machines will need to convert its funding and talent into durable reserved capacity, likely through multi-year partnerships with specialist GPU clouds or a hyperscaler, because compute is becoming the hard gate that determines who can keep training, shipping, and competing at frontier scale.