NeoClouds versus Hyperscalers for ML
Lambda customer at Iambic Therapeutics on GPU infrastructure choices for ML training and inference
This split matters because GPU cloud buyers are not choosing a generic cloud; they are choosing between cheap raw compute for training and polished infrastructure for always-on production workloads. In practice, NeoClouds win when a team needs custom networking, dense multi-GPU clusters, and lower hourly prices, while hyperscalers win when the job is tying inference into a broader enterprise stack with mature security, networking, and support.
-
NeoClouds earned share by being willing to customize the cluster itself. In Iambic's case, Lambda and CoreWeave were open to selling InfiniBand-based training setups that AWS would not spec at the time, and they did it at a lower price. That is a concrete advantage for training teams that care more about interconnect speed than about polished cloud tooling.
-
The economics map cleanly to customer segment. Lambda is pushing lower-cost H100 access and simpler self-serve workflows for smaller teams, while CoreWeave has moved upmarket into giant reserved-capacity deals and large-scale HGX clusters for customers committing thousands of GPUs over multiple years.
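The scale gap between those two segments is easy to see with back-of-the-envelope arithmetic. The sketch below uses purely hypothetical per-GPU-hour rates (no vendor quotes); only the cluster sizes reflect the order of magnitude described above.

```python
# Illustrative only: rates below are assumptions, not published prices.

def cluster_cost(gpus: int, hours: float, rate_per_gpu_hour: float) -> float:
    """Total compute spend for a cluster at a flat per-GPU-hour rate."""
    return gpus * hours * rate_per_gpu_hour

# A small team renting 8 H100s on demand for one month
# (assumed $2.50/GPU-hour on-demand rate):
on_demand = cluster_cost(gpus=8, hours=30 * 24, rate_per_gpu_hour=2.50)

# An upmarket reserved deal: 4,096 GPUs committed for 3 years
# (assumed $1.50/GPU-hour committed rate):
reserved = cluster_cost(gpus=4096, hours=3 * 365 * 24, rate_per_gpu_hour=1.50)

print(f"on-demand month : ${on_demand:,.0f}")   # tens of thousands of dollars
print(f"3-year reserved : ${reserved:,.0f}")    # well over $100M
```

The point is not the specific dollar figures but the four-orders-of-magnitude spread: self-serve hourly billing and multi-year reserved contracts are structurally different businesses, which is why Lambda and CoreWeave can diverge without colliding.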
-
Hyperscalers are not technically weak; they are optimized differently. Google offers A3 Mega H100 instances with GPUDirect and up to 1,800 Gbps of networking, AWS offers P5 instances in UltraClusters with EFA, and Oracle offers RDMA cluster networks, but these products sit inside heavier enterprise environments and usually come with more process and a higher total cost.
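To see why training teams fixate on these bandwidth numbers, a simplified ring all-reduce model is enough. The sketch below is a first-order estimate that ignores latency, protocol overhead, and compute/communication overlap; the 14 GB gradient payload (roughly a 7B-parameter model in fp16) and the 8-node cluster size are assumptions for illustration.

```python
def allreduce_seconds(payload_gb: float, n_nodes: int, link_gbps: float) -> float:
    """First-order ring all-reduce time: each node sends and receives
    ~2*(N-1)/N times the payload over its slowest inter-node link."""
    payload_gbit = payload_gb * 8  # gigabytes -> gigabits
    traffic_gbit = 2 * (n_nodes - 1) / n_nodes * payload_gbit
    return traffic_gbit / link_gbps

# Assumed 14 GB gradient payload (~7B params in fp16), 8 nodes:
fast = allreduce_seconds(14, 8, 1800)   # 1,800 Gbps class networking
slow = allreduce_seconds(14, 8, 100)    # commodity 100 Gbps Ethernet

print(f"1,800 Gbps: {fast:.3f} s per all-reduce")
print(f"  100 Gbps: {slow:.3f} s per all-reduce")
```

With thousands of gradient synchronizations per training run, the gap between a tenth of a second and a couple of seconds per step compounds into days of wall-clock time, which is why interconnect spec, not just GPU count, drives the purchasing decision.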
-
The market is heading toward a clearer barbell. CoreWeave is becoming the high-end specialist for massive reserved clusters, hyperscalers remain the default home for enterprise inference, and the open lane is a simpler "DigitalOcean for ML" that bundles cheap GPUs with a much better developer workflow. That is the lane where Lambda can deepen its edge.