Training on Lambda and Inference on AWS

Lambda customer at Iambic Therapeutics on GPU infrastructure choices for ML training and inference

From the interview: "These providers will charge you much higher prices and want to sign much bigger and longer-term contracts, but they offer a more enterprise, more robust, more fully-fledged experience."

The real moat here is not the GPU itself; it is everything wrapped around it once a workload becomes business-critical. Hyperscalers charge more because they sell a finished operating environment, not just compute: mature security reviews, predictable procurement, integrated storage and networking, infrastructure as code, and on-demand reliability for production systems. For teams training large models, that bundle often matters less than raw price and cluster flexibility. For inference and customer-facing workloads, it matters a lot more.

  • In practice, enterprise means the boring but essential parts already work. At Iambic, training ran on reserved Lambda clusters with custom InfiniBand networking and hands-on Kubernetes support, while inference stayed on AWS because the team valued always-available instances, S3 durability, and mature tooling more than lower GPU prices.
  • The price gap can be large enough to outweigh brand comfort. Iambic described NeoCloud pricing as roughly half of hyperscaler pricing on H100 hours, and chose between Lambda and CoreWeave mainly on the final per-GPU quote once both could meet the required HGX and interconnect spec.
  • The market is segmenting by workflow. CoreWeave has scaled into enterprise-grade, production GPU infrastructure at much larger revenue scale, while Lambda has stayed more oriented toward flexible growth-stage customers and researcher-friendly workflows. That is why the same company can buy cheap training from Lambda and premium reliability from AWS.
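The pricing arithmetic behind the second bullet can be sketched quickly. Every number below is an illustrative assumption, not a figure from the interview; the only grounded input is the rough 2x ratio between hyperscaler and NeoCloud H100 rates, and the rates, cluster size, and run length are hypothetical placeholders.

```python
# Illustrative cost comparison for a GPU training run.
# All rates, GPU counts, and durations are assumptions chosen to show the
# arithmetic, not quoted prices from any provider.

def cluster_cost(gpu_hourly_rate: float, num_gpus: int, hours: float) -> float:
    """Total cost of running a homogeneous GPU cluster for a given duration."""
    return gpu_hourly_rate * num_gpus * hours

# Assumed H100 rates, with the NeoCloud at roughly half the hyperscaler rate.
hyperscaler_rate = 4.00  # $/GPU-hour (assumption)
neocloud_rate = 2.00     # $/GPU-hour (assumption)

gpus = 64
train_hours = 30 * 24    # a month-long training run (assumption)

hyperscaler_total = cluster_cost(hyperscaler_rate, gpus, train_hours)
neocloud_total = cluster_cost(neocloud_rate, gpus, train_hours)

print(f"hyperscaler: ${hyperscaler_total:,.0f}")  # $184,320
print(f"neocloud:    ${neocloud_total:,.0f}")     # $92,160
print(f"savings:     ${hyperscaler_total - neocloud_total:,.0f}")
```

At this scale the absolute gap (roughly $90K per month under these assumptions) is large enough to drive the training workload to whichever provider quotes the lowest per-GPU rate, while the smaller, steadier inference bill makes the hyperscaler premium easier to absorb.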

Over time, the gap should narrow from both sides. NeoClouds are adding more packaged cluster products and enterprise controls, while hyperscalers keep improving AI-specific networking and reservable capacity. The likely end state is a split market, with raw, customizable GPU clouds winning training first, and the most complete cloud operating environments keeping an edge in inference and regulated production workloads.