Lambda's bespoke clusters cause lock-in
Lambda customer at Iambic Therapeutics on GPU infrastructure choices for ML training and inference
This reveals that Lambda is not selling interchangeable GPU hours; it is selling a semi-custom training environment that gets harder to leave once a team has wired its jobs, storage, security, and cluster management into it. At Iambic, that meant reserved HGX-style clusters for training, custom Kubernetes support, a custom storage setup, and even an air-gapped first cluster, all while staying close enough to market pricing that a small hourly discount elsewhere did not justify the migration work.
-
The lock-in is operational, not contractual. Iambic had to design around specific interconnect, cluster-architecture, and workflow requirements for large synchronous training runs. Lambda and CoreWeave were willing to spec those needs in late 2023, while AWS and Oracle could not deliver the required InfiniBand quality in time.
-
This is where Lambda sits in the market. CoreWeave has scaled around larger enterprise-style deployments and production tooling, while Lambda has won with researchers, startups, and growth-stage teams that want cheaper reserved capacity and more flexibility in how clusters are set up and managed.
-
The support model matters because training teams often need a provider to help keep unusual setups running. Iambic describes direct engineering help with Kubernetes, informal Slack access, and a collaborative process for changing terms or asking for favors. That kind of service turns infrastructure into a working relationship, not just a rental.
As GPU supply normalizes, the winners in training cloud will be the providers that turn raw hardware into a smoother daily workflow for researchers. Lambda is already moving from bespoke setups toward more standardized one-click clusters, and if it can keep enough of that early flexibility while productizing it, that becomes a durable advantage with midsize AI teams.