Self-Service GPU Cloud for Research
A Lambda customer at Iambic Therapeutics on GPU infrastructure choices for ML training and inference
The opening is at the workflow layer, not the GPU layer. Cheap H100 hours already exist, but researchers still spend too much time stitching together SSH access, schedulers, storage, containers, and experiment tracking before a training run even starts. The winning product is a cloud where a scientist can reserve a cluster, open JupyterLab, mount data, submit a Slurm job, and monitor results without needing an internal platform team.
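As a concrete sketch of that last step, this is roughly all the glue a submit-and-monitor loop should require when the scheduler is already wired up: a minimal Python wrapper around Slurm's standard `sbatch` and `squeue` commands. The batch script name (`train.sbatch`) and the polling interval are illustrative assumptions, not tied to any particular provider.

```python
# A minimal sketch, assuming a working Slurm cluster and a batch script
# (train.sbatch, hypothetical) already on shared storage. Uses only
# standard Slurm CLI commands via subprocess.
import subprocess
import time

def submit(script: str = "train.sbatch") -> str:
    # `sbatch --parsable` prints just the job id (optionally ";cluster").
    out = subprocess.run(
        ["sbatch", "--parsable", script],
        capture_output=True, text=True, check=True,
    )
    return out.stdout.strip().split(";")[0]

def wait_for(job_id: str, poll_seconds: int = 30) -> None:
    # `squeue -h -j <id>` prints nothing once the job leaves the queue.
    while True:
        out = subprocess.run(
            ["squeue", "-h", "-j", job_id],
            capture_output=True, text=True,
        )
        if not out.stdout.strip():
            return
        time.sleep(poll_seconds)

if __name__ == "__main__":
    job_id = submit()
    print(f"submitted Slurm job {job_id}")
    wait_for(job_id)
    print("job left the queue; inspect logs or `sacct` for results")
```

The point is not the few lines of Python; it is that on a well-packaged cluster this is the entire gap between a researcher and a running training job.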
-
The current split is clear. NeoClouds like Lambda and CoreWeave win training workloads with lower prices and a willingness to customize interconnect and cluster design, while AWS wins inference because teams can reliably spin up instances, plug into S3 and EKS, and run production services with less operational risk.
-
Lambda is moving toward this middle ground. Its 1-Click Clusters bundle InfiniBand-connected H100 and B200 nodes with managed Kubernetes or Slurm, S3-compatible storage, SSH access, and JupyterLab, all from the console. That is much closer to a usable research environment than raw rented GPUs.
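To make "managed Slurm plus InfiniBand" concrete, here is a hedged sketch of what a researcher's multi-node PyTorch entry point can look like on such a cluster: Slurm's environment variables drive the torch.distributed rendezvous, and NCCL picks up the InfiniBand fabric on its own. The port number and the assumption that MASTER_ADDR is exported by the batch script are illustrative, not Lambda-specific settings.

```python
# A minimal sketch, assuming one Slurm task per GPU (e.g. launched with
# `srun python train.py`) and MASTER_ADDR exported by the batch script,
# for instance:
#   export MASTER_ADDR=$(scontrol show hostnames "$SLURM_NODELIST" | head -n1)
import os
import torch
import torch.distributed as dist

def init_distributed() -> tuple[int, int]:
    # Map Slurm's task layout onto torch.distributed's env:// rendezvous.
    os.environ.setdefault("RANK", os.environ["SLURM_PROCID"])
    os.environ.setdefault("WORLD_SIZE", os.environ["SLURM_NTASKS"])
    os.environ.setdefault("MASTER_PORT", "29500")  # arbitrary free port
    torch.cuda.set_device(int(os.environ["SLURM_LOCALID"]))
    # NCCL detects and uses InfiniBand automatically when present.
    dist.init_process_group(backend="nccl")
    return dist.get_rank(), dist.get_world_size()

if __name__ == "__main__":
    rank, world_size = init_distributed()
    # Sanity check: one all-reduce across every GPU in the allocation.
    t = torch.ones(1, device="cuda")
    dist.all_reduce(t)
    if rank == 0:
        print(f"all_reduce across {world_size} ranks -> {t.item()}")
    dist.destroy_process_group()
```

Everything a platform team would normally own here (hostfiles, launcher configuration, fabric tuning) collapses into a handful of environment variables the scheduler already provides.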
-
The closest comparable is Together AI, which wraps infrastructure in a higher-level developer platform for pre-training, fine-tuning, and inference. The tradeoff is that higher abstraction tends to fit startup and inference use cases better than it fits bespoke training teams that want direct control over hardware, networking, and scheduling.
-
This category should converge on simpler, self-service training clouds with enough opinionated tooling to remove setup work, but not so much abstraction that researchers lose control. If Lambda can keep NeoCloud pricing while making clusters feel as easy to use as a mainstream developer cloud, it can expand beyond infrastructure buyers to become the default home for small and mid-sized AI research teams.