Modular bridges software for custom silicon
The real opening for Modular is that many edge chip companies are strong at designing silicon and weak at making it usable by AI developers. A car chip vendor or mobile SoC maker may have fast hardware, but if developers cannot take a PyTorch or ONNX model, compile it, tune kernels, and ship updates, the chip is hard to adopt. Modular sits in that missing software layer with Mojo and MAX, which retarget models across CPUs, GPUs, and custom accelerators from one codebase.
-
The incumbent pattern is clear. NVIDIA ships CUDA, TensorRT, and Triton as a full stack around its chips. AWS does the same with Neuron for Trainium and Inferentia. That is why smaller silicon vendors are exposed: they often have hardware but not a mature compiler, runtime, and kernel toolchain.
-
Mobile and edge already show what a partial solution looks like. Arm pushes KleidiAI through ONNX Runtime so developers can speed up models on Arm CPUs and, where available, mobile accelerators. That helps on standard Arm platforms, but custom ASIC vendors still need someone to bridge their specific hardware into common AI workflows.
-
Modular is trying to sell exactly that bridge. Mojo compiles Python-like code into low-level kernels, MAX packages models from TorchScript, ONNX, or Mojo Graph, and the same container can move from a laptop CPU to data-center GPUs or future accelerators. That makes Modular valuable to chip makers that want software support without building a full developer ecosystem themselves.
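The retargeting idea above can be sketched generically. This is not Mojo or MAX's actual API; it is a hypothetical, stdlib-only illustration of the pattern the paragraph describes: one model graph plus per-target kernel tables, so the same definition runs on any backend that registers kernels.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical sketch (not Modular's real API): a portable op graph,
# executed by whichever hardware backend is selected at run time.

@dataclass
class Op:
    name: str       # abstract operation, e.g. "add" or "mul"
    operand: float  # constant operand for this toy example

# Each target registers its own kernel implementations for the same op set.
BACKENDS: Dict[str, Dict[str, Callable[[float, float], float]]] = {
    "cpu":         {"add": lambda x, c: x + c, "mul": lambda x, c: x * c},
    "accelerator": {"add": lambda x, c: x + c, "mul": lambda x, c: x * c},
}

def run(graph: List[Op], target: str, x: float = 0.0) -> float:
    """Execute the same graph on any registered target."""
    kernels = BACKENDS[target]
    for op in graph:
        x = kernels[op.name](x, op.operand)
    return x

graph = [Op("add", 3.0), Op("mul", 2.0)]
print(run(graph, "cpu"))          # 6.0
print(run(graph, "accelerator"))  # 6.0: same graph, different target
```

The value a vendor buys is the right-hand side of that table: someone else maintains the kernels and the lowering, and the model definition never changes.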
If edge AI keeps spreading into cars, phones, robots, and industrial devices, the winning software layer will be the one that turns unfamiliar hardware into something developers can use with familiar model formats and APIs. That pushes Modular toward becoming infrastructure for every ambitious chip vendor that cannot afford to build its own CUDA-style ecosystem.