Inference Costs Outweigh Training

Chris Lu, co-founder of Copy.ai, on the future of generative AI

Interview
Training models is actually really cheap now.

Cheap training shifts the bottleneck in AI apps from building models to paying for usage at scale. For Copy.ai, the expensive part is every live generation request that hits a model in production, not the periodic job of fine-tuning one. That matters because a company with strong workflow data can keep training small, task-specific models cheaply, then save money or improve speed by routing everyday jobs away from the most expensive frontier APIs.
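The routing idea above can be sketched as a simple per-task lookup: cheap fine-tuned models handle everyday jobs, and only requests outside known tasks fall back to the frontier API. All model names and per-token prices here are illustrative assumptions, not Copy.ai's actual stack or pricing.

```python
# Hypothetical task-based model router. Everyday tasks go to cheap
# fine-tuned models; anything unrecognized falls back to the frontier API.
ROUTES = {
    "subject_line": {"model": "ft-subject-v3", "cost_per_1k_tokens": 0.0004},
    "blog_outline": {"model": "ft-outline-v1", "cost_per_1k_tokens": 0.0004},
    "long_form":    {"model": "frontier-api",  "cost_per_1k_tokens": 0.03},
}

def route(task: str, tokens: int) -> tuple[str, float]:
    """Pick a model for the task and estimate the cost of the request."""
    entry = ROUTES.get(task, ROUTES["long_form"])  # default: frontier model
    return entry["model"], tokens / 1000 * entry["cost_per_1k_tokens"]

model, cost = route("subject_line", 500)
print(model, cost)  # ft-subject-v3 0.0002
```

At identical token counts, the sketch's assumed prices make the routed request roughly 75x cheaper than sending it to the frontier model, which is the economic argument for running a portfolio of narrow models.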

  • Copy.ai described people as its biggest cost, OpenAI as the second, and training as a small line item versus production API spend. In plain terms, once a model is trained, the real bill comes from thousands or millions of users pressing generate every day.
  • Copy.ai had already built 20 to 30 fine-tuned models by late 2022, using signals like copy, save, and rewrite behavior as training data. That shows the edge is not owning one giant model but collecting feedback from real user workflows and turning it into many narrow models that each do one job well.
  • This is why AI writing tools moved upmarket: ChatGPT compressed the value of generic text generation, so companies like Copy.ai and Jasper had to embed AI into sales and marketing workflows, where custom data, approvals, CRM actions, and multi-step execution matter more than raw model size.
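Turning workflow signals like copy, save, and rewrite into training data can be sketched as a simple labeling pass: outputs the user kept become positive examples, outputs the user rewrote become negatives. The event fields and labels below are assumptions for illustration, not Copy.ai's actual schema.

```python
# Hypothetical pipeline step: convert user feedback events into
# labeled fine-tuning examples. Copies/saves are positives; rewrites
# are negatives. Field names are assumed, not Copy.ai's real schema.
def to_training_examples(events: list[dict]) -> list[dict]:
    examples = []
    for e in events:
        if e["action"] in ("copy", "save"):
            examples.append({"prompt": e["prompt"], "completion": e["output"], "label": 1})
        elif e["action"] == "rewrite":
            examples.append({"prompt": e["prompt"], "completion": e["output"], "label": 0})
    return examples

events = [
    {"action": "copy",    "prompt": "Write a tagline", "output": "Ship faster."},
    {"action": "rewrite", "prompt": "Write a tagline", "output": "We are good."},
]
print(to_training_examples(events))
```

Each narrow model then gets trained only on the examples from its own workflow, which is how a few dozen small, cheap fine-tunes can accumulate from ordinary product usage.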

Going forward, the winners in application-layer AI will look less like labs training one flagship model and more like operators running a portfolio of models by task. As base model prices keep falling and fine-tuning gets easier, advantage compounds around proprietary workflow data, smart routing, and owning the business process where inference gets spent.