Sakana Multi Model Orchestration

Sakana AI Company Report

Sakana AI's orchestration approach switches between different AI providers like ChatGPT, Gemini, or DeepSeek based on which performs best for specific sub-problems.

This turns model access into a control-plane business rather than a single-model business. The key move is deciding which model should handle each step of a task, then stitching the best partial answers together. In practice, a hard problem can be split into branches, scored as they evolve, and reassigned to whichever model is strongest for coding, math, search, or language generation at that moment.
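The decompose-route-stitch pattern can be sketched in a few lines. This is a minimal illustration, not Sakana's implementation; the skill-to-provider mapping and all names (`MODELS_BY_SKILL`, `route_step`, `orchestrate`) are hypothetical:

```python
# Hypothetical registry of which provider is strongest per skill.
# The mapping is illustrative, not taken from Sakana's system.
MODELS_BY_SKILL = {
    "coding": "provider_a",
    "math": "provider_b",
    "search": "provider_c",
    "language": "provider_d",
}

def route_step(step):
    """Pick the model for one step of a decomposed task,
    falling back to a general-purpose provider."""
    return MODELS_BY_SKILL.get(step["skill"], "provider_d")

def orchestrate(task_steps, call_model):
    """Run each step on its best-matched model, then stitch
    the partial answers together in order."""
    partials = []
    for step in task_steps:
        model = route_step(step)
        partials.append(call_model(model, step["prompt"]))
    return "\n".join(partials)
```

In a real system the routing decision would be learned and re-evaluated as branches are scored, rather than fixed in a static table.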

  • AB-MCTS applies Monte Carlo Tree Search at inference time, so the system can try multiple reasoning paths, keep expanding the promising ones, and drop weak ones. Sakana also describes a multi-LLM setup where model choice itself is part of the search, maintaining a separate probability model per LLM and using Thompson sampling to pick among them.
  • That differs from a conventional model router, which typically sends the whole prompt to a single model based on cost or latency rules. Here, routing can happen inside a single task: a math-heavy branch could go to one provider while a synthesis step goes to another, which raises answer quality on complex workloads.
  • The business implication is that Sakana can sell managed inference on top of third-party models, even when customers bring their own providers. That makes it closer to an orchestration layer, where value comes from better task decomposition, model selection, and result ranking, rather than from owning the underlying frontier model itself.
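The Thompson-sampling idea in the first bullet can be sketched as a bandit over candidate LLMs: keep a Beta posterior per model from observed wins and losses, sample from each posterior, and route to the highest draw. This is a toy sketch of the general technique, not Sakana's AB-MCTS code; the class and its fields are hypothetical:

```python
import random

class ThompsonModelSelector:
    """Thompson sampling over candidate LLMs. Each model gets a
    Beta(wins + 1, losses + 1) posterior over its success rate;
    selection samples from every posterior and takes the argmax,
    so exploration falls out of posterior uncertainty."""

    def __init__(self, models):
        self.stats = {m: {"wins": 0, "losses": 0} for m in models}

    def pick(self):
        # Draw one sample per model from its Beta posterior.
        draws = {
            m: random.betavariate(s["wins"] + 1, s["losses"] + 1)
            for m, s in self.stats.items()
        }
        return max(draws, key=draws.get)

    def update(self, model, success):
        # Record whether the chosen model's answer scored well.
        key = "wins" if success else "losses"
        self.stats[model][key] += 1
```

Embedded in a tree search, `pick` would run at each expansion step, so the routing decision is revisited branch by branch rather than once per request.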

This is heading toward a market where enterprises buy best answer per dollar, not loyalty to one model vendor. If Sakana keeps improving the search and routing layer, it can sit above OpenAI, Google, DeepSeek, and open models in the stack, and capture spend wherever workloads are complex enough that smart orchestration beats sending every request to one expensive model.