Recall.ai halves bot CPU usage
Recall.ai
This kind of optimization is the difference between a good API business and a bad infrastructure business. Recall.ai sells a product where every extra meeting minute creates real compute cost, so halving per-bot CPU usage directly widens gross margin and lowers the price floor the company can profitably offer. In practice, the team found that moving media out of Chromium over WebSockets was burning far more CPU than expected, replaced that transport with a custom shared-memory ring buffer, and cut per-bot CPU by up to 50%.
-
Recall.ai bots originally needed about 4 CPU cores to run smoothly. That matters because the product is literally a fleet of cloud meeting bots joining Zoom, Meet, and Teams calls, not a lightweight SaaS app serving database reads. Infrastructure efficiency is therefore product strategy, not back-office tuning.
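The cost leverage is simple arithmetic. A back-of-envelope sketch, assuming an illustrative cloud price of $0.04 per vCPU-hour (an assumption for illustration, not Recall.ai's actual rate):

```python
# Marginal compute cost per bot-hour scales linearly with cores, so halving
# CPU halves the marginal cost of every meeting minute.
VCPU_HOUR_USD = 0.04  # illustrative on-demand price, not a real quote


def bot_cost_per_hour(cores: float, vcpu_hour_usd: float = VCPU_HOUR_USD) -> float:
    """Marginal compute cost of one bot running for one meeting hour."""
    return cores * vcpu_hour_usd


before = bot_cost_per_hour(4.0)  # ~4 cores per bot originally
after = bot_cost_per_hour(2.0)   # after the ~50% CPU reduction
print(before, after)             # roughly $0.16/hr vs $0.08/hr per bot
```

At fleet scale, that linear relationship is the whole story: every shaved core multiplies across every concurrent meeting minute.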
-
The bottleneck was internal transport, not the obvious suspect, video codecs. The expensive part was moving raw meeting audio and video between processes, and WebSockets piled framing, masking, fragmentation, and extra memory copies on top of every frame. The replacement was a custom Chromium fork plus a shared-memory transport designed for zero-copy reads.
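To make the transport idea concrete, here is a minimal single-producer, single-consumer ring buffer over shared memory, sketched in Python with `multiprocessing.shared_memory`. The layout (two 8-byte counters followed by a data region) and all names are illustrative assumptions, not Recall.ai's actual format; their implementation lives in a Chromium fork and avoids even the copies shown here.

```python
# Minimal SPSC ring buffer over shared memory. Layout: two 8-byte little-endian
# counters (total bytes written, total bytes read) followed by the data region.
# Illustrative sketch only, not Recall.ai's actual wire format.
import struct
from multiprocessing import shared_memory

HEADER = 16  # 8-byte write counter + 8-byte read counter


class Ring:
    def __init__(self, capacity: int = 1 << 16):
        # The producer creates the segment; a consumer in another process
        # would attach by name with SharedMemory(name=..., create=False).
        self.cap = capacity
        self.shm = shared_memory.SharedMemory(create=True, size=HEADER + capacity)
        self.buf = self.shm.buf
        self.buf[:HEADER] = bytes(HEADER)  # zero both counters

    def _counter(self, off: int) -> int:
        return struct.unpack_from("<Q", self.buf, off)[0]

    def write(self, data: bytes) -> bool:
        w, r = self._counter(0), self._counter(8)
        if len(data) > self.cap - (w - r):
            return False  # would clobber unread frames; caller drops or retries
        pos = w % self.cap
        first = min(len(data), self.cap - pos)  # bytes before wraparound
        self.buf[HEADER + pos:HEADER + pos + first] = data[:first]
        self.buf[HEADER:HEADER + len(data) - first] = data[first:]
        # Publish the new write counter only after the payload is in place.
        struct.pack_into("<Q", self.buf, 0, w + len(data))
        return True

    def read(self, n: int) -> bytes:
        w, r = self._counter(0), self._counter(8)
        n = min(n, w - r)  # never read past what was published
        pos = r % self.cap
        first = min(n, self.cap - pos)
        out = bytes(self.buf[HEADER + pos:HEADER + pos + first]) + \
              bytes(self.buf[HEADER:HEADER + n - first])
        struct.pack_into("<Q", self.buf, 8, r + n)
        return out

    def close(self):
        self.shm.close()
        self.shm.unlink()
```

A real zero-copy version would hand the consumer a memoryview into the segment instead of materializing `bytes`, and a cross-process version needs atomic counter updates with proper memory ordering; both are elided here for brevity.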
-
This is a common pattern in video infrastructure. Mux and Wistia both describe video as unusually heavy on bandwidth, compute, and storage, with pricing pressure pushing vendors to own critical infrastructure and squeeze out inefficiency. The companies that win are the ones that turn low-level systems work into lower COGS and better developer economics.
The next leg of scale comes from repeating this play across the stack. As Recall.ai expands from passive meeting capture into desktop recording, mobile recording, and two-way AI agents in live calls, the companies that can keep cost per minute falling while usage explodes will have the clearest path to durable margins and category leadership.