ElevenLabs at $90M ARR

Jan-Erik Asplund

TL;DR: Sacra estimates that ElevenLabs hit $90M in annual recurring revenue (ARR) in October 2024, up 260% YoY. Now, the Adobe Creative Cloud of AI-generated audio is betting that its premium-priced, human-sounding text-to-speech model will become core infrastructure for the rising wave of AI voice agents. For more, check out our full report and dataset on ElevenLabs.

Chart 1: ElevenLabs at $90M ARR

Key points via Sacra AI:

  • Before ElevenLabs (2022), voice developers were limited to generic text-to-speech voices like Siri (through SiriKit) and Alexa (through the Alexa Skills Kit). ElevenLabs opened the floodgates with its AI model for generating audio across 1,000+ synthetic, human-sounding voices that speak 32 languages. ElevenLabs found product-market fit by letting content creators upload a 30-second voice sample and generate an instant clone, offering the tool for free and charging a monthly subscription fee based on the number of minutes of audio generated (~$0.16 per minute).
  • Like Runway did in AI video, ElevenLabs is moving from foundation model to application layer and building the Adobe Creative Cloud ($13B in yearly revenue) for AI-generated audio, with Sacra estimating ElevenLabs crossed $90M in annual recurring revenue in October 2024, up 260% YoY. By aggressively colonizing long-form audio editing (Descript), video dubbing (Premiere Pro), voiceovers (Adobe Audition), the AI voice marketplace (Envato), and content consumption (Spotify), ElevenLabs has been able to close enterprise deals with big publishers like Time and HarperCollins, raise its effective revenue per API call by 20%, and increase incremental consumption.
  • AI voice agents have strong product-market fit as a 70% cheaper replacement for humans across support calls, appointment bookings, and restaurant reservations, driving the rise of an AI-native audio stack of tools like Cartesia (Index Ventures) for text-to-speech, Deepgram ($86M raised, Madrona Ventures) for speech-to-text, and Hamming ($750K raised, YC S24) for testing. Companies use multiple different text-to-speech providers, mixing and matching depending on latency, cost (ElevenLabs is 5x as expensive as Cartesia per-minute), and developer experience, while all of these tools are now converging on the common feature set of speech-to-text, text-to-speech, and real-time conversational AI.
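
As a rough sanity check on the figures above, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the inputs are the $90M ARR and 260% YoY growth from the TL;DR and the ~$0.16 per minute price from the key points, and the outputs are implied values rather than figures reported by Sacra.

```python
def prior_year_arr(current_arr_musd: float, yoy_growth_pct: float) -> float:
    """ARR (in $M) one year earlier, implied by current ARR and YoY growth in percent."""
    return current_arr_musd / (1 + yoy_growth_pct / 100)


def cost_per_hour(price_per_minute_usd: float) -> float:
    """Cost in USD of one hour of generated audio at a flat per-minute price."""
    return price_per_minute_usd * 60


# Inputs taken from the article; results are implied, not reported.
implied_arr = prior_year_arr(90, 260)   # roughly $25M ARR in October 2023
hourly_cost = cost_per_hour(0.16)       # roughly $9.60 per hour of audio

print(f"Implied ARR a year earlier: ${implied_arr:.1f}M")
print(f"Implied cost per hour of audio: ${hourly_cost:.2f}")
```

Note that "up 260%" means the current figure is 3.6 times the prior one, not 2.6 times, which is why the divisor is 1 plus the growth rate.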

For more, check out this other research from our platform:

ElevenLabs revenue, growth, and valuation

Invisible revenue, growth, and valuation

Replit revenue, growth, and valuation
