HeyGen Margins Depend on Infrastructure

Diving deeper into

HeyGen

Company Report
margin depends on negotiating favorable rates from infrastructure providers
Analyzed 8 sources

This is the core weakness of selling AI video on a flat subscription, while buying the underlying intelligence by usage. HeyGen can charge one simple monthly price to creators and teams, but every generated minute, dubbed track, or live avatar session can trigger real supplier costs from voice and realtime model providers. That makes gross margin less about software code alone, and more about procurement, traffic mix, and how efficiently HeyGen routes demand into the cheapest infrastructure that still looks good enough.

  • HeyGen already prices its API in credits tied to usage, 100 credits for $99 per month, with 1 credit covering 5 minutes of LiveAvatar streaming or 1 minute of Video Avatar output. That pricing structure exists because the product has meterable compute underneath it, even if some app plans look unlimited on the surface.
  • In AI avatars, supplier costs tend to fall with scale, but buyers also force prices down. Research on Tavus and Wistia points to a CDN like dynamic where avatar generation becomes cost of goods for larger apps, and infrastructure vendors get played against each other on price as quality converges.
  • The pressure is strongest because HeyGen does not own every layer. OpenAI prices realtime audio separately, and ElevenLabs keeps cutting conversational AI pricing while charging separately for voice services. When upstream vendors lower price, avatar apps gain relief, but their own differentiation also gets harder as the same building blocks become available to rivals.

The next phase is a shift from reseller economics toward selective vertical integration. The winners in AI avatars will keep moving more of the stack in house where it matters, keep external models where they are cheaper or better, and pass lower costs into new use cases like embedded support agents and localized training libraries before standalone avatar generation becomes a thinner margin feature.