Kling
Generative multimodal AI platform for creating video, images, and audio from text and multimodal prompts

Revenue: $500.00M (2026)
Funding: $17.00M (2024)

Details
Headquarters: Beijing
CEO: Cheng Yixiao
Founding year: 2025

Revenue

Sacra estimates that Kling AI hit $500M in annualized revenue in May 2026, up from $150M at the end of 2025.
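The growth rate implied by these two estimates can be worked out directly. The sketch below assumes the $150M figure is a December 2025 run rate and the $500M figure a May 2026 run rate, roughly five months apart. The function name and the month count are illustrative assumptions, not Sacra's methodology.

```python
def implied_monthly_growth(start_arr: float, end_arr: float, months: int) -> float:
    """Compound monthly growth rate that takes start_arr to end_arr."""
    if start_arr <= 0 or months <= 0:
        raise ValueError("start_arr and months must be positive")
    return (end_arr / start_arr) ** (1 / months) - 1


START_ARR_USD = 150e6   # end of 2025, from the text
END_ARR_USD = 500e6     # May 2026, from the text
MONTHS_ELAPSED = 5      # assumption: end of December to end of May

multiple = END_ARR_USD / START_ARR_USD
rate = implied_monthly_growth(START_ARR_USD, END_ARR_USD, MONTHS_ELAPSED)
print(f"{multiple:.2f}x in {MONTHS_ELAPSED} months, about {rate:.1%} compounded monthly")
```

Under those assumptions the run rate grew a little over 3.3x, which works out to roughly 27% compounded per month.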

Revenue comes from three streams: consumer subscriptions, per-generation credit purchases, and enterprise and API contracts. On the consumer side, monthly subscription bookings exceeded RMB 100M in both April and May 2025, an annualized pace of more than RMB 1.2B, indicating that subscriptions alone had reached substantial scale before enterprise revenue was added.

Blended revenue per creator is low, roughly $4 annually across 60M+ registered creators, which suggests a large free-user base and monetization concentrated in a smaller paying cohort of prosumers and enterprise clients. The 30,000+ enterprise clients disclosed as of December 2025 appear to be the highest-ARPU segment and the primary driver of ARR growth.
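The blended figure is simple division, but the result depends on which revenue base is used, and the text does not state it. As a rough check, the sketch below divides several annualized revenue levels mentioned in this report by the 60M lower bound on registered creators. The function name is illustrative, and treating "60M+" as exactly 60M is an assumption.

```python
REGISTERED_CREATORS = 60_000_000  # "60M+" in the text; lower bound used here


def blended_revenue_per_creator(annualized_revenue_usd: float,
                                creators: int = REGISTERED_CREATORS) -> float:
    """Annualized revenue divided evenly across all registered creators."""
    return annualized_revenue_usd / creators


# Revenue levels that appear in this report (USD, annualized).
for label, arr in [("$150M", 150e6), ("$240M", 240e6), ("$500M", 500e6)]:
    per_creator = blended_revenue_per_creator(arr)
    print(f"{label} ARR -> ${per_creator:.2f} per creator per year")
```

On these inputs, roughly $4 per creator corresponds to about $240M of annualized revenue, while the $150M and $500M bases give about $2.50 and $8.33 respectively.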

Valuation & Funding

As of May 19, 2026, Kling AI has not completed any publicly disclosed external funding round as a standalone entity. On May 12, 2026, Kuaishou Technology, its parent, disclosed that it was assessing a potential restructuring of Kling AI that may involve external financing, and described the proposal as preliminary with no definitive agreements signed.

Media reports at the time said Kuaishou was seeking a $2 billion pre-IPO round at a $20 billion valuation, with Tencent Holdings in discussions as a prospective investor. Neither the round size nor the valuation has been confirmed as closed.
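For scale, the reported terms can be turned into an implied ownership stake and revenue multiple. The reports do not say whether $20 billion is a pre-money or post-money figure, so the sketch below computes both cases. It uses the unconfirmed reported numbers and the $500M annualized revenue estimate from this report, and the function name is illustrative.

```python
def stake_sold(raise_usd: float, valuation_usd: float,
               valuation_is_pre_money: bool) -> float:
    """Fraction of the company new investors would hold after the round."""
    post_money = valuation_usd + raise_usd if valuation_is_pre_money else valuation_usd
    return raise_usd / post_money


RAISE_USD = 2e9          # reported, not confirmed
VALUATION_USD = 20e9     # reported, not confirmed; pre/post-money unspecified
ANNUALIZED_REVENUE_USD = 500e6  # Sacra estimate for May 2026

print(f"If pre-money:  {stake_sold(RAISE_USD, VALUATION_USD, True):.1%} sold")
print(f"If post-money: {stake_sold(RAISE_USD, VALUATION_USD, False):.1%} sold")
print(f"Valuation / annualized revenue: {VALUATION_USD / ANNUALIZED_REVENUE_USD:.0f}x")
```

Either reading implies new investors would hold around 9% to 10% of the entity, at about 40x annualized revenue.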

Kuaishou has provided all capital supporting Kling AI to date, describing the product as internally developed and funding it through its own balance sheet and infrastructure investment.

Product

Kling AI is a multimodal AI creative studio centered on video generation. In the web or mobile app, users choose among text-to-video, image-to-video, avatar, restyle, or the Omni interface, upload reference material, and generate a clip, image, or audio asset in the same workflow.

Its core video product, Video 3.0, supports clips up to 15 seconds at up to 4K resolution, with native audio generated alongside the visuals. In multi-shot mode, users can let the model plan cuts automatically or set each shot's duration and description manually, which makes the product closer to directing a short scene than animating a single prompt.

A key product distinction is consistency across generations. Users can save characters, props, scenes, and voice recordings as reusable elements in an internal library, then reuse those elements in later outputs. That allows a brand to keep a product, mascot, or human model consistent across ad variants rather than having the subject drift between shots. Video 3.0 supports binding up to three additional elements per generation alongside up to seven character references.
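To make the reference limits concrete, here is a minimal sketch of how a client might check a request against them before submitting. The class, field names, and validation messages are entirely hypothetical and do not reflect Kling AI's actual API. Only the two limits (three additional elements, seven character references) come from the text above.

```python
from dataclasses import dataclass, field

MAX_ELEMENTS = 3          # additional elements per generation, from the text
MAX_CHARACTER_REFS = 7    # character references per generation, from the text


@dataclass
class GenerationRequest:
    """Hypothetical request shape, for illustration only."""
    prompt: str
    elements: list[str] = field(default_factory=list)        # saved props, scenes, voices
    character_refs: list[str] = field(default_factory=list)  # saved characters

    def validate(self) -> list[str]:
        """Return a list of limit violations; an empty list means the request fits."""
        problems = []
        if len(self.elements) > MAX_ELEMENTS:
            problems.append(f"{len(self.elements)} elements exceeds limit of {MAX_ELEMENTS}")
        if len(self.character_refs) > MAX_CHARACTER_REFS:
            problems.append(
                f"{len(self.character_refs)} character refs exceeds limit of {MAX_CHARACTER_REFS}"
            )
        return problems


request = GenerationRequest(
    prompt="Mascot hands the product to a customer in a sunlit store",
    elements=["product_bottle", "store_interior"],
    character_refs=["brand_mascot", "customer_model"],
)
print(request.validate())
```

A brand running many ad variants would reuse the same saved identifiers across requests, which is what keeps the subject consistent from one output to the next.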

Avatar 2.0 is built for longer-form presenter-style content up to five minutes. Users upload a portrait image, add speech content or audio, and receive a talking-head video. The product sits alongside cinematic generation rather than replacing it, with one workflow for short narrative scenes and another for spokesperson or explainer content.

The Omni interface accepts mixed inputs, including images, video clips, text, and saved elements, and interprets them as a single prompt. That makes iterative production more practical: a creator can upload multiple angles of a character, a background reference, and a voice sample, then generate a new scene that preserves those inputs without rebuilding the prompt each time.

Business Model

Kling AI is a vertically integrated generative media platform serving individual creators and enterprise clients, with a hybrid monetization model that combines subscriptions with metered credit consumption. Consumer plans run from $10 per month at the Standard tier to $37 per month for Pro and $179.99 per month for Ultra, with one-off credit packs for users who need more generation capacity than their plan includes.

Credits are the core economic unit. Each generation task carries a per-second credit cost, so heavier workloads cost more: a 1080p video with native audio costs 12 credits per second, while a 720p clip without audio costs 6. This pricing structure passes compute variability through to users rather than absorbing it inside a flat fee, and creates an upgrade path as creators move from casual experimentation to production-grade output.
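The per-second pricing can be illustrated with a small calculator. This is a sketch that uses only the two rates quoted above. Rates for other resolution and audio combinations are not given in this report, so the lookup deliberately fails on them rather than guessing. The function and table names are illustrative.

```python
# Credits per second, keyed by (resolution, native_audio). Only the two
# combinations quoted in the text are included.
CREDITS_PER_SECOND = {
    ("1080p", True): 12,
    ("720p", False): 6,
}


def generation_cost(seconds: int, resolution: str, native_audio: bool) -> int:
    """Total credits for one clip at a known per-second rate."""
    key = (resolution, native_audio)
    if key not in CREDITS_PER_SECOND:
        raise KeyError(f"No rate quoted for {resolution}, native_audio={native_audio}")
    return seconds * CREDITS_PER_SECOND[key]


# A 15-second clip is the maximum length cited for Video 3.0.
print(generation_cost(15, "1080p", True))   # 180 credits
print(generation_cost(15, "720p", False))   # 90 credits
```

At these rates, the same 15-second clip costs twice as many credits at 1080p with audio as at 720p without it, which is the upgrade path the pricing is designed to create.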

The enterprise and API channel complements the consumer product and targets agencies, e-commerce teams, animation studios, and software platforms that want to embed video generation into their own workflows. Enterprise clients tend to consume at higher volumes and buy on contract rather than through the App Store, producing higher ARPU than the blended consumer figure.

Kuaishou's broader advertising and commerce ecosystem provides a built-in demand channel that standalone AI video companies do not have. Kuaishou's own merchants and advertisers were already spending more than RMB 30M per day on AI-generated marketing materials in late 2024, giving Kling AI a commercial distribution channel and direct feedback loops on which ad creative performs.

Competition

The AI video market has split between foundation model labs pushing generation quality and workflow platforms packaging that quality into usable creative pipelines. Kling AI is one of the few players trying to own both layers at once.

Vertically integrated model labs

Runway is the clearest direct comparable: a video-specific model lab with a creative workflow layer on top, targeting professional teams and agencies. Runway's recent move to expose third-party models, including Google's Veo 3, through its own API signals a bet on workflow ownership over model exclusivity, a different posture from Kling AI's all-in-one model thesis.

Google Veo, available through the Flow filmmaker interface and Vertex AI for enterprise developers, is the clearest enterprise threat. Google prices Veo 3 video-plus-audio generation at $0.75 per second on Vertex AI and can bundle it into existing cloud procurement relationships where Kling AI has no incumbent position.
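To put the per-second rate in per-clip terms, the sketch below multiplies it out for a single clip and for a batch. The 15-second length mirrors the clip ceiling discussed elsewhere in this report, and the 1,000-clip batch is a hypothetical workload chosen only for illustration. Volume discounts and other Vertex AI charges are ignored.

```python
VEO3_USD_PER_SECOND = 0.75  # video-plus-audio rate on Vertex AI, from the text


def veo3_clip_cost(seconds: float, clips: int = 1) -> float:
    """List-price cost in USD for `clips` clips of `seconds` each."""
    return seconds * VEO3_USD_PER_SECOND * clips


print(f"One 15-second clip: ${veo3_clip_cost(15):.2f}")
print(f"1,000 such clips:   ${veo3_clip_cost(15, clips=1000):,.2f}")
```

At list price that is $11.25 per 15-second clip, so a high-volume ad workload reaches five figures quickly, which is why bundling into existing cloud contracts matters for enterprise buyers.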

ByteDance's Seedance is the most direct product-level rival. Its unified multimodal audio-video architecture, 15-second multi-shot output, and reference and editing support mirror Kling AI's core thesis closely, and ByteDance can distribute through CapCut and TikTok creator ecosystems that already have global scale.

Consumer and creator-focused tools

Pika and Luma compete at the lighter end of the creator market with fast, accessible generation tools and developer-friendly APIs. Neither matches Kling AI's depth on multi-shot control, element consistency, or native audio, but both can capture social content, prototyping, and app-integration use cases where full multimodal production is unnecessary.

MiniMax's Hailuo is a fast-following alternative that competes on cost-performance and third-party platform distribution. Its partnership with VEED for the Hailuo 2.3 launch shows how Chinese model labs can reach Western creators through embedded integrations rather than direct consumer acquisition, a distribution strategy that sidesteps Kling AI's app-level brand advantage.

Higgsfield represents a different architectural bet: a model-agnostic orchestration layer that can swap in whichever underlying video model is best at a given moment. Where Kling AI is a vertically integrated model-plus-product stack, Higgsfield treats models as interchangeable. Both were tracking around $230–240M ARR in early 2026, making them the two clearest revenue leaders in AI video for marketers.

Model aggregators and commoditization pressure

The structural risk is that aggregators capture the developer relationship while model owners compete on price. Kling AI's response is to make its product surface sticky enough that the platform, rather than the underlying model, is what users are buying.

TAM Expansion

Kling AI's expansion logic runs in three directions at once: deeper into professional creative workflows, broader across enterprise verticals, and wider geographically through multilingual capabilities that make localization a product feature.

New products and workflow depth

Kling AI's push into native 4K output, automated storyboarding, and shot-level direction moves it from single-asset generation toward software that can replace portions of a production pipeline. As it absorbs more steps, including scripting, pre-visualization, motion reference, dubbing, and asset management, its TAM starts to resemble the combined budgets of creative software, production services, and performance-content tooling rather than the narrower AI video category.

Avatar 2.0's support for presenter-style videos of up to five minutes opens a separate lane in spokesperson video, sales enablement, and training content, outside the short cinematic generation market. Virtual try-on for e-commerce and product demo generation for merchants add product surfaces that shift demand from one-time generation to recurring business-process usage.

Customer base expansion

The most valuable expansion path is into marketing organizations and commerce sellers that need high-volume, localized, brand-consistent creative at a fraction of traditional production cost. Kuaishou's existing merchant and advertiser relationships provide a warm channel into that buyer profile, making Kling AI the visual generation layer inside commercial workflows rather than a standalone app. Platforms like OpenArt and Higgsfield already use Kling AI as an underlying generation primitive, and deeper embedded distribution relationships, alongside direct API sales to agencies and software builders, create a more durable revenue base than consumer subscriptions alone.

Geographic expansion

Kling AI's multilingual native audio and lip-sync capabilities across Chinese, English, Japanese, Korean, and Spanish make geographic expansion a product feature rather than only a sales motion. A brand can generate one visual asset and localize it across languages and regional accents in the same workflow, which is particularly valuable for cross-border e-commerce and global performance marketing.

Creator entries from 122 countries through Kling AI's NEXTGEN program and a May 2026 App Store ranking of number one overall in 42 countries indicate that international demand is already pulling the product outward. The viral template dynamic, where a trend spreads on social platforms, creators try to recreate it, and new users download the app, is a low-cost acquisition engine that works across markets without requiring local sales infrastructure.

The deepest international opportunity is in mobile-first markets where short-form video, creator monetization, and localized commerce content converge. Kuaishou's short-video background gives Kling AI experience at that intersection that pure model labs and Western creative tools are still developing.

Risks

Model commoditization: Google Veo, ByteDance Seedance, MiniMax Hailuo, and open-weight alternatives like Hunyuan Video are converging on the same core buying criteria (multi-shot control, native audio, reference consistency, and 4K output). As they do, the differentiation that justified Kling AI's early pricing premium erodes, and competition shifts toward distribution scale and workflow lock-in rather than generation quality.

Compute margin pressure: Kling AI's push into native 4K, 15-second multi-shot clips, simultaneous audio-visual generation, and motion control increases inference intensity. If market pricing falls faster than GPU costs decline, the credit-based model that currently protects unit economics may not be enough to sustain margin expansion while the company continues feature development.

Regulatory and content liability: Kling AI operates a platform that generates realistic talking people, branded visuals, and multilingual media across web and mobile in dozens of jurisdictions. That creates compounding exposure to AI-generated media regulations, copyright enforcement, identity and likeness rights, and cross-border content compliance requirements, any of which can slow feature rollout, raise operating costs, and constrain high-value use cases like synthetic spokespeople and commercial deepfakes.
