Invisible vs Mercor

TL;DR: With the rise of frontier AI labs needing human feedback to fix hallucinations and align model behavior, Invisible shifted gears from executive VA services to become an RLHF (reinforcement learning from human feedback) provider for companies like Microsoft, Cohere, and Mistral. Sacra estimates Invisible generated $134M in revenue in 2024, up 123% YoY. For more, read our full report and dataset on Invisible.

Key points via Sacra AI:
- Invisible launched in 2015 as a “better VA service” that let executives delegate high-touch knowledge work to offshore assistants. In 2022, as OpenAI, Anthropic, and others began deploying frontier LLMs at scale, it shifted gears to supplying the labor needed to fix hallucinations and align model behavior through reinforcement learning from human feedback (RLHF). Invisible routes model outputs through trained raters who score completions, rank outputs, and annotate reasoning steps, splitting each task into subtasks that are assigned via its internal workflow platform to a 3,000+ person global workforce. It charges model labs $30–$45/hour for this work while paying raters $15–$20/hour.
- As labs have shifted from next-token prediction to reasoning, demand for expert-annotated data has surged. Sacra estimates that Invisible Technologies, after signing multi-year contracts with Microsoft, Cohere, AI21, Mistral, and Perplexity, generated $134M in revenue in 2024, up 123% YoY from $60M in 2023. Compare that to AI data labeling incumbent Scale AI at $1.5B ARR (up 97% YoY), valued at $25B for a 16.7x multiple, and AI-native upstart Mercor at a $50M revenue run rate as of the end of 2024 (up 4,900% from 2023), valued at $2B for a 40x multiple.
- Invisible is now positioning itself to win where human-in-the-loop is mandatory for compliance, in regulated industries like healthcare, finance, and defense, by packaging vertical-specific RLHF products and embedding itself as enterprises’ full-stack AI training partner. With GPT-4o and Claude 3.5 now matching human raters on 85% of tasks (at 5% of the cost and 20x the speed), Invisible, Scale AI, Mercor, and others must stay ahead of the self-cannibalization feedback loop in training frontier models by drilling down on use cases that demand auditability, traceability, and human accountability.
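
The headline figures in the key points can be checked with a few lines of arithmetic. The sketch below is illustrative only: the helper names are hypothetical, and the margin calculation assumes rater pay is the only cost of service, which the figures above do not state. Pairing the lowest bill rate with the highest pay rate (and vice versa) is likewise an assumption used to bound the range.

```python
# All dollar figures are taken from the key points above (USD millions
# for revenue and valuation, USD per hour for rates).
# Helper names are hypothetical, chosen for illustration.

def yoy_growth_pct(current, prior):
    """Year-over-year growth as a percentage."""
    return (current / prior - 1) * 100

def revenue_multiple(valuation, revenue):
    """Valuation divided by annual revenue (or run rate)."""
    return valuation / revenue

def gross_margin_pct(bill_rate, pay_rate):
    """Margin on an hour of work, ASSUMING rater pay is the only cost."""
    return (bill_rate - pay_rate) / bill_rate * 100

invisible_growth = yoy_growth_pct(134, 60)        # $60M -> $134M
scale_multiple = revenue_multiple(25_000, 1_500)  # $25B on $1.5B ARR
mercor_multiple = revenue_multiple(2_000, 50)     # $2B on $50M run rate

# Bounds on the hourly spread: worst case pairs the lowest bill rate
# with the highest pay rate, best case does the opposite.
margin_low = gross_margin_pct(30, 20)
margin_high = gross_margin_pct(45, 15)

print(f"Invisible YoY growth: {invisible_growth:.0f}%")
print(f"Scale AI multiple: {scale_multiple:.1f}x")
print(f"Mercor multiple: {mercor_multiple:.0f}x")
print(f"Implied hourly margin: {margin_low:.0f}% to {margin_high:.0f}%")
```

Under those assumptions, the quoted rates imply an hourly spread of roughly one third to two thirds of billings, before any platform, management, or quality-control costs.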
For more, check out this other research from our platform:
- Mercor (dataset)
- Scale AI (dataset)
- Scale at $760M ARR
- Scale: the $290M/year Mechanical Turk of machine learning
- Contractor Payroll: The $1.4T Market to Build the Cash App for the Global Labor Market
- Wingspan's 992x growth in contractor payroll
- Ved Sinha, Former VP of Product at Upwork, on gig marketplaces
- Samiur Rahman, CEO of Heyday, on building a production-grade AI stack
- Geoff Charles, VP of Product at Ramp, on Ramp's AI flywheel
- Mike Knoop, co-founder of Zapier, on Zapier's LLM-powered future
- OpenAI (dataset)
- Anthropic (dataset)
- Cursor (dataset)