Invisible at Risk of Automation

If techniques like simulated feedback or AI-to-AI reinforcement learning advance, companies like Invisible could see their cutting-edge services become less critical.

The real risk is that Invisible can be automated from both sides at once: better models doing more of the work, and better evaluation systems reducing how much human judgment labs need to buy. Invisible grew fast by supplying raters for RLHF, reaching an estimated $134M in revenue in 2024 after contracts with Microsoft, Cohere, AI21, Mistral, and Perplexity. But the same labs are also adopting rule-based rewards, automated graders, and synthetic data methods that replace repetitive human review first.
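
To make the product concrete, the sketch below shows the shape of a pairwise-preference record that an RLHF rating vendor might deliver. The field names are hypothetical, not Invisible's or any lab's actual schema.

```python
# Hypothetical shape of one unit of RLHF rating work: a rater compares
# two model completions and records a preference. Field names are
# illustrative, not any vendor's real format.
from dataclasses import dataclass

@dataclass
class PreferencePair:
    prompt: str          # the input shown to the model
    completion_a: str    # first candidate output
    completion_b: str    # second candidate output
    chosen: str          # "a" or "b", the rater's preference
    rater_id: str        # used for inter-rater agreement and QC

# Labs aggregate thousands of these pairs to train a reward model;
# the vendor gets paid per rater-hour spent producing them.
example = PreferencePair(
    prompt="Summarize the attached contract in plain English.",
    completion_a="This agreement says the vendor will ...",
    completion_b="The contract stipulates that ...",
    chosen="a",
    rater_id="r-1042",
)
```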

  • Invisible’s core workflow is concrete labor arbitrage. Labs send batches of model outputs, raters rank or score them, Invisible routes the tasks through its internal workflow software, bills roughly $30 to $45 per hour, and pays workers roughly $15 to $20 per hour (see the margin sketch after this list). That model works best while human taste or expertise is still the bottleneck.
  • The strongest comparable is Mercor, which leans harder into sourcing scarce experts like doctors, lawyers, and PhDs for reasoning tasks. That positioning shows where the market moves if basic feedback gets automated: away from generic raters and toward expensive specialists whose judgment is harder to simulate.
  • Model labs are already building exactly the tools that shrink outsourced feedback demand. OpenAI has described rule-based rewards as a way to align models without extensive recurring human data collection and has used them as part of its safety stack since GPT-4. Anthropic and others are also expanding automated evaluation systems that act as model graders (a toy grader sketch follows this list).
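
To make the arbitrage in the first bullet concrete, the quoted rates imply a gross margin of roughly 50 to 56 percent before overhead. A back-of-envelope sketch:

```python
# Back-of-envelope gross margin implied by the rates quoted above
# ($30-45/hr billed to labs, $15-20/hr paid to raters).
# Purely illustrative arithmetic, not Invisible's reported financials.
for bill, pay in zip((30, 45), (15, 20)):
    margin = (bill - pay) / bill
    print(f"bill ${bill}/hr, pay ${pay}/hr -> gross margin {margin:.0%}")

# bill $30/hr, pay $15/hr -> gross margin 50%
# bill $45/hr, pay $20/hr -> gross margin 56%
```

That spread only exists while labs keep buying the hours, which is why cheaper automated evaluation attacks the business model directly.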
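
And here is a toy rule-based reward in the spirit of what the third bullet describes: fixed rules score a completion programmatically, so the marginal cost of a judgment approaches zero. The rules below are invented for illustration, not drawn from OpenAI's actual reward stack.

```python
# A toy rule-based reward: score a completion against fixed rules
# instead of paying a human rater. The rules are invented examples.
def rule_based_reward(completion: str) -> float:
    score = 0.0
    # Rule 1: a refusal should offer an alternative, not just decline.
    if "I can't help with that" in completion and "instead" not in completion:
        score -= 1.0
    # Rule 2: penalize apologizing more than once.
    if completion.lower().count("sorry") > 1:
        score -= 0.5
    # Rule 3: reward staying under a length budget.
    if len(completion.split()) <= 150:
        score += 0.5
    return score

# Every completion a grader like this can score consistently is a
# completion no outsourced rater gets paid to review.
```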

This pushes Invisible toward the parts of the market where mistakes are costly and every decision needs an audit trail. The winning version of the company looks less like a giant labeling vendor and more like a regulated AI operations layer for finance, healthcare, and defense, where humans are kept because customers need accountability, not because the models are still too weak.