Mercor versus Scale and Surge
This rivalry shows that Mercor is not really competing with staffing firms; it is competing with AI infrastructure vendors that happen to own labor networks. Scale and Surge sell the full stack: they recruit and vet workers, route tasks through software, measure label quality in dashboards, and take on large managed projects for frontier labs. Mercor overlaps most where labs need scarce experts, but buyers often want one vendor that can supply both expert judgment and annotation operations.
- Scale is the broadest incumbent. It grew from self-driving-car labeling into LLM training, reached $1.5B ARR by the end of 2024, and bundles human labor with software products like Rapid, Nucleus, Validate, and Launch. That makes it hard to attack on a single workflow, because customers can buy labeling, evaluation, and deployment support together.
- Surge is closer to Mercor on quality positioning, but with more operational depth. It generated an estimated $1.2B in 2024 revenue from about 12 frontier labs and uses about 50,000 expert contractors. Model teams can specify narrow worker profiles, and Surge then monitors gold-standard accuracy, annotator agreement, and worker trust scores in real time.
- The market is shifting from cheap crowd work to credentialed specialists. Mercor built a network of about 300,000 vetted experts and monetizes access and recruiting fees, while Scale launched Expert Match and Surge built premium RLHF pipelines. The real contest is over who becomes the default broker for hard-to-source human judgment inside model training loops.
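Two of the quality signals named above for Surge, gold-standard accuracy and annotator agreement, have common textbook definitions, and a minimal sketch may make them concrete. The code below is illustrative only: the function names and the dictionary layout are assumptions made for this example, agreement is computed as Cohen's kappa (one standard choice among several), and nothing here is drawn from how Surge, Scale, or Mercor actually compute their metrics.

```python
from collections import Counter

# Hypothetical helpers for illustration; not any vendor's actual implementation.
# Each argument is a dict mapping a task id to the label assigned to that task.


def gold_accuracy(labels, gold):
    """Fraction of gold-standard tasks that the worker labeled correctly."""
    scored = [task for task in gold if task in labels]
    if not scored:
        return 0.0
    correct = sum(labels[task] == gold[task] for task in scored)
    return correct / len(scored)


def cohen_kappa(a, b):
    """Chance-corrected agreement between two annotators on shared tasks."""
    shared = [task for task in a if task in b]
    n = len(shared)
    if n == 0:
        return 0.0
    # Observed agreement: share of tasks where both labels match.
    observed = sum(a[task] == b[task] for task in shared) / n
    # Expected agreement if each annotator labeled independently
    # according to their own label frequencies.
    count_a = Counter(a[task] for task in shared)
    count_b = Counter(b[task] for task in shared)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)
```

In a setup like this, a worker who answers one of two hidden gold tasks correctly scores 0.5 on accuracy, while kappa discounts agreement that two annotators would reach by chance. A trust score could then be some weighted blend of the two, though the source does not say how any vendor combines them.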
The winners in this category will look less like marketplaces and more like control planes for human input. As model labs demand audit trails, benchmark data, red-teaming, and specialist evaluation, Mercor will keep moving toward software and managed workflows, while Scale and Surge push deeper into expert sourcing. The lines between recruiting, annotation, and evaluation will keep collapsing.