Vetted Participants Drive AI Safety
Jemma White, COO of Prolific, on why humans ensure AI safety
The core takeaway is that bringing data labeling in-house does not automatically create better models, because the bottleneck has moved from raw labor supply to reliably finding the right humans and catching bad judgments fast. Prolific's edge is that it maintains a long-lived, deeply profiled participant base and lets labs pull targeted groups quickly, while buyers shaken by Scale's Meta tie-up are already redistributing work toward neutral vendors with stronger quality controls.
-
Prolific is built around a vetted pool of about 200,000 active participants with years of performance history, plus a large waitlist, so customers are usually selecting from known people rather than freshly recruited crowdworkers. That lets speed and quality reinforce each other instead of trading off.
-
Scale's model was built for huge volumes of labeling through managed labor, and it scaled to an estimated $1.5B ARR by the end of 2024. But after Meta's $14.3B investment in June 2025, major customers pulled back over independence concerns, showing how vertical integration can weaken a vendor whose value depends on being a trusted neutral supplier.
-
The whole market is moving up the quality ladder. Early RLHF leaned on cheap generalist raters. Then labs paid doctors, lawyers, and PhDs for harder reasoning work. Now the scarce input is human nuance: cultural fluency, temperament, and safety judgment, which favors platforms that profile people deeply over those that merely assemble large labor pools.
Going forward, more model builders will keep some internal labeling capacity, but external vendors will win the highest value work when they can prove cleaner participant identity, better task matching, and faster detection of low quality outputs. That pushes the market toward narrower expert cohorts, more auditability, and a premium on neutral infrastructure rather than captive labor supply.