Human Data Vendors as Infrastructure

Diving deeper into: Jemma White, COO of Prolific, on why humans ensure AI safety (Interview)
they still rely on external services like Prolific, Handshake, and Surge

External human data vendors are becoming permanent infrastructure for frontier labs, not overflow staffing. Internal annotator teams are useful for repetitive, always-on workflows, but labs still go outside when they need a second read on model behavior, a participant group they do not already have, or a broader cross-section of people by language, culture, credentials, or personality traits. That keeps platforms like Prolific, Handshake, and Surge in the stack even as labs build in-house capacity.

  • Prolific’s edge is breadth and profiling. It has about 200,000 active participants across 40-plus countries, fluency in 80-plus languages, and detailed filters across behavior, credentials, experience, and personality, which makes it useful for red-teaming, safety evals, and cultural-nuance work that an internal pool may not cover.
  • Handshake approaches the market from the opposite direction. It started with a university recruiting network, then turned its base of students, graduates, postdocs, and PhDs into a supply pool for expert labeling. That let Handshake AI ramp to an estimated $80M in annualized revenue by August 2025, showing how valuable credentialed specialist supply has become.
  • The work is also splitting by job type. Some vendors win on broad human diversity and fast self-serve sampling, while others win on narrow expert pools or managed service delivery. That is why labs use multiple vendors at once: one for general validation and multilingual coverage, another for PhD-level tasks, and another for tightly managed, high-stakes pipelines.

The next phase pushes even more work outward. As model builders move from raw labeling toward trust, safety, benchmark validation, and market-specific behavior testing, vendor mix matters more than vendor consolidation. The winners will be the platforms that can prove participant quality, show who got paid what, and deliver specialized humans on demand with auditability built in.