Datacurve Continuous Evaluation Platform
Datacurve
The real upside is not selling more data; it is becoming the system customers use to find model weaknesses every week and then route new training work. Datacurve already starts with private benchmarks, turns failures into bounty tasks on Shipd, and delivers outputs that plug into Ray, MosaicML, and internal pipelines. Packaging that evaluation step as a subscription would move Datacurve from an episodic vendor to an always-on layer in the training loop, with steadier revenue and a position that is harder for customers to replace.
-
Today the workflow is already close to a recurring product. Datacurve runs benchmarks, identifies weak spots in coding performance, converts them into structured quests for its 14,000-plus vetted engineers, then ships standardized datasets and Dockerized RL environments back into customer training systems.
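The loop described above, evaluate, surface failures, queue new data work, can be sketched in a few lines. This is a hypothetical illustration only: the function and class names (`run_benchmark`, `Bounty`, `failures_to_bounties`) are assumptions for clarity, not Datacurve's actual API.

```python
# Hypothetical sketch of a continuous evaluation cycle: score a model on
# private benchmark tasks, then turn each failure into a structured bounty.
# All names here are illustrative assumptions, not a real Datacurve interface.
from dataclasses import dataclass


@dataclass
class BenchmarkResult:
    task_id: str
    passed: bool


@dataclass
class Bounty:
    task_id: str
    description: str


def run_benchmark(model, tasks):
    """Score the model on each benchmark task."""
    return [
        BenchmarkResult(t["id"], model(t["prompt"]) == t["expected"])
        for t in tasks
    ]


def failures_to_bounties(results):
    """Convert failed benchmark cases into bounty tasks for engineers."""
    return [
        Bounty(r.task_id, f"Write a reference solution for {r.task_id}")
        for r in results
        if not r.passed
    ]


# One cycle: the stub model only answers the first task correctly,
# so the second task becomes a new bounty.
tasks = [
    {"id": "t1", "prompt": "2+2", "expected": "4"},
    {"id": "t2", "prompt": "sort [3,1]", "expected": "[1, 3]"},
]
model = lambda prompt: "4"  # stub model standing in for a coding model
results = run_benchmark(model, tasks)
bounties = failures_to_bounties(results)
print([b.task_id for b in bounties])  # -> ['t2']
```

Run weekly, each cycle emits fresh failure cases, which is what makes the evaluation step naturally recurring rather than a one-off project.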
-
The closest comparables show why this matters. Surge is pushing evaluation and red-teaming suites as recurring software on top of managed services, while Scale moved beyond labeling into Validate, Nucleus, and other monitoring tools to own more of the ML stack and reduce pure project revenue exposure.
-
A subscription evaluation product also protects against the main risk in custom data work: customer concentration. If a frontier lab relies on Datacurve to continuously score coding models and surface new failure cases, switching away means replacing not just a supplier but an embedded testing and data generation workflow.
This points toward a market where coding data vendors split in two. Commodity providers sell labor by the project, while stronger players sell an ongoing control layer for evaluation, data generation, and reinforcement learning environments. Datacurve is positioned to move into that second category, which should raise switching costs, widen product scope, and make revenue more durable over time.