Weave and Neutral AI Measurement
This conflict creates an opening for neutral measurement to become its own product category. When GitHub, GitLab, or Sourcegraph report AI impact inside the same product that sells Copilot, Duo, or Cody, they control the event data, the definitions, and the dashboard. That makes bundled analytics convenient, but it also means the scorekeeper is attached to the revenue stream being judged. A specialist like Weave wins by sitting outside the assistant itself and comparing behavior across tools, repos, and workflows.
- Platform vendors already bundle AI metrics into their core product. GitLab puts Duo Code Suggestions into its Value Streams Dashboard to show ROI, and Sourcegraph exposes Cody metrics like chats, completions, and acceptance rates inside Sourcegraph Analytics. The measurement layer is built by the same vendor whose AI adoption it is meant to prove.
- Independent tools are pitching a different level of granularity. Weave measures AI-generated code and engineering workflow impact, while Span is positioned around model-based AI code detection. By contrast, Jellyfish is described as relying on GitHub telemetry, which can miss usage that happens in ChatGPT or other side channels outside the instrumented IDE (see the sketch after this list).
- The market is converging fast. Older engineering intelligence companies like Jellyfish and LinearB are adding AI dashboards and controls, while platform companies fold analytics into existing seats. That pushes specialists to differentiate on cross-vendor coverage, code-level attribution, and credibility as a neutral system of record for AI spend.
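To make the telemetry-versus-detection distinction concrete, here is a minimal Python sketch. Everything in it is hypothetical: the `Hunk` shape, the `looks_ai_generated` stub, and the sample numbers are only illustrations of the two counting approaches, not how Weave, Span, or Jellyfish actually work. The point is that vendor telemetry counts only what the assistant reports, while commit-level detection can also catch code that arrived through side channels.

```python
from dataclasses import dataclass

@dataclass
class Hunk:
    """Lines added by one commit hunk, regardless of which tool wrote them."""
    repo: str
    author: str
    added_lines: list[str]

def looks_ai_generated(hunk: Hunk) -> bool:
    # Placeholder classifier: a real model-based detector would score code
    # style and structure with a trained model; this stub just flags large,
    # uniform hunks so the example runs end to end.
    return len(hunk.added_lines) >= 40

# What the assistant vendor's telemetry reports (sample numbers only).
vendor_reported_ai_lines = {"copilot": 120}

# What a detector sees by scanning every commit, including code pasted in
# from ChatGPT or other channels the IDE plugin never instrumented.
commits = [
    Hunk("api", "dev1", ["client = build_client(cfg)"] * 55),  # pasted from a chat session
    Hunk("api", "dev2", ["# fix off-by-one in pagination"]),
]
detected_ai_lines = sum(len(h.added_lines) for h in commits if looks_ai_generated(h))

print("vendor-reported AI lines:", vendor_reported_ai_lines["copilot"])
print("detected AI lines across all commits:", detected_ai_lines)
```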
The next step is a split between bundled telemetry and independent governance. Platform suites will keep shipping native dashboards because they are easy to turn on, but the larger the AI budget gets, the more enterprises will want a measurement layer that can compare Copilot, Duo, Cody, Cursor, and chat-based coding tools using one methodology.
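In practice, "one methodology" largely means normalizing each vendor's events into a shared schema before computing anything. The sketch below assumes a made-up `AssistEvent` schema and event names; real Copilot, Duo, Cody, or Cursor exports differ and would need per-vendor adapters feeding this kind of model.

```python
from dataclasses import dataclass
from collections import defaultdict

@dataclass
class AssistEvent:
    """Normalized usage event; field names and event kinds are invented for illustration."""
    tool: str   # e.g. "copilot", "duo", "cody", "cursor", "chat"
    kind: str   # "suggestion_shown" or "suggestion_accepted"
    lines: int  # lines of code involved in the event

def acceptance_rate_by_tool(events: list[AssistEvent]) -> dict[str, float]:
    # The same definition is applied to every tool:
    # accepted suggestions divided by shown suggestions.
    shown: dict[str, int] = defaultdict(int)
    accepted: dict[str, int] = defaultdict(int)
    for e in events:
        if e.kind == "suggestion_shown":
            shown[e.tool] += 1
        elif e.kind == "suggestion_accepted":
            accepted[e.tool] += 1
    return {tool: accepted[tool] / shown[tool] for tool in shown}

sample = [
    AssistEvent("copilot", "suggestion_shown", 6),
    AssistEvent("copilot", "suggestion_accepted", 6),
    AssistEvent("duo", "suggestion_shown", 3),
    AssistEvent("cursor", "suggestion_shown", 10),
    AssistEvent("cursor", "suggestion_accepted", 10),
]
print(acceptance_rate_by_tool(sample))  # {'copilot': 1.0, 'duo': 0.0, 'cursor': 1.0}
```

The value of the neutral layer is the shared denominator: every vendor is scored with the same definitions, instead of each bundled dashboard defining acceptance and impact its own way.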