Synthesized
Enterprise test data infrastructure platform using AI-powered synthetic data generation and autonomous test data agents to streamline software testing processes

Funding

$2.80M

2024

Details
Headquarters
London
CEO
Nicolai Baldin
Milestones
FOUNDING YEAR
2018

Valuation

Synthesized raised a $20 million Series A in September 2025 led by Redalpine Venture Partners. The round included participation from existing investors IQ Capital, Mercia Ventures, and Seedcamp, along with strategic investors UBS Next and Deutsche Bank.

The company previously raised a $2.8 million seed round in March 2020 led by IQ Capital and Mundi Ventures. Deutsche Bank made a strategic investment in September 2022, followed by UBS Next in February 2024, both for undisclosed amounts.

Product

Synthesized provides an AI-powered test data infrastructure platform that generates synthetic databases for software testing and development. Think of it as a smart data factory that takes production databases and creates realistic, privacy-safe copies for developers to use in testing environments.

The platform consists of three integrated components. TDK (Test Data Kit) serves as the core engine that can subset, mask, and generate multi-terabyte relational databases while preserving foreign keys and referential constraints. Governor provides a web interface and API layer where teams create reusable workflows, schedule data refreshes, and integrate into CI/CD pipelines. The SDK offers Python and Spark libraries for data scientists working with Pandas or PySpark dataframes.
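To make "subset and mask while preserving foreign keys" concrete, here is a minimal sketch in plain Python. It is not the TDK or SDK API; the function names, table layout, and salt are assumptions chosen for illustration. It keeps a chosen set of parent rows, drops child rows that would otherwise point at a missing parent, and replaces names with a deterministic pseudonym.

```python
import hashlib


def mask_name(value: str, salt: str = "demo-salt") -> str:
    """Replace a name with a pseudonym derived from a salted SHA-256 hash.

    The same input always maps to the same pseudonym, so repeated runs stay consistent.
    """
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return f"user_{digest[:8]}"


def subset_and_mask(customers, orders, keep_customer_ids):
    """Keep the selected customers and only the orders that reference them."""
    keep = set(keep_customer_ids)
    customers_out = [
        {**c, "name": mask_name(c["name"])}
        for c in customers
        if c["id"] in keep
    ]
    # Dropping orders whose customer was not kept avoids dangling foreign keys.
    orders_out = [dict(o) for o in orders if o["customer_id"] in keep]
    return customers_out, orders_out


# Hypothetical sample data standing in for two related production tables.
customers = [
    {"id": 1, "name": "Alice Example"},
    {"id": 2, "name": "Bob Example"},
    {"id": 3, "name": "Carol Example"},
]
orders = [
    {"id": 10, "customer_id": 1, "total": 25.0},
    {"id": 11, "customer_id": 2, "total": 40.0},
    {"id": 12, "customer_id": 1, "total": 15.5},
]

subset_customers, subset_orders = subset_and_mask(customers, orders, [1, 3])
```

A production tool would have to follow foreign keys across many tables and in both directions; the sketch only shows the single parent-child case.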

A typical workflow starts when a developer writes a YAML configuration describing their data requirements, for example a 10GB PostgreSQL database with the same statistical distributions as production, with customer names masked and additional edge cases generated. Synthesized's AI engine analyzes patterns in the source data, then generates a synthetic dataset that maintains the original's statistical properties and relationships while ensuring no real customer data appears in the output.
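The idea of generating rows that follow the source distributions without copying any source record can be sketched as follows. This is a toy stand-in, not Synthesized's engine: the `config` keys mimic what a YAML file might express but are hypothetical, the categorical column is sampled by observed frequency, and the numeric column is drawn from a normal distribution fitted to the source mean and standard deviation, which is a much cruder model than a real generator would use.

```python
import random
import statistics
from collections import Counter

# Hypothetical requirements; the key names are illustrative, not the product's schema.
config = {
    "rows": 1000,
    "edge_cases": [
        {"name": "synthetic_edge_zero_balance", "plan": "basic", "balance": 0.0},
    ],
}

# Hypothetical source rows standing in for a production table.
source_rows = [
    {"name": "Alice Example", "plan": "basic", "balance": 120.0},
    {"name": "Bob Example", "plan": "basic", "balance": 80.0},
    {"name": "Carol Example", "plan": "premium", "balance": 950.0},
    {"name": "Dan Example", "plan": "basic", "balance": 60.0},
]


def generate(source, cfg, seed=0):
    """Generate synthetic rows that mimic simple column statistics of the source."""
    rng = random.Random(seed)

    # Categorical column: sample in proportion to observed frequencies.
    plan_counts = Counter(row["plan"] for row in source)
    plans = list(plan_counts)
    weights = [plan_counts[p] for p in plans]

    # Numeric column: fit mean and standard deviation, then sample from a normal.
    balances = [row["balance"] for row in source]
    mu = statistics.mean(balances)
    sigma = statistics.stdev(balances)

    rows = []
    for i in range(cfg["rows"]):
        rows.append({
            # Names are invented, never copied from the source.
            "name": f"synthetic_customer_{i}",
            "plan": rng.choices(plans, weights=weights, k=1)[0],
            "balance": round(rng.gauss(mu, sigma), 2),
        })

    # Append explicitly requested edge cases after the sampled rows.
    rows.extend(dict(edge) for edge in cfg["edge_cases"])
    return rows


synthetic_rows = generate(source_rows, config)
```

Sampling each column independently loses correlations between columns, which is exactly the part a learned generative model is meant to retain.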

The system connects to existing databases via JDBC, processes the transformation rules, and outputs clean datasets to any target environment including PostgreSQL, Oracle, SQL Server, MySQL, Snowflake, or flat files. Teams can spin up disposable test environments on demand or automate data refreshes through GitHub Actions and Kubernetes jobs.
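The source-to-target flow described above can be illustrated with Python's built-in `sqlite3` module standing in for a JDBC connection. The schema, the `mask` helper, and the `copy_with_masking` function are assumptions made for this sketch. It reads from a source database, applies one transformation rule (masking the name column), and writes to a separate target with foreign key enforcement switched on.

```python
import hashlib
import sqlite3

SCHEMA = """
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(id),
    total REAL NOT NULL
);
"""


def open_db():
    """Create an in-memory database with the demo schema and FK enforcement enabled."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def mask(value: str) -> str:
    """Replace a value with a pseudonym derived from its SHA-256 hash."""
    return "cust_" + hashlib.sha256(value.encode("utf-8")).hexdigest()[:10]


def copy_with_masking(source, target):
    """Copy both tables to the target, masking customer names on the way."""
    # Parent table first so the child rows' foreign keys resolve on insert.
    customers = source.execute("SELECT id, name FROM customers").fetchall()
    target.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(cid, mask(name)) for cid, name in customers],
    )
    orders = source.execute("SELECT id, customer_id, total FROM orders").fetchall()
    target.executemany("INSERT INTO orders VALUES (?, ?, ?)", orders)
    target.commit()


# Hypothetical source data standing in for production.
source = open_db()
source.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "Alice Example"), (2, "Bob Example")],
)
source.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(10, 1, 25.0), (11, 2, 40.0)],
)
source.commit()

target = open_db()
copy_with_masking(source, target)
```

Loading parent tables before child tables is the design choice that matters here: with constraints enforced, inserting orders before their customers would fail.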

Business Model

Synthesized operates a B2B SaaS model targeting enterprise development and QA teams in regulated industries. The company sells software licenses based on data processing volumes and database sizes, with pricing that scales according to the complexity and frequency of synthetic data generation tasks.

The platform follows a usage-based pricing structure where customers pay for the computational resources required to transform and generate their test datasets. This aligns costs with actual consumption while allowing teams to start small and scale up as they adopt synthetic data across more applications and environments.

Synthesized's go-to-market strategy focuses on direct enterprise sales, particularly in financial services where data privacy regulations create strong demand for synthetic alternatives to production data in testing. The company leverages strategic relationships with customers like UBS and Deutsche Bank to demonstrate proven value in highly regulated environments.

The business benefits from sticky usage patterns as synthetic data generation becomes embedded in development workflows. Once teams integrate TDK agents into their CI/CD pipelines and establish automated data refresh schedules, switching costs increase significantly due to the operational dependencies and custom YAML configurations built around the platform.

Competition

Full-stack DevOps suites

Perforce completed its acquisition of Delphix in 2024, creating a comprehensive DevOps data platform that bundles test data management with version control and CI/CD integration. The combined entity leverages Perforce's existing relationships with Fortune 100 companies to offer integrated masking, versioning, and synthetic generation capabilities.

K2View positions itself as an all-in-one test data management solution with self-service subsetting, masking, and AI-powered generation. The company competes on data virtualization capabilities that accelerate refresh cycles and uses node-based pricing to offer cost advantages over per-usage models.

AI-native synthetic data specialists

Tonic.ai focuses heavily on developer experience and supports structured, semi-structured, and unstructured data types. The company's 2025 acquisition of Fabricate added schema-first generation capabilities and natural language prompts, expanding into greenfield application development and LLM fine-tuning use cases.

MOSTLY AI released an open-source SDK in February 2025, making its core generative technology freely available while monetizing through enterprise services and advanced features. This move pressures commercial pricing across the synthetic data segment by lowering switching costs for potential customers.

Cloud platform integrations

Major cloud providers including AWS, Google Cloud, and Microsoft Azure are embedding synthetic data utilities into their broader ecosystems. These integrations create distribution advantages and bundled pricing pressure that challenges standalone point solutions on both cost and procurement convenience.

TAM Expansion

New products

Synthesized's autonomous test data agents represent a significant expansion beyond traditional batch processing into always-on, event-driven data generation. These agents automatically refresh compliant test databases on every code push through GitHub Actions or Jenkins, transforming test data provisioning from a manual process into a continuous microservice.

The company's YAML-based data-as-code framework positions it to expand into policy-as-code governance tooling. This capability becomes increasingly valuable as organizations prepare for EU AI Act Article 10 dataset audit requirements that demand transparent, version-controlled data processing policies.

Synthesized can leverage its generative AI engine to move upstream into synthetic user journey generation. By creating realistic user behavior patterns and interaction sequences, the platform could capture budget currently allocated to Selenium scripting and UI test automation vendors.

Customer base expansion

The company's proven success with UBS and Deutsche Bank in highly regulated financial services provides a strong foundation for expansion into other compliance-heavy verticals. Healthcare organizations, e-commerce platforms, and telecommunications providers face similar data sharing restrictions and privacy requirements that synthetic data addresses.

Distribution through cloud marketplace listings on AWS, Azure, and Google Cloud Platform would put Synthesized in front of enterprise buyers who already procure software through those channels. These marketplaces already feature multiple synthetic data solutions, validating demand while providing integrated billing through existing cloud commitments.

Geographic expansion

The September 2025 Series A funding specifically targets go-to-market expansion in North America and continental Europe. The company already operates virtually in Japan, positioning it to capture growth in the Asia-Pacific region where the synthetic data market is forecast to grow at 35% CAGR through 2030.

European data protection regulations create particularly strong demand for privacy-preserving test data solutions. Synthesized's compliance-first approach and existing relationships with European financial institutions provide natural expansion opportunities across the continent.

Risks

Market consolidation: The synthetic data space faces increasing pressure from cloud hyperscalers embedding similar capabilities into broader platforms and established DevOps vendors acquiring specialized players. Perforce's Delphix acquisition and Nvidia's purchase of Gretel demonstrate how larger companies can bundle synthetic data generation with existing enterprise relationships, potentially commoditizing standalone solutions through integrated pricing and procurement advantages.

Open source competition: MOSTLY AI's decision to open-source its core synthetic data generation technology creates downward pressure on commercial pricing while lowering switching costs for enterprise buyers. As more synthetic data capabilities become freely available, Synthesized must continuously differentiate through advanced features, enterprise support, and specialized compliance capabilities to justify premium pricing.

Regulatory complexity: While data privacy regulations drive demand for synthetic data, the evolving compliance landscape also creates risks around liability and audit requirements. As synthetic data becomes subject to more rigorous privacy testing standards and potential membership inference audits, Synthesized faces increasing technical complexity and potential legal exposure if generated datasets fail to meet emerging privacy-by-design standards.
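As a rough illustration of what a privacy test on generated data can look like, the sketch below flags synthetic rows whose sensitive columns exactly match a source row. This is a deliberately naive baseline and far weaker than a membership inference audit; the function name, column choices, and sample rows are hypothetical.

```python
def leaked_rows(source_rows, synthetic_rows, sensitive_columns):
    """Return synthetic rows whose sensitive columns exactly match some source row."""

    def key(row):
        return tuple(row[column] for column in sensitive_columns)

    source_keys = {key(row) for row in source_rows}
    return [row for row in synthetic_rows if key(row) in source_keys]


# Hypothetical data: the second synthetic row reproduces a real record.
source_rows = [
    {"name": "Alice Example", "postcode": "N1 9GU"},
    {"name": "Bob Example", "postcode": "SW1A 1AA"},
]
synthetic_rows = [
    {"name": "synthetic_customer_0", "postcode": "N1 9GU"},
    {"name": "Bob Example", "postcode": "SW1A 1AA"},
]

leaks = leaked_rows(source_rows, synthetic_rows, ["name", "postcode"])
```

An empty result from a check like this does not show a dataset is private; near-duplicates and statistical leakage need stronger tests.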

DISCLAIMERS

This report is for information purposes only and is not to be used or considered as an offer or the solicitation of an offer to sell or to buy or subscribe for securities or other financial instruments. Nothing in this report constitutes investment, legal, accounting or tax advice or a representation that any investment or strategy is suitable or appropriate to your individual circumstances or otherwise constitutes a personal trade recommendation to you.

This research report has been prepared solely by Sacra and should not be considered a product of any person or entity that makes such report available, if any.

Information and opinions presented in the sections of the report were obtained or derived from sources Sacra believes are reliable, but Sacra makes no representation as to their accuracy or completeness. Past performance should not be taken as an indication or guarantee of future performance, and no representation or warranty, express or implied, is made regarding future performance. Information, opinions and estimates contained in this report reflect a determination at its original date of publication by Sacra and are subject to change without notice.

Sacra accepts no liability for loss arising from the use of the material presented in this report, except that this exclusion of liability does not apply to the extent that liability arises under specific statutes or regulations applicable to Sacra. Sacra may have issued, and may in the future issue, other reports that are inconsistent with, and reach different conclusions from, the information presented in this report. Those reports reflect different assumptions, views and analytical methods of the analysts who prepared them and Sacra is under no obligation to ensure that such other reports are brought to the attention of any recipient of this report.

All rights reserved. All material presented in this report, unless specifically indicated otherwise is under copyright to Sacra. Sacra reserves any and all intellectual property rights in the report. All trademarks, service marks and logos used in this report are trademarks or service marks or registered trademarks or service marks of Sacra. Any modification, copying, displaying, distributing, transmitting, publishing, licensing, creating derivative works from, or selling any report is strictly prohibited. None of the material, nor its content, nor any copy of it, may be altered in any way, transmitted to, copied or distributed to any other party, without the prior express written permission of Sacra. Any unauthorized duplication, redistribution or disclosure of this report will result in prosecution.