New York, NY
Florian Douetteau
Dataiku is an all-in-one, AI/ML-ready, managed data-science solution for enterprises.















Sacra estimates that Dataiku hit $250M in annual recurring revenue (ARR) at the end of 2023.

Company disclosures indicate that Dataiku hit $200M ARR at the end of 2022 and crossed $230M in September 2023.

At the end of 2023, Dataiku had roughly 500 customers, implying an average ARR per customer of about $500K.
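
The growth rate and per-customer average implied by these figures can be checked with simple arithmetic. The sketch below uses only the numbers quoted above; the variable and function names are illustrative, and because the customer count is approximate, the results are too.

```python
# ARR figures quoted above, in US dollars.
ARR_END_2022 = 200_000_000   # company disclosure
ARR_END_2023 = 250_000_000   # Sacra estimate
CUSTOMERS_END_2023 = 500     # "roughly 500", so treat results as approximate


def yoy_growth(previous: float, current: float) -> float:
    """Year-over-year growth as a fraction (0.25 means 25%)."""
    return (current - previous) / previous


def average_arr_per_customer(arr: float, customers: int) -> float:
    """Average ARR per customer."""
    return arr / customers


growth_2023 = yoy_growth(ARR_END_2022, ARR_END_2023)
arr_per_customer = average_arr_per_customer(ARR_END_2023, CUSTOMERS_END_2023)

print(f"2023 ARR growth (y/y): {growth_2023:.0%}")
print(f"Average ARR per customer: ${arr_per_customer:,.0f}")
```

On these inputs, ARR grew 25% year over year in 2023 and the average customer accounts for $500,000 of ARR.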

Key comps for Dataiku include Amplitude (~74% gross margins), Datadog (~81% gross margins) and Tableau (~90% gross margins).



Product

Dataiku’s core product is a software platform called Data Science Studio (DSS) that helps big companies develop and deploy AI applications at scale.

Data Science Studio serves as a centralized hub where data scientists and engineers from across the organization can collaborate on preparing data and on building and training machine learning models.

The platform provides a mix of visual interfaces and notebook environments for coding in languages like Python and R. Technical users can leverage Dataiku's built-in machine learning algorithms and functionality or incorporate custom code and open source libraries.

Dataiku also aims to make AI more accessible to non-technical employees through its drag-and-drop visual interface, which is designed to let analysts and other business-oriented team members identify use cases, annotate data, validate models, and build dashboards and AI apps in a self-service manner.

Business Model


Dataiku operates on a subscription-based software-as-a-service (SaaS) business model. Customers pay an annual subscription fee for access to the Dataiku Data Science Studio (DSS) platform.

Dataiku's pricing is based on the number of users and the scale of compute resources consumed through the platform.

Customers can start with a small deployment for a data science team and then expand their usage over time by adding more users and compute resources as AI initiatives grow. This land-and-expand model allows Dataiku to grow revenue within accounts as customers derive more value from the platform.
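
To make the land-and-expand mechanics concrete, here is a minimal sketch of a fee driven by the two pricing axes named above, users and compute. The prices, the `Deployment` class, and the deployment sizes are all hypothetical, chosen only to keep the arithmetic easy to follow. They are not Dataiku's actual price list.

```python
from dataclasses import dataclass

# Hypothetical list prices (assumptions, not Dataiku's real pricing).
PRICE_PER_SEAT = 10_000          # USD per user per year (assumed)
PRICE_PER_COMPUTE_UNIT = 50_000  # USD per unit of compute per year (assumed)


@dataclass
class Deployment:
    seats: int
    compute_units: int

    def annual_fee(self) -> int:
        """Subscription fee driven by the two pricing axes: users and compute."""
        return (self.seats * PRICE_PER_SEAT
                + self.compute_units * PRICE_PER_COMPUTE_UNIT)


# Hypothetical account: lands with one team, then expands to several.
landing = Deployment(seats=10, compute_units=1)
expanded = Deployment(seats=40, compute_units=4)

expansion_multiple = expanded.annual_fee() / landing.annual_fee()
print(landing.annual_fee(), expanded.annual_fee(), expansion_multiple)
```

Under these assumed prices, the account grows from $150,000 to $600,000 a year, a 4x expansion with no new customer acquired.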

In terms of go-to-market, Dataiku primarily sells to large enterprise customers across industries such as financial services, pharmaceuticals, manufacturing, and retail.

It uses a direct sales model complemented by partnerships with systems integrators, consultancies, and cloud vendors. These partners help provide implementation services and expand Dataiku's reach.

Dataiku has also introduced a fully managed, cloud-hosted version of its platform called Dataiku Online.

This SaaS offering makes it easier and faster for customers to get started with Dataiku, without needing to manage their own infrastructure. It also expands Dataiku's addressable market to smaller companies who may lack the IT resources to deploy and manage the software themselves.

In addition to subscription fees, Dataiku generates revenue from professional services. It offers training, consulting, and support services to help customers successfully deploy and adopt the platform. However, Dataiku has been shifting more of its professional services revenue to partners over time as it scales.



Competition

Dataiku faces competition from several other companies offering managed machine learning and data science platforms.

Key competitors include:

DataRobot

Focused primarily on automated machine learning (AutoML), DataRobot offers a simpler way to set up AI/ML pipelines for teams that already have data in spreadsheets and want a quick start with predictive analytics using pre-curated models.

Teams upload their data, and DataRobot will automatically build a model and find parameters to predict what the outputs (in another column of the spreadsheet) should be.
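
The general AutoML idea described here (fit several candidate models to a table, then keep whichever predicts the target column best on held-out rows) can be illustrated with a toy sketch. This is not DataRobot's actual method or API. The data, the two candidate models, and every function name below are assumptions made for illustration.

```python
from statistics import mean

# Toy "spreadsheet": one input column and the column to predict (made-up data).
rows = [(x, 3.0 * x + 2.0) for x in range(10)]


def fit_mean(train):
    """Baseline model: always predict the average target."""
    avg = mean(y for _, y in train)
    return lambda x: avg


def fit_line(train):
    """Least-squares line y = slope * x + intercept."""
    xs = [x for x, _ in train]
    ys = [y for _, y in train]
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in train)
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    return lambda x: slope * x + intercept


def mse(model, data):
    """Mean squared error of a fitted model on (input, target) rows."""
    return mean((model(x) - y) ** 2 for x, y in data)


def auto_select(data, candidates):
    """Fit every candidate on a training split, keep the best on held-out rows."""
    train, holdout = data[::2], data[1::2]
    scores = {name: mse(fit(train), holdout) for name, fit in candidates.items()}
    best = min(scores, key=scores.get)
    return best, candidates[best](data)  # refit the winner on all rows


best_name, model = auto_select(rows, {"mean": fit_mean, "line": fit_line})
print(best_name, model(20))
```

Because the toy data is exactly linear, the line model wins over the baseline. A production AutoML system searches far more model families and hyperparameters, but the select-by-held-out-error loop has the same shape.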

As of Q3 2023, DataRobot was reported to have a $1.5B valuation, $140M ARR, and was considering a strategic sale.


Alteryx

More analytics and dashboard-focused than Dataiku, Alteryx lacks some of Dataiku's model development, deployment and monitoring capabilities.

User sentiment surveys suggest Alteryx is more intuitive for simpler use cases, while Dataiku offers greater flexibility and end-to-end functionality. Non-technical users in particular find Alteryx's no-code tools useful.

Databricks

Built around Apache Spark, Databricks is a strong competitor but is more focused on the data engineering and infrastructure layer compared to Dataiku's higher-level platform.

Databricks is ideal for teams that already use Apache Spark and want to integrate with other tools, such as MLflow for managing the machine learning lifecycle and AWS or Azure for cloud infrastructure. Databricks essentially connects these tools with Spark and provides a single interface for managing the entire pipeline.

Hyperscaler platforms like AWS SageMaker and Azure ML

While major cloud platforms offer managed ML solutions, they have the disadvantage of vendor lock-in compared to Dataiku's cloud-agnostic approach.

Dataiku stands out for its end-to-end coverage of the data and ML lifecycle, its balance of visual interfaces and code-friendly tools that cater to both technical and non-technical users, and its ability to work across on-premises and cloud environments.

This positions it favorably for large enterprises looking to democratize AI/ML across business functions using a single platform.

TAM Expansion

Dataiku is at the revenue scale to be a credible IPO candidate within the next few years, but it is also possible that it will become an acquisition target for a large tech platform that wants to buy its way into the enterprise AI/ML and advanced analytics markets.

News that DataRobot was looking at a strategic sale at $140M ARR as of Q3'23 bolsters the case for this kind of outcome for Dataiku.

However, as a standalone company, there are a few key ways that Dataiku can still expand its total addressable market (TAM).

Expanding wall-to-wall through no-code

Dataiku's platform is increasingly looking to bring in non-technical users like business analysts and domain experts to participate in AI/ML projects alongside data scientists.

As companies seek to democratize AI/ML and infuse it into more business processes, this collaborative, self-service approach can drive adoption beyond the data science and data engineering teams. That grows the number of Dataiku seats inside each organization and raises average revenue per customer (ARPC).

Growth in the midmarket and SMBs

Dataiku's focus to date has primarily been on large enterprises, but the launch of its fully managed cloud offering, Dataiku Online, opens up the midmarket and SMB segments. These companies often lack the IT resources to manage on-premises deployments but still have significant data assets and AI/ML needs.



This research report has been prepared solely by Sacra and should not be considered a product of any person or entity that makes such report available, if any.
