Zachary Friedman, associate director of product management at Immuta, on security in the modern data stack

Background

Zachary Friedman is the associate director of product management at Immuta. In our chat with Zachary, we discussed the taxonomy of the cybersecurity, data security and data governance markets, looked at the competitive positioning of companies like Immuta, BigID, Privitar, and others, and explored how regulatory changes and the growing need for data privacy are influencing strategies and decisions in the data security industry.

Questions

Can you give us a brief history of data governance as you see it? What did the world look like before Immuta and what does it look like after?
Who is Immuta's customer? What prompts them to buy?
For the customer, is Immuta a revenue driver in that it gives companies trust when they’re trying to close big enterprise deals? Or is it more about de-risking security breaches and data leaks? Is it more revenue driver or de-risker or both?
Who in the org buys and who implements? What does an implementation look like? How does the customer measure success?
You mentioned that Immuta ties together employee identity with your data store and maps these policies for who can access what. How does that typically work? Do people connect their Okta and their data store and then set this up in Immuta?
Can you talk about the depth of integration Immuta has into different data stores? Does the depth of integration differ if it's a first-party integration where they're an investor like an Okta, Snowflake or Databricks vs. a third-party integration?
Immuta existed before the modern data stack, and you indicated that there was this moment or time when the modern data stack started to be a big driver. What, if anything, changed? Did it change this notion from an individual's access to data to more aggregated metrics across all the tables that you have access to or not? Did it also give rise to this new type of end-user who was not just accessing tables but also generating metrics across the data?
A lot of the highest-flying private companies are in what you might generically call cybersecurity. What are the tailwinds driving cybersecurity generally, and is it a big tailwind behind Immuta? How else would you taxonomize the things that are tailwinds propelling Immuta?
You mentioned Wiz, and then there's a few other companies that feel either adjacent or competitive like Rubrik and BigID. A lot of them use terms like data visibility, data control, and data access to describe what they do. Can you talk about some of these subcategories, where they overlap, and where they butt up against each other?
A lot of the core of Immuta seems to be protecting data, but internally, versus a product like BigID, which is maybe more focused on protecting data in the abstract. Is that accurate?
Rubrik crossed over into cybersecurity, but they were initially focused on data apps and data recovery. Part of doing that is securing the data. That seems to maybe create overlap with a product like Immuta. Do you in general see more companies crossing over into cybersecurity and having some overlap with a company like Immuta and what's your take on that?
Snowflake and Databricks are investors in Immuta, but Immuta also helps unlock more data for them—it’s a revenue driver for them. Does the fact that you're a partner, and you also drive more revenue for them, and you both have these products around data access, create a threat that they can become a frenemy and try to build a more integrated platform that eats up the Immuta use case? What makes it such that Immuta necessarily needs to be separate, or is there such a dynamic at all?
Over the near-term time horizon—let’s say 5 years—do you see Immuta adding more products, or do you see enabling more data or adding more seats as the main driver of growth? If everything goes right for Immuta, what has it become in some more abstract sense and how has the world changed as a result?

Interview

Can you give us a brief history of data governance as you see it? What did the world look like before Immuta and what does it look like after?

Let's consider what the world looked like before Immuta and what it looks like after.

First, any conversation about Immuta would be incomplete without discussing the initial market. The company's founders come from the US intelligence community and started the company based on the recurring problems they solved for customers, primarily in the intelligence community.

Initially, our customers were all from the US public sector, where we protected highly sensitive data, including national security top secret classified information. In order to expand our opportunities and market reach, we identified the next adjacent market to the US public sector, which included large banks, financial services and insurance companies, healthcare companies, major pharmaceutical companies, hospitals, and similar entities.

After establishing ourselves in the public sector, we expanded our services to cater to these industries within the Fortune 500 and Global 2000. It's important to note that my experience does not involve the US government sector, but rather focuses on the history of data governance and the productization of established principles for protecting top secret and confidential data by the founders of Immuta.

As we ventured into the commercial sector, we found that some aspects of data management and governance in the military were similar to the needs of large banks and pharmaceutical companies, while others were not.

Over time, we worked on merging these two worlds to create a product that now covers a broad range of requirements and serves the needs of various communities. It's worth mentioning that our clientele extends beyond the financial services and homeland security sectors. We now cater to a large portion of the Global 2000 and numerous big startups as well. Therefore, the market landscape and timeline of Immuta as a company have evolved significantly.

When Immuta was founded around eight years ago (in 2015), it was before the rise of tools like dbt and the modern data stack. I joined the company in January 2021, during what I would call the Cambrian explosion days of the modern data stack. At that time, the attention and focus on data were not as pronounced as they are today. The modern data stack can be seen as a collection of tools, including dbt, but it also represents a shift in the majority of companies leveraging data to make informed business decisions. Now, most companies understand the importance of using data to operate their business, and those that fail to do so are at risk of becoming obsolete.

Currently, Immuta operates in a space where data governance and data access controls are crucial elements of managing data. Access controls might not be the first thing that comes to mind when thinking about data management, but due to the significant value derived from emerging data management practices over the past five years, it has become increasingly important for many businesses. This is where our product plays a vital role.

Who is Immuta's customer? What prompts them to buy?

For a prototypical customer, let's just say a large bank. One thing I've noticed from talking to customers extensively is that regardless of the terminology they use, many are adopting a data mesh approach. This means that the authority to manage access to data is being decentralized. In the past, it was often a central team that determined who could query specific tables and what rows they could see. However, now this responsibility is being distributed across multiple business units within our customers' organizations.

A few things prompt them to buy. One is that our product works really well if you use one data platform, but the advantages are even greater if you're using more than one.

At its core, Immuta Secure is a policy engine that authorizes users to access data within the data platform. We offer various ways for customers to define those policies using terms that are native to their business, rather than being tied to a specific data platform.

For instance, if a customer is using Snowflake, Databricks, Starburst, Redshift, and BigQuery (which many large companies do), they can write a single policy in Immuta and enforce it across all platforms. This becomes particularly valuable when the authority to set policies is distributed across multiple data platforms or even within a single platform. One of the main reasons customers choose to buy our product is that it significantly reduces costs, including opportunity costs and complexities. We handle the heavy lifting of translating their business terms into database grants or policies, eliminating the need for them to worry about the underlying technical aspects of the databases. They can simply define policies in plain English, such as "users should be able to see this table if blank."

That means our product offers a lot of benefits, especially for those using a single data platform with many tables, vast amounts of data, and frequently changing access requirements or authorization attributes. Additionally, there are advantages in abstracting the differences in syntax and security models between different data platforms.
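As a rough illustration of the "write one policy, enforce it everywhere" idea described above, the sketch below defines a single platform-neutral rule and renders it into two different enforcement statements. This is not Immuta's API or policy syntax: the `Policy` class, the `render` function, the platform names, and both statement templates are hypothetical placeholders invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """A platform-neutral rule: a row is visible when a column matches a user attribute."""
    table: str
    column: str          # data attribute, e.g. "region"
    user_attribute: str  # user attribute, e.g. "region"


# Hypothetical templates for two imaginary platforms. They are placeholders
# for illustration only, not real vendor syntax.
TEMPLATES = {
    "platform_a": "ROW FILTER ON {table}: {column} = CURRENT_USER_ATTR('{user_attribute}')",
    "platform_b": (
        "CREATE VIEW {table}_secure AS SELECT * FROM {table} "
        "WHERE {column} = session_attr('{user_attribute}')"
    ),
}


def render(policy: Policy, platform: str) -> str:
    """Translate the neutral policy into one platform's (made-up) enforcement statement."""
    return TEMPLATES[platform].format(
        table=policy.table,
        column=policy.column,
        user_attribute=policy.user_attribute,
    )


# One policy, written once, rendered for every configured platform.
policy = Policy(table="accounts", column="region", user_attribute="region")
statements = {name: render(policy, name) for name in TEMPLATES}
```

The point of the design is that the policy author only ever touches `Policy`; differences in syntax and security model live in the per-platform templates.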

For the customer, is Immuta a revenue driver in that it gives companies trust when they’re trying to close big enterprise deals? Or is it more about de-risking security breaches and data leaks? Is it more revenue driver or de-risker or both?

It's both, but as far as how it can be a revenue driver, it’s not exactly what you said. What we see people using it for is more about data sharing. Once companies can safely and securely share data with other organizations without divulging trade secrets, that's going to be a one-way door: an irreversible trend that we anticipate will continue to grow.

We have many customers who utilize Immuta as a foundational platform for offering a multi-tenant service to their own customers. This setup allows for the benefits of shared data while ensuring that each tenant maintains control over their own data. It's comparable to using a programming language like Python as the input for developing a multi-tenancy model, ensuring that companies like Pepsi cannot access data belonging to Coca-Cola. This aspect can also serve as a revenue driver. Additionally, risk mitigation is another significant use case and driver for adopting Immuta.

Who in the org buys and who implements? What does an implementation look like? How does the customer measure success?

In complex enterprise sales, there are typically two types of buyers: the executive buyer and the technical buyer or user. The users, who are responsible for setting policies and determining data access, prioritize ease of use and appreciate the various modeling options available in Immuta. They can model their business domain within our platform and express policies based on that model, applying a semantic layer to their data. On the other hand, the executive buyer is motivated by the priority and significance of the solution.

A common scenario we often encounter, though less frequent nowadays, involves organizations migrating from legacy technologies to cloud-native data warehouses, platforms, or lake houses. These migrations include moving from Hadoop to Databricks, Teradata to Snowflake, or transitioning from traditional OLTP databases to purpose-built solutions like Snowflake, or even utilizing Starburst as a bridge between on-premises and cloud environments.

One of the challenges that arise during these migrations, particularly when organizations are not fully on the cloud, is security concerns associated with leaving on-premises infrastructure. Immuta plays a critical role in unblocking these migrations by addressing data sensitivity and providing a safe solution. As a result, we often become the answer to this security challenge, creating a compelling event for the buyer.

Implementation is straightforward for the user. They simply specify the data they want Immuta to protect, define the users subject to access restrictions, and write simple policy statements in plain English that establish the connection between users, data attributes, and user attributes. Policies are often easy to comprehend and have low cognitive complexity. The user can observe the policy in action within the platform.
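To make the connection between users, data attributes, and user attributes concrete, here is a minimal attribute-based access check. It is a sketch under stated assumptions, not how Immuta evaluates policies: the `User`, `Table`, and `AccessPolicy` classes, the `can_see` function, and the tag and attribute names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    name: str
    attributes: dict = field(default_factory=dict)  # e.g. {"department": "compliance"}


@dataclass
class Table:
    name: str
    tags: set = field(default_factory=set)  # data attributes, e.g. {"pii"}


@dataclass
class AccessPolicy:
    # Reads as: "users may see tables tagged <tag> if their <attribute> is <value>".
    tag: str
    attribute: str
    value: str


def can_see(user, table, policies):
    """Allow access only if every policy that applies to the table's tags is satisfied.

    Assumption of this sketch: tables that no policy applies to are left open.
    """
    relevant = [p for p in policies if p.tag in table.tags]
    return all(user.attributes.get(p.attribute) == p.value for p in relevant)


policies = [AccessPolicy(tag="pii", attribute="department", value="compliance")]
customers = Table("customers", {"pii"})
alice = User("alice", {"department": "compliance"})
bob = User("bob", {"department": "marketing"})
```

With this data, `can_see(alice, customers, policies)` is allowed and `can_see(bob, customers, policies)` is denied, because only Alice carries the attribute the policy asks for.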

When it comes to measuring success, customers have a set of internal and external requirements they aim to fulfill. The adoption of Immuta usually starts gradually, and success is measured by the gradual onboarding process, achieving internal policy compliance, ensuring the right personnel are authorized to modify policies if needed, and meeting the needs of downstream customers. Many of our customers act as platform teams or are responsible for providing entitlement services. Therefore, success is defined not only by our direct customers but also by the success of their customers, who utilize the enterprise data platform securely and derive value from it while having a clear understanding of entitlements and their purpose.

You mentioned that Immuta ties together employee identity with your data store and maps these policies for who can access what. How does that typically work? Do people connect their Okta and their data store and then set this up in Immuta?

That's exactly right. Okta Ventures is actually an investor in Immuta, so that’s a good example, but yes—that's exactly right.

That said, we support a number of cloud native data platforms and IdPs. Customers will sometimes roll their own IdPs and we support those as well.

Can you talk about the depth of integration Immuta has into different data stores? Does the depth of integration differ if it's a first-party integration where they're an investor like an Okta, Snowflake or Databricks vs. a third-party integration?

The depth of integration does not vary based on whether a company is an investor or not. We do not differentiate between first-party and third-party integrations since all our integrations are first-party.

Where formal partners and investors like Databricks and Snowflake differ from hyperscalers like AWS and GCP (Redshift and BigQuery) is in the level of formalized go-to-market partnership. This doesn't mean that we don't engage in go-to-market activities with hyperscalers. The main difference is the investment made by both parties in aligning their go-to-market strategies.

A couple of years ago, we decided that our approach would be to go deep and provide excellent support for a select number of data stores. At that time, we supported six data stores and had perfected our integrations with them. The depth of integration we offer for the supported data stores is exceptionally high. With regard to our integration with Snowflake, I might be biased, but I believe we have the best and most comprehensive Snowflake integration in the market compared to companies in our space or similar spaces.

I mention this because many of our direct competitors in the access controls domain may not prioritize the same level of depth and integration as we do. However, I'd like to note that our company is expanding into other areas, as evident in our Detect and Discover products. Initially, our focus was securing data warehouses, and for a significant period, that was our sole business. This focus allowed us to develop extensive capabilities in taking action and detecting potential anomalies.

For instance, if we compare ourselves to a company like Wiz, which is an impressive success story and the fastest-growing SaaS company of all time, their initial approach might have been broader, such as detecting issues on AWS. At some point, our paths may intersect as we both move towards protecting platforms like Redshift, but our starting points and depth of coverage differ, which explains the disparity.

Immuta existed before the modern data stack, and you indicated that there was this moment or time when the modern data stack started to be a big driver. What, if anything, changed? Did it change this notion from an individual's access to data to more aggregated metrics across all the tables that you have access to or not? Did it also give rise to this new type of end-user who was not just accessing tables but also generating metrics across the data?

There's an undeniable connection in my mind between the modern data stack and dbt. However, despite their simultaneous rise, I don't necessarily link the dbt flavor of the modern data stack to the explosion of that idea in the enterprise and Global 2000. Although many Fortune 500 companies and larger enterprises do use dbt, it may not be discussed as extensively.

I recall a conversation with one of our customers a year or two ago. They mentioned that their company, with a large number of employees, provided Python training to thousands of people who wouldn't typically be associated with the role of data analysts. As a result, the customer noted that "now everybody's an amateur data analyst."

Self-serve data analytics has been a topic of discussion for a long time. However, the modern data stack brought about a hierarchical shift in needs.

Once organizations establish data platforms on modern technology, the desire for self-serving data becomes a prominent factor. This creates an influx of users conducting interactive queries against the data platform, requiring entitlements for their access. With numerous users, each with different access permutations, the traditional approach of creating multiple copies of data with different access becomes problematic. Apart from the risk associated with data duplication, adhering to regulations like GDPR becomes challenging. Organizations need a single copy of the data while accommodating the growing demand for self-service interactive queries.

This is where attribute-based dynamic data masking, a key feature of the Immuta platform, becomes crucial.

Immuta helps solve the problem without requiring users to manage code across multiple data platforms. Customers can't keep creating multiple copies of data or establish a data breadline, as previously mentioned. Immuta's solution eliminates the need for technical change management across various data platforms, making the process seamless.

Regarding the data analyst experience, this is a vital aspect, especially for our product team. Optimizing for having cognitive empathy for users and making correct product choices are essential considerations. This is particularly true for customers who have their own customers and provide entitlement as a service.

For data analysts and users conducting interactive queries subject to entitlements, it is essential to minimize or ideally eliminate technical change management when interacting with the platform. Immuta ensures near-zero change management by protecting tables using features like Snowflake column-level access policies and row access policies. Users can query the same table with the same queries, but the data they receive is filtered based on their entitlements according to internal and external regulations.
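The behavior described above, where every user runs the same query against the same table but gets results shaped by their entitlements, can be sketched in a few lines. This is an in-memory illustration only, not how Snowflake or Immuta implement row access or masking policies: the `query` function, the sample rows, and both rules (region-based row filtering, SSN masking for users without a `pii_cleared` attribute) are assumptions made for the example.

```python
ROWS = [
    {"customer": "Ana", "ssn": "123-45-6789", "region": "EU"},
    {"customer": "Ben", "ssn": "987-65-4321", "region": "US"},
]


def query(rows, user):
    """Return the same table for every caller, filtered and masked by the caller's attributes."""
    visible = []
    for row in rows:
        # Row-level rule (assumed): users see only rows from their own region.
        if row["region"] != user["region"]:
            continue
        out = dict(row)  # copy, so the single underlying table is never modified
        # Column-level rule (assumed): only users cleared for PII see the raw SSN.
        if not user.get("pii_cleared", False):
            out["ssn"] = "***-**-" + row["ssn"][-4:]
        visible.append(out)
    return visible


analyst = {"region": "EU", "pii_cleared": False}
auditor = {"region": "EU", "pii_cleared": True}
```

Calling `query(ROWS, analyst)` and `query(ROWS, auditor)` uses the identical "query" on a single copy of the data; the analyst gets a masked SSN and the auditor gets the raw value, and neither sees the US row.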

This approach differs from earlier solutions that relied on an access-controlled proxy. If you're investing $20 million a year in Snowflake but telling nobody to query Snowflake directly, there's a fundamental issue. Immuta's integration with Snowflake is highly successful because it avoids unnecessary change management for users.

Our product partnership with Snowflake's team, from top to bottom, has been remarkable. Similarly, the integration with Databricks, particularly with Unity Catalog, is significant. Our Unity Catalog integration, released on May 2nd, enables zero technical change management for data protection in Databricks, regardless of the specific Databricks product being used. This integration has the potential to be transformational for our customers.

A lot of the highest-flying private companies are in what you might generically call cybersecurity. What are the tailwinds driving cybersecurity generally, and is it a big tailwind behind Immuta? How else would you taxonomize the things that are tailwinds propelling Immuta?

Cybersecurity is certainly a tailwind and we doubled down on that realization this year by introducing a new product called Detect. As I mentioned earlier, the Immuta Secure product has been established for around six to seven years and is deeply focused on data protection. However, Detect takes a different approach by addressing the notion that people may not always be aware of what needs to be protected. While our existing product allows users to specify access restrictions for tables, there are numerous use cases where users lack awareness.

Detect helps identify scenarios such as overprovisioned access or excessive querying by a single user during odd hours. It provides insights and alerts that can be shared with data owners and platform administrators. The magic of Detect lies in its ability to create a detect-and-secure loop. It points out potential issues, flags them for review, and offers remediation. With our six years of experience building the policy engine, Detect can generate possible requirements based on the existing access patterns to your data. Users can then simply click a button to remediate the identified issues using the Secure product.
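The two signals mentioned above, overprovisioned access and querying at odd hours, can be approximated from a grant list and a query log. The sketch below is a simplified illustration and not how Detect works internally: the `GRANTS` and `QUERY_LOG` structures, the function names, and the 9-to-18 business-hours window are all assumptions.

```python
from datetime import datetime

# Hypothetical inputs: which tables each user is granted, and a query log
# of (user, table, timestamp) entries.
GRANTS = {"alice": {"orders", "payroll"}, "bob": {"orders"}}
QUERY_LOG = [
    ("alice", "orders", datetime(2024, 1, 15, 10, 15)),
    ("bob", "orders", datetime(2024, 1, 15, 3, 5)),
    ("bob", "orders", datetime(2024, 1, 15, 3, 20)),
]


def unused_grants(grants, log):
    """Grants never exercised in the log: candidates for over-provisioned access."""
    used = {}
    for user, table, _ in log:
        used.setdefault(user, set()).add(table)
    findings = {}
    for user, tables in grants.items():
        never_used = tables - used.get(user, set())
        if never_used:
            findings[user] = never_used
    return findings


def odd_hour_queries(log, start_hour=9, end_hour=18):
    """Count queries per user that fall outside the assumed business hours."""
    counts = {}
    for user, _, timestamp in log:
        if not (start_hour <= timestamp.hour < end_hour):
            counts[user] = counts.get(user, 0) + 1
    return counts
```

In a detect-and-secure loop, each finding would become a proposed policy change (for example, revoking an unused grant) that a data owner reviews before it is applied.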

This is where we see significant synergistic advantages between the two modules or products. While many solutions can identify potential problems, we can close the loop by securing access based on the insights provided by Detect. By combining these capabilities, we offer a comprehensive solution for data protection and remediation.

You mentioned Wiz, and then there's a few other companies that feel either adjacent or competitive like Rubrik and BigID. A lot of them use terms like data visibility, data control, and data access to describe what they do. Can you talk about some of these subcategories, where they overlap, and where they butt up against each other?

While I hesitate to say that Immuta's module or product directly competes with BigID, there is some overlap. One of the use cases that customers have is the need to semantically protect data based on attributes. Immuta addresses this with its Discover module, which allows users to identify and classify attributes within their data. Similarly, BigID offers a product that analyzes and categorizes data types. Both products serve valid use cases, but with some differences. Immuta's Discover module focuses on identifying and tagging classified entities to enforce relevant policies. For instance, it can automatically tag a column as an address, which would then be subject to a policy stating that only specific groups can access it. BigID also provides similar capabilities with Blue Code and integrates with other products in the ecosystem.
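The tag-then-enforce flow described above can be sketched as two small steps: classify a column from sampled values, then let a policy key off the resulting tag. This is a toy illustration, not the Discover module's classifier: the regular expressions, the 0.8 match threshold, the `ALLOWED_GROUPS` table, and the function names are assumptions, and email and ZIP code patterns stand in for the address example because they are simpler to match.

```python
import re

# Assumed, deliberately simple patterns; real classifiers are far more robust.
PATTERNS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "us_zip": re.compile(r"^\d{5}(-\d{4})?$"),
}


def tag_column(samples, threshold=0.8):
    """Return the tags whose pattern matches at least `threshold` of the sampled values."""
    tags = set()
    for tag, pattern in PATTERNS.items():
        hits = sum(1 for value in samples if pattern.match(value))
        if samples and hits / len(samples) >= threshold:
            tags.add(tag)
    return tags


# Hypothetical policy table: which groups may read columns carrying each tag.
ALLOWED_GROUPS = {"email": {"support"}, "us_zip": {"support", "logistics"}}


def may_read(user_groups, column_tags):
    """A column is readable only if the user is in an allowed group for every tag on it."""
    return all(bool(user_groups & ALLOWED_GROUPS.get(tag, set())) for tag in column_tags)
```

Because the policy refers to the tag rather than to a specific column, any newly discovered column that gets tagged `email` is covered without writing a new rule.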

It's important to note that Immuta's three modules, Secure, Detect, and Discover, form a data security platform where they symbiotically feed into one another. Each module plays a significant role, and the value they provide is enhanced when used together. We have customers who use BigID alongside Immuta, as well as customers who solely use Immuta's Discover module without BigID. Similarly, we have customers who use Immuta as their data catalog, even though we're not primarily a data catalog, while others choose to use tools like Alation. We remain agnostic to these choices, as our product functions well in both scenarios. Considering the market size, it's evident that there is room for multiple products to serve different use cases and cater to different buyers in various circumstances.

A lot of the core of Immuta seems to be protecting data, but internally, versus a product like BigID, which is maybe more focused on protecting data in the abstract. Is that accurate?

Absolutely, insider threats are a significant concern, and implementing best practices is crucial not only for security but also for adhering to legal requirements and facilitating audits. In line with this, our Detect product includes the Unified Audit Model (UAM). UAM allows for a unified audit log across different data platforms, leveraging a unified model enriched with policy information in Immuta.

Similar to how Immuta abstracts over data platform differences when securing data, we also do the same for audit purposes. This means that regardless of the data platform used, Immuta provides a single audit log in a unified model, incorporating policy information. For instance, if someone runs a query and accesses specific data due to a particular policy, this information is captured in the unified audit log. While Snowflake alone cannot provide this capability, through collaboration and authorization, Immuta can ingest and leverage Snowflake's audit logs to enable the Unified Audit Model.

The Unified Audit Model unlocks programmable compliance, making it easier to demonstrate compliance to auditors. Typically, CISOs, individuals in the office of CISO, or those working in SOCs may have auditors spending hours or even days observing and analyzing their activities. However, with the Unified Audit Model, we provide an API that generates a report showcasing every query executed, the data accessed, the underlying policies, and their representation. This significantly streamlines the auditing process and saves time for both auditors and security professionals.
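One way to picture a unified audit log is as a set of small adapters that map each platform's raw records onto one shared shape, enrich them with policy information, and feed a single report function. The sketch below is an assumption-laden illustration, not the Unified Audit Model's actual schema or API: the `AuditEvent` fields, both raw log layouts, the `APPLIED_POLICY` lookup, and the `report` function are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuditEvent:
    """One record shape for every platform (field names are assumptions)."""
    platform: str
    user: str
    table: str
    policy: str


# Hypothetical lookup standing in for policy enrichment: table -> policy applied.
APPLIED_POLICY = {"accounts": "mask_ssn", "claims": "eu_rows_only"}


# Hypothetical raw log layouts; real platform audit logs differ from these.
def from_platform_a(raw):
    table = raw["OBJECT"]
    return AuditEvent("platform_a", raw["USER_NAME"], table, APPLIED_POLICY.get(table, "none"))


def from_platform_b(raw):
    table = raw["resource"]["table"]
    return AuditEvent("platform_b", raw["actor"], table, APPLIED_POLICY.get(table, "none"))


def report(events, user):
    """Everything one user touched, across platforms, with the policy that applied."""
    return [(e.platform, e.table, e.policy) for e in events if e.user == user]


events = [
    from_platform_a({"USER_NAME": "alice", "OBJECT": "accounts"}),
    from_platform_b({"actor": "alice", "resource": {"table": "claims"}}),
    from_platform_b({"actor": "bob", "resource": {"table": "invoices"}}),
]
```

Here `report(events, "alice")` returns one list covering both platforms, which is the kind of output an auditor could consume without reading each platform's native log format.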

In addition to mitigating insider threats, the Unified Audit Model offers programmatic compliance, ensuring that the right individuals within an organization have appropriate access while facilitating efficient audits and providing peace of mind to security teams.

Rubrik crossed over into cybersecurity, but they were initially focused on data apps and data recovery. Part of doing that is securing the data. That seems to maybe create overlap with a product like Immuta. Do you in general see more companies crossing over into cybersecurity and having some overlap with a company like Immuta and what's your take on that?

While it's true that some companies, like Rubrik, have expanded into the cybersecurity space and may overlap with aspects of Immuta's offerings, the overlap is not extensive. Rubrik primarily focuses on data protection, including Microsoft 365, ransomware monitoring, and application-related services. They have a broader scope and cover a wider range of areas.

On the other hand, when it comes to Immuta's specialization, we have a deep integration with cloud-native analytical data platforms such as Databricks and Snowflake. Our focus is on providing a comprehensive data security platform with a particular emphasis on securing and governing data access in these specific environments.

Rather than viewing it as an "Immuta versus Rubrik" scenario, it's more accurate to see it as an "Immuta and Rubrik" situation. While there may be some overlap in certain areas, the two companies address different problems and can complement each other. If organizations require comprehensive data security and protection, a data security platform like Immuta can be utilized alongside solutions such as Rubrik, Wiz, or others, depending on the specific needs and requirements.

Snowflake and Databricks are investors in Immuta, but Immuta also helps unlock more data for them—it’s a revenue driver for them. Does the fact that you're a partner, and you also drive more revenue for them, and you both have these products around data access, create a threat that they can become a frenemy and try to build a more integrated platform that eats up the Immuta use case? What makes it such that Immuta necessarily needs to be separate, or is there such a dynamic at all?

I mean, it's a good question. If I said we weren't discussing these questions internally two years ago, I'd be lying. However, what we've observed is that as our data platform partners enhance their data access control features, it improves our integration. Our customers demand seamless data protection during interactive queries without any noticeable impact from our product. Each time a data platform invests in access control features, it enhances our product and drives more sales. Over the past two years, since deepening our partnership with Snowflake, we've seen this play out successfully. We're also witnessing similar positive outcomes with Databricks and their Unity Catalog.

While it's uncertain what the future holds, our experience so far indicates that deeper investments by data platforms in this space benefit both parties. Data platforms are unlikely to abstract themselves to accommodate multiple competitive products. It's important to note that Databricks and Snowflake are not allies, but rather competitors. However, there is audience overlap, and our executive buyers and technical buyers sometimes align. The primary focus for Snowflake and Databricks is to build the best product for their core audiences. Snowflake, for instance, has established a standard, though not an open one, for row and column-level permissions, and their product choices have been widely followed due to their user-friendly design and ease of use.

In summary, I don't anticipate data platforms extending much beyond making it as easy and clear as possible to express policy, particularly in SQL or their platform-specific query languages. We believe that our specialized data security platform and the capabilities offered by our partners can coexist and thrive together.

Over the near-term time horizon—let’s say 5 years—do you see Immuta adding more products, or do you see enabling more data or adding more seats as the main driver of growth? If everything goes right for Immuta, what has it become in some more abstract sense and how has the world changed as a result?

What we imagine is a world where not being able to secure access is not a reason that’s ever cited for not adopting a modern data platform, period, end of story. I think we, our customers, our future customers, and our partners would all be very happy if that's the change that we create in the world.

Disclaimers

This transcript is for information purposes only and does not constitute advice of any type or trade recommendation and should not form the basis of any investment decision. Sacra accepts no liability for the transcript or for any errors, omissions or inaccuracies in respect of it. The views of the experts expressed in the transcript are those of the experts and they are not endorsed by, nor do they represent the opinion of Sacra. Sacra reserves all copyright, intellectual property rights in the transcript. Any modification, copying, displaying, distributing, transmitting, publishing, licensing, creating derivative works from, or selling any transcript is strictly prohibited.
