Renen Hallak, CEO of VAST Data, on AI agents creating infinite storage demand

Jan-Erik Asplund

Background

To learn more about where the AI stack is heading as agentic workloads scale, we reached out to Renen Hallak, founder and CEO of VAST Data ($30B valuation, $841M raised, Norwest).

Key points from our conversation via Sacra AI:

  • In a data center, hundreds of thousands of GPUs need constant shared access to training data, checkpoints, embeddings, model outputs, and agent memory, pushing AI labs (xAI), neoclouds (CoreWeave, Lambda, Crusoe), and hyperscalers (Microsoft, Google) toward VAST Data’s separated compute/shared flash architecture (which maps well to GPU clusters with many machines reading and writing to one data pool) versus high-speed storage systems designed for supercomputing (Weka, DDN) and reliable storage for basic IT (Pure, Dell, NetApp). "In computer science, there is a trade-off between price, performance, scale, resilience, and ease of use. Systems are good at one or two of these things, not all of them. For AI, we needed one system that was dramatically better than the best at all of these things simultaneously. We needed to break that trade-off. That is what our architecture does; we call it disaggregated shared everything... The new stack needs something far more scalable and performant than what HPC ever was, while still being resilient, easy to use, and secure, because enterprises need to put these workloads into production."
  • As AI shifts from chatbots to agents, massive amounts of unsupervised tool calls, generated code, and persistent agent memory require multimodal filesystems that handle PDFs, audio, and video rather than just structured data, and these compounding volumes of data must be stored and remain accessible at high speed, driving storage demand up and flash memory prices up 8x, a tailwind for infrastructure that makes storage more efficient and works with low-cost or retrofitted flash. "Agents are also generating information, AI coders generating software, AI filmmakers generating video, multiplied by millions, then billions, then trillions of agents, creating three compounding exponents of data that has to be stored forever with very fast access... which is why we built our own data reduction mechanism, our own efficient data protection mechanism, and our own data placement mechanism that allows you to use low-cost flash instead of high-end enterprise flash."
  • The upside opportunity is to serve as the unifying AI software operating system that powers the infrastructure layer of NVIDIA CEO Jensen Huang’s AI five-layer cake, abstracting away the GPUs & hardware underneath for the developers building models & applications above, expanding the market from model companies & neoclouds into the long tail of enterprises building their own secure, access-controlled agentic applications. "As we add all of these parts, we are filling out the software infrastructure layer that Jensen described with his five-layer cake analogy... abstracting this new hardware — GPUs, DPUs, TPUs, fast networking, large SSDs — away from new applications and providing easy APIs and tools for a far larger number of AI applications to be built, because to move from early adopter AI labs into enterprises, things need to be very simple, very secure, and compliant with regulation."

Questions

  1. You founded VAST Data in 2016. What was the founding insight behind the company?
  2. You're now calling this an AI operating system. What does that mean exactly, and what was the big technical challenge you overcame to get here?
  3. Your 1,000-plus customers have access to the Data Store, the Database, and the Data Engine. Do they typically start with the Data Store as a wedge? Is there a typical adoption ladder, or are many customers adopting the full stack from the start?
  4. In percentage terms, what is the rough split between customers using VAST DataStore only versus those consuming three or more parts of the bundle?
  5. A VAST social post yesterday mentioned that this year you would be working more closely with CSPs, Google and Microsoft. Can you say more about that?
  6. You have mentioned maintaining 90% gross margins, which is surprising for an infrastructure company. What drives that margin profile and how have you kept it stable at scale?
  7. How do customers evaluate VAST? From a non-technical perspective, you seem to overlap with a lot of products across different areas. What are they evaluating against, other vendors, internal builds, and what criteria drive those decisions?
  8. Would you push back against a taxonomy that lumps VAST in with companies like Weka, DDN, Pure Storage, or Dell?
  9. Is there a coopetition dynamic with Databricks and Snowflake, or are they part of the old stack too?
  10. The memory price surge is getting a lot of attention. As an all-flash company, does that affect you?
  11. How should we think about the neocloud market evolving? Are you seeing convergence? Are they building differently from each other?
  12. If consolidation happens, does it matter much to you as long as the lion's share of capacity standardizes on VAST?
  13. Thinking about a future with billions of agents interacting with, generating, and holding data, you mentioned a 1,000% increase in demand coming. Is VAST prepared for that world? What still needs to be built?
  14. Your post mentioned one or more VAST clusters deployed in space by the end of the year. Can you say more about that?
  15. Doubling back to a customer persona we have not fully covered, Sovereign AI. How meaningful is that relative to the broader commercial opportunity?
  16. Commercially, what matters more to you right now, expanding your GPU footprint horizontally, or moving up the value stack with customers adopting more of the Data Engine and higher-level services?
  17. Last question: given everything you have described, why is VAST relatively under-discussed compared to companies like OpenAI, Anthropic, or even Databricks? It does not have the same household name recognition.

Interview

You founded VAST Data in 2016. What was the founding insight behind the company?

We started in 2016. I came from EMC, which was at the time the biggest storage company in the world. We started thinking about this in 2015, right after Google bought DeepMind in 2014. For the first time since I was in school, when we learned about neural nets as a curiosity, something that was an interesting idea that didn't really work, DeepMind and a few others proved that it does work, that it just needs a lot more access to a lot more data. They showed that you can get a computer to mimic the human brain without really understanding how the human brain works: recognizing cats in YouTube videos, playing Go, doing things that were not trivial. It was already clear that if you could give these algorithms more access to more data, you could get them closer and closer to the human brain, and maybe eventually surpass it in terms of intelligence.

I looked around at the infrastructure at EMC and saw that none of the systems were good enough for this new world. We had a clear trade-off between fast systems that were small and expensive, and large systems that were slow and cost-effective. That was fine when you were only doing analysis on numbers and columns in a database. You didn't need fast access to pictures, video, sound, or genomes. But this new era did. It was no longer a few CPUs doing the analysis; it was a lot of GPUs. Now we see clusters close to a million GPUs in a single cluster. We needed a new architecture, a new way of building infrastructure that provided fast access to a lot of data. That's where we started.

You're now calling this an AI operating system. What does that mean exactly, and what was the big technical challenge you overcame to get here?

The big technical challenge was building that new architecture. In computer science, there is a trade-off between price, performance, scale, resilience, and ease of use. Systems are good at one or two of these things, not all of them. For AI, we needed one system that was dramatically better than the best at all of these things simultaneously. We needed to break that trade-off. That is what our architecture does. We call it disaggregated shared everything. Everything we have built since is on top of that architecture, which allows us to meet the scale, performance, and requirements of these new workloads.

In the early days, nobody wanted to use our product because nobody needed that level of performance and scale yet. The only ones we could get to work with us were companies on the bleeding edge. This was pre-generative AI, so it wasn't OpenAI and xAI yet. It was hedge funds doing quant trading, life science institutes doing genomics and medical imaging analysis, autonomous driving companies building self-driving capabilities, and government agencies analyzing satellite imagery. That turned out to be a blessing in disguise, because they showed us the future of what AI could be and how it could evolve. Over the years, they kept asking for more and more functionality.

We started by giving them an unstructured data store with a file interface and an object interface and added storage functionality as they asked for it. Then they said they also needed us to break the trade-off for structured data. We needed to build a new type of database that could store a trillion rows, because you now have a trillion vector embeddings, and that could handle millions of agents querying in parallel, because that is the world we are heading into. So in addition to the VAST DataStore, we built the VAST DataBase, again based on the same architecture.

Then the world shifted to generative AI and our customer base evolved to include not just those early adopters but model builders like Microsoft, Mistral, and xAI, and new AI clouds like CoreWeave, Lambda, and Crusoe. These new AI companies didn't want to build applications in the old way of writing to a data platform and reading from it. They wanted everything to be data-driven. So we added what we call the VAST DataEngine, which is the compute orchestration layer. It triggers functions based on data as it comes in, and as we learn new things about the data, metadata triggers other functions, training, inference, fine-tuning, running on different types of processors at different levels of urgency and across different geographies.

As we add all of these parts, we are filling out the software infrastructure layer that Jensen described with his five-layer cake analogy: power, hardware, software infrastructure, models, and applications. We are filling in that middle layer. The reason I call it the operating system is because that is what it is. We are abstracting this new hardware, GPUs, DPUs, TPUs, fast networking, large SSDs, away from new applications and providing easy APIs and tools for a far larger number of AI applications to be built.

Today we have chatbot applications, and we are starting to have software developer applications. But applications for the virtual lawyers, accountants, architects, and designers, and then for robots that need to learn construction work, gardening, and carpentry, all of those require the operating system. To move from early adopter AI labs into enterprises, things need to be very simple, very secure, and compliant with regulation. The operating system is what allows that to happen. We are now adding security features, observability features, agentic habitats, RAG-in-a-box, all part of the software layer we provide.

Your 1,000-plus customers have access to the Data Store, the Database, and the Data Engine. Do they typically start with the Data Store as a wedge? Is there a typical adoption ladder, or are many customers adopting the full stack from the start?

The early customers definitely started with just the VAST DataStore, because that is all we had. Some of them are still using just the storage system; others have over the years moved up to the higher layers. That is the first pillar of customers.

The second pillar, the AI labs and AI clouds, most of them started with the whole stack, because it was available and they needed all the parts to build new AI applications. Some AI clouds are still using just one component, like the VAST DataBase and the VAST DataStore, because they are offering specific services to their end-users and can pick and choose. Some limit what their end-users can access from our platform; others provide the full stack.

The third pillar, which we have been focused on for the last eighteen to twenty-four months, is enterprise. There, it really depends on what they are using us for. If they are starting with AI, they tend to use the whole stack. A lot of enterprises use what we call the VAST InsightEngine, developed in conjunction with NVIDIA.

It’s NVIDIA hardware underneath, NVIDIA models on top. A document enters through the VAST DataEngine, gets inferred, we store the vectors in the VAST DataBase, and then you have a chatbot interface to all of your information with access control enforced. If you are in HR, you see certain information. If you are in engineering, you see different information. You get different answers to the same questions. That enterprise may not even know all the parts inside; they just bought a solution. Other enterprises start with us as a backup system and then gradually add more over time.
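The flow described here, ingest a document, infer, store vectors, answer queries with role-based access control, can be sketched in a few lines of Python. All names and the toy "embedding" are illustrative assumptions; this is not VAST's actual API.

```python
# Minimal sketch of an ingest-and-query pipeline with access control
# enforced at the data layer. Purely illustrative, not VAST's API.
from dataclasses import dataclass, field

@dataclass
class VectorStore:
    # maps doc id -> (embedding, roles allowed to read it)
    rows: dict = field(default_factory=dict)

    def ingest(self, doc_id: str, text: str, allowed_roles: set):
        # stand-in "inference": a toy bag-of-vowels embedding
        embedding = [text.count(c) for c in "aeiou"]
        self.rows[doc_id] = (embedding, allowed_roles)

    def query(self, role: str):
        # a caller only sees rows whose policy includes their role,
        # so HR and engineering get different answers to the same question
        return [doc_id for doc_id, (_, roles) in self.rows.items()
                if role in roles]

store = VectorStore()
store.ingest("salary-bands.pdf", "compensation data", {"hr"})
store.ingest("design-spec.pdf", "engine architecture", {"engineering", "hr"})

print(store.query("hr"))           # ['salary-bands.pdf', 'design-spec.pdf']
print(store.query("engineering"))  # ['design-spec.pdf']
```

The point of the sketch is that the access decision lives next to the data, not in the application above it.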

In percentage terms, what is the rough split between customers using VAST DataStore only versus those consuming three or more parts of the bundle?

Two years ago, VAST DataStore only was probably 70%. Today it has flipped. The bundle is about 70%. The AI companies have been growing so fast, and they tend to use more parts of the stack.

A VAST social post yesterday mentioned that this year you would be working more closely with CSPs, Google and Microsoft. Can you say more about that?

Some of this is confidential, but both companies have put out press releases about our collaboration. Before getting into deeper integrations, you can use VAST today in those clouds as a software layer. As part of our Data Space, we have companies using VAST in Amazon, in CoreWeave, and on-prem, and we stitch all of that together into one abstraction layer across infrastructure providers and geographies.

What we are working on with the cloud providers is a deeper integration, because today the lower layers of their stack were not built for AI. You do not get the same level of performance, cost-effectiveness, or scalability in those big clouds that you get in an AI cloud or on-prem. We’re working with the major cloud providers on deeper integrations to better support AI workloads.

Wherever there is AI, whether inside the cloud or outside it, those companies are embedding our software layer as the middle-of-the-stack component, sitting on top of the hardware layers, whether in-house or from NVIDIA, AMD, or Cerebras, and underneath their own models.

You have mentioned maintaining 90% gross margins, which is surprising for an infrastructure company. What drives that margin profile and how have you kept it stable at scale?

We have not kept it stable. It has been going up. Two years ago it was 88-89%, and last year it was 93-94%. That does not sound like a big difference, but it represents roughly halving our cost structure. Our sales are roughly tripling year over year, while headcount, which for a software company correlates directly to our expenses, has been growing by about 30% year over year. The gap between those two keeps widening.
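The arithmetic behind "roughly halving our cost structure": gross margin is revenue minus COGS, so the number that moves is the COGS share of revenue. Using the midpoints of the ranges quoted above:

```python
# COGS as a share of revenue is 1 - gross margin.
margin_two_years_ago = 0.885   # midpoint of the 88-89% range
margin_last_year = 0.935       # midpoint of the 93-94% range

cogs_then = 1 - margin_two_years_ago   # 11.5% of revenue
cogs_now = 1 - margin_last_year        # 6.5% of revenue

# A ~5-point margin gain cuts the COGS share by a bit under half:
reduction = 1 - cogs_now / cogs_then
print(f"COGS share fell from {cogs_then:.1%} to {cogs_now:.1%}, {reduction:.0%} lower")
```

A roughly 43% reduction in the COGS share of each revenue dollar, which is what "roughly halving" refers to.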

The majority of our COGS is support, mostly support headcount with a little cloud cost mixed in. As customers grow with us, we do not need proportionally more support people. Support headcount correlates to the number of customers, not to the size of the estate within each customer. Our existing customer base has been more than doubling with us year over year, so a big chunk of our growth comes from existing customers, and that is where the efficiency gains compound.

How do customers evaluate VAST? From a non-technical perspective, you seem to overlap with a lot of products across different areas. What are they evaluating against, other vendors, internal builds, and what criteria drive those decisions?

The only ones that could potentially build internally are the massive AI labs. One of them started before we existed and tried to build its own system. Once it realized how difficult that is and that we existed, it shifted to us. This is a massive undertaking. We have been working on it for ten years with more than 700 developers. Not many organizations can do that. Even the big clouds are talking to us about adopting our stack because they realize how large an undertaking it is and that the market is here now.

In terms of other options, we are competing against the old stack, old storage companies, old database companies, old compute orchestration companies. If someone is running old applications, the old stack can get them by. If you are running business intelligence or big data workloads, it will work, though it will not be efficient, fast, or truly scalable. Whenever anyone starts to run AI workloads, it breaks down, and that is when they need us. After they start using us for new workloads, they realize they want fast access to all their historical data too. They find that we provide the same interfaces as the old stack, just a better architecture. So they can run old applications on top of us as well. That is when they start migrating old workloads, and we become a bridge between the old world and the new.

Would you push back against a taxonomy that lumps VAST in with companies like Weka, DDN, Pure Storage, or Dell?

Those are two different sub-categories of the legacy stack. Two of those are old HPC companies. They build parallel file systems that were great for supercomputers, where you had dozens of PhDs turning knobs and tuning a machine for one scientific simulation run at a time, fast and relatively scalable, but requiring downtime for hardware problems or upgrades, very difficult to use, and designed to work on one project at a time.

The other two companies you mention build enterprise storage solutions for the old IT stack. They are not as fast or scalable as the HPC systems, but resilient, easy to use, secure, and compliant with regulation.

The new AI stack needs something far more scalable and performant than what HPC ever was, while still being resilient, easy to use, and secure, because enterprises need to put these workloads into production. Our architecture allows us to break those trade-offs and sit at that intersection. That is why we are being adopted for the new stack while those companies still serve old workloads. But that is just the storage layer.

Given the capabilities we’ve built into the AI OS, the same goes for the database layer; we compete with the old stack broadly. What you mentioned are just subcategories of it.

Is there a coopetition dynamic with Databricks and Snowflake, or are they part of the old stack too?

You can use our platform with whatever interface you prefer. If you want to use an S3 interface and treat us as an object store, you can put Databricks on top.

The more fundamental story: traditional data warehouse competitors have a lot of functionality we do not have, but if you need to store a trillion rows of vector embeddings or query at massive parallelism, they do not scale to that level, because they were built on the old architecture and could not leverage the underlying components we used starting in 2016, 2017, 2018. If you are running an AI workload, you need our VAST DataBase. Once you start using it, you realize you can probably put your old data warehouse on it too. We are adding more features over time such that you will not need the old stack anymore. That is the dynamic.

The memory price surge is getting a lot of attention. As an all-flash company, does that affect you?

It affects everybody. It does not affect our cost structure directly because we sell software. Our customers buy their hardware separately. But it affects them significantly: they cannot get their hands on enough capacity and the hardware they do find is becoming much more expensive.

We are actually a benefit to customers in that environment, because everything we built into the platform early on was around efficiency. Using VAST software, you can store ten times more on the same hardware than you could with any other system. We built our own data reduction mechanism, our own efficient data protection mechanism, and our own data placement mechanism that allows you to use low-cost flash instead of high-end enterprise flash that is now even more expensive. We designed a program called Amplify, which allows customers to rip out existing software from hardware they already own and put us on top to get more out of it, because new flash is so hard to find.

As for why this is happening: agents are essentially artificial people, some virtual, some physical in the form of robots, cars, and drones. They have experiences: they see things, hear things, scour the internet. Everything they see, read, and hear needs to be saved so we can understand what went into their fine-tuning, what information they used to make decisions. All that data that did not need to be saved before now does. Agents are also generating information, AI coders generating software, AI filmmakers generating video, multiplied by millions, then billions, then trillions of agents. It is three compounding exponents of data that has to be stored forever with very fast access. That is why flash prices went up 8x in six months.

Today we are attached to millions of GPUs. If you do the math, it is more than 90% of available GPUs that we are attached to. As we triple year over year, this year we will account for more than a third of the overall data center NAND market. The market is growing very fast, and we are gaining share at a rate I do not see from anyone else.

How should we think about the neocloud market evolving? Are you seeing convergence? Are they building differently from each other?

Two years ago there were three big clouds. Now there are roughly 100 AI clouds alongside those three. They exist because the big clouds became so large that they could not build the new stack fast enough, so the neoclouds built it instead. In some cases, the big clouds are actually renting capacity from neoclouds for that reason.

There are definitely too many for the market to sustain right now and there likely will be some convergence. Some small ones may become big; others will get acquired. Anyone above a certain scale has standardized on VAST, because when you are tiny, the old stack still seems to work. Once you reach that certain scale, it breaks down and they move to us.

A lot of them are built differently. Some are focused on larger customers; others on the long tail. Some build bespoke solutions; others are building cloud-as-a-service with multi-tenant offerings. It has been interesting to watch over the last couple of years.

If consolidation happens, does it matter much to you as long as the lion's share of capacity standardizes on VAST?

We care about the end-users. The AI clouds are a channel to end-users. If we are working directly with the model builders and enterprises on the other side, and there is consolidation or one AI cloud goes out of business, those customers will shift to a different AI cloud, a hyperscaler, or go on-prem. Because they built on top of our system, they will continue to use our system wherever they end up.

Thinking about a future with billions of agents interacting with, generating, and holding data, you mentioned a 1,000% increase in demand coming. Is VAST prepared for that world? What still needs to be built?

It is a lot more than 1,000%. Consider software development. Take a project that would take 200 people two years. You could put 200 agents on it and they would be ten times faster, so about two months. But that is not fast enough. To get it done in five minutes, you need a million agents. And I want a feature film developed for me in a few minutes, not over a couple of quarters, so I need a million filmmaker agents. The numbers are staggering. It will not be one piece of software developed for a million people; it will be one piece of software developed for one person. We are talking many orders of magnitude larger as this plays out.

In terms of what we still need to build: a lot. Imagine robots walking around and doing things. The operating system needs to enforce policy: that robot is not allowed to pick up that hammer; no robot is ever allowed to harm a human. These policies need to be enforced such that even if a robot is compromised by malware, it still cannot do what it is not allowed to do. An investment agent must not be allowed to spend more than the budget it has been given. The operating system needs to be within the data path of every action these agents take, which means being embedded in the device, not just in data centers.

We also need a way to observe what agents are doing. Every piece of agentic communication needs to be stored and queryable. Why did that agent give that response two months ago? Why did it approve one mortgage and decline another? You need to know what model it was running and all the information it had available at that point in time. You need a snapshot from two months ago and the ability to recreate the reasoning behind that decision.

We also need access control at the agent level. My personal agent should know everything I know. My travel agent should know my calendar and that I prefer window seats, but it should not know my Social Security number. What each agent can read, and what can be communicated between agents, all needs to be defined and enforced. And then there is fine-tuning of individual models: today everyone is on the latest version of a foundation model, but tomorrow every agent will have its own model fine-tuned based on its experiences. Everything that goes into that fine-tuning can potentially be leaked through interaction with the agent. We need to understand what is allowed to be shared and what is not. All of that needs to be built as part of this operating system.
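One way to picture the agent-level access control described above is a per-agent read policy checked on every access in the data path. This is a hypothetical sketch; the policy shapes, names, and data are illustrative assumptions, not VAST's design.

```python
# Hypothetical per-agent read policies, enforced on every access.
from dataclasses import dataclass

@dataclass(frozen=True)
class AgentPolicy:
    agent: str
    readable: frozenset  # data categories this agent may read

# The travel agent sees the calendar and seat preferences, never the SSN.
POLICIES = {
    "personal": AgentPolicy("personal", frozenset({"calendar", "preferences", "ssn"})),
    "travel":   AgentPolicy("travel",   frozenset({"calendar", "preferences"})),
}

USER_DATA = {
    "calendar": "Flight to SFO on Tuesday",
    "preferences": "window seat",
    "ssn": "***-**-****",
}

def agent_read(agent: str, category: str) -> str:
    # Every read passes through the policy check, so even a compromised
    # agent cannot reach categories outside its policy.
    policy = POLICIES[agent]
    if category not in policy.readable:
        raise PermissionError(f"{agent} agent may not read {category}")
    return USER_DATA[category]

print(agent_read("travel", "preferences"))  # window seat
```

Because the check sits in the read path rather than in the agent's own code, the guarantee holds even when the agent itself misbehaves, which is the property the answer above is asking for.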

Your post mentioned one or more VAST clusters deployed in space by the end of the year. Can you say more about that?

That was written by Jeff, my co-founder. We do have customers, including SpaceX, and some others working on space-related applications. My guess is that some of them are planning to put our system into space, but I do not have the specifics.

Doubling back to a customer persona we have not fully covered, Sovereign AI. How meaningful is that relative to the broader commercial opportunity?

A lot of people frame sovereignty as a question of location, that is, where the data sits. In reality, it’s an architectural problem. You need systems that enforce policy, access control, and isolation at the data layer, regardless of where the infrastructure runs. That’s where we fit naturally.

Our platform allows organizations to run a single operating model across sovereign environments, on-prem, in-country clouds, or hybrid, while maintaining strict control over data access, lineage, and compliance.

Everyone is concerned about maintaining independence in AI, and that is driving a lot of this behavior. It is a significant part of this new wave.

Commercially, what matters more to you right now, expanding your GPU footprint horizontally, or moving up the value stack with customers adopting more of the Data Engine and higher-level services?

I am greedy, so I will say both. We have geographic expansion, deeper penetration within existing accounts with more data under management and more workloads, new accounts, and moving up the stack with more software infrastructure services. We are doing all of those in parallel. There is no time to serialize them because our customers keep asking for more things, and we want to say yes and build it for them so they do not have to build it themselves.

Every technology revolution has had new hardware and a killer app. The PC had Intel processors and the word processor. Phones had cellular service. Then the operating system came. Windows made a thousand things possible on the PC, iOS made a thousand things possible on the phone. We are at that moment in AI. We have the GPUs, we have the TPUs, we have chatbots, we have the beginning of AI software development. Everything else still needs to happen, and we need to enable it through the operating system layer. There is no time to wait, and the fact that we are the ones doing this without seeing others in this space yet means we can grow very efficiently.

Last question: given everything you have described, why is VAST relatively under-discussed compared to companies like OpenAI, Anthropic, or even Databricks? It does not have the same household name recognition.

It is our style. We focus on our customers more than on advertising and getting our name out there. We do not need 100,000 customers. A lot of this is face-to-face interaction and word of mouth. We like to be under the radar. The model companies should have the limelight; we are the picks and shovels. We do not have the gold; we are just enabling those who do. That is the part of the stack we occupy, and it suits our style of being a little more introverted. The business is booming.

Disclaimers

This transcript is for information purposes only and does not constitute advice of any type or trade recommendation and should not form the basis of any investment decision. Sacra accepts no liability for the transcript or for any errors, omissions or inaccuracies in respect of it. The views of the experts expressed in the transcript are those of the experts and they are not endorsed by, nor do they represent the opinion of Sacra. Sacra reserves all copyright, intellectual property rights in the transcript. Any modification, copying, displaying, distributing, transmitting, publishing, licensing, creating derivative works from, or selling any transcript is strictly prohibited.
