Zach Lloyd, CEO of Warp, on the 3 phases of AI coding

Jan-Erik Asplund

Background

We’ve covered AI coding from the early days of AI-in-the-IDE with GitHub Copilot & Cursor ($500M ARR in May 2025) to the insurgent rise of command line coding apps like Claude Code ($400M ARR in July 2025, up from $17.5M in April).

To learn more about the evolution of AI coding and where it’s going, we reached out to Zach Lloyd, the co-founder & CEO of the AI terminal & agentic coding company Warp ($17M Series A, Dylan Field).

Key points from our conversation via Sacra AI:

  • Software development has moved away from the IDE as the “Microsoft Word for code” toward AI-assisted IDEs, starting as AI-in-the-IDE via tab autocomplete in GitHub Copilot (2021) and then getting popularized as the AI-IDE via chat panels in Cursor ($500M ARR in May 2025) & Windsurf ($82M ARR in July 2025 when acquired by agent company Cognition). “There are three phases to AI coding and AI development. The first is autocomplete. Copilot was there first. Cursor did it better than Copilot and has this very magical experience where you open a file, place your cursor, maybe type a few characters, and it reads your mind and completes the function. That found strong product-market fit about a year ago.”
  • As AI assistance has progressed from code generation & handling one-shot tasks to planning & executing on multi-step workflows, human developers have moved upstream in the development process to tech-lead orchestration, with interfaces flipping to an agent-first workflow of chat in the terminal (Claude Code, OpenAI’s Codex, Warp) and code review of agent work behind that in the diff/code-review pane. “Instead of doing development by hand, where you open an IDE, find the files, and type code, you prompt an agent to do the bulk of the coding. You provide context. The interface is not so much the IDE. It is the terminal, Warp, or a chat panel, a place where you tell the computer what to do... For pro developers, it is still early. Most have not fully adopted a workflow that starts with a prompt.”
  • Fully autonomous agents promise to operate not just across your code base but throughout your deployment, dev tools, and project management SaaS, spinning up on CI failures (CircleCI), production errors (Sentry), or new tickets (Linear) and building features and fixing issues without interactive prompting and human intervention. "We want to be the layer where you orchestrate development agents. The terminal is the best place because it is the easiest place to integrate data from many systems and have agents react to events. That is what we are going for. It will take time, but that is the vision."

Questions

  1. In short, what is Warp?
  2. Can you give us a brief history of AI coding in general—starting with Copilot, tab-complete, and tools for writing larger chunks of code—and how those capabilities have evolved?
  3. Run us through a brief history of agentic AI development with Devin, Claude Code, Codex, and Warp. Devin started with a hands-off agent via Slack; Claude Code and Codex are more command-line based—how has this evolved? Bringing us to now, how do you slot Warp into that?
  4. Can you say a little more about Warp CLI and how that works?
  5. And Claude Code SDK is similarly a way to abstract Claude Code off your single machine.
  6. Thinking about that vision, does it still make sense to compare Warp to terminals like Ghostty?
  7. GitHub Copilot started inside the IDE; Cursor and Windsurf forked the IDE and became the IDE; Devin brought a chat/Slack paradigm; now there’s the terminal/command-line paradigm. What do you see as the advantages of each, and do you expect them to converge or diverge?
  8. Who is the core Warp customer today? What are the core use cases? And how much of this is vibe coders vs. professional developers?
  9. Cursor has some of its own models; Claude Code is vertically integrated with Claude. How should we think about Warp’s approach here—closer to those paths or a different direction entirely?
  10. How do you think about gross margin—where’s the opportunity to drive more margin over time (e.g., pricing, guardrails, routing), and what lessons have you taken from how AI-native IDEs handle this?
  11. With IDEs, switching from Cursor to Windsurf is relatively easy. What are the defensible layers for a terminal—workflows, telemetry, data exhaust, distribution—that make it hard to copy?
  12. What are the design considerations for making Warp “social”—are you seeing pull toward shared workspaces, collaborative debugging, or is the core value still a highly individualized developer tool? To what extent is Warp being used in multiplayer/team contexts versus as a single-player terminal?
  13. Software engineering has changed a lot in the last few years. Looking 1, 2, and 3 years ahead, how will it be different?
  14. How do you think about bringing end-to-end testing, debugging inside the IDE/terminal, and automating PRs/code review? Are we heading towards all of this being integrated into an all-in-one engineering environment?
  15. Warp is adjacent to tools like Sentry and other in-app instrumentation. How do you think about integrating with—or competing against—observability and experimentation stacks?
  16. You mentioned MCP. I saw someone say the most popular use of MCP right now is building MCP servers. What does it take to get this protocol into broader use?
  17. Windsurf’s journey surfaced questions about independence in dev tools. What room is there for independent tools, and how does that help Warp’s positioning?
  18. A lot of project/task management lives in Linear and is fed as context into development. Is there a world where Warp “eats” pieces of Linear so that project work and management live inside Warp?

Interview

In short, what is Warp?

Warp is what we call an agentic development environment, which we believe is the best way to code with agents.

Warp was originally a terminal. The genesis in 2020 was building a more usable terminal. More generally, it is an interface for telling your computer to do things, and the computer does them for you. You can do that in the language of terminal commands. Or you can do it in English or natural language.

If you use natural language, instead of running a terminal program, it launches an agent. Agents in Warp can write code, set up projects, deploy to production, help figure out why production servers are crashing, and use the filesystem, Git, and Docker. It is a wide range of things because of the layer we are at, which is the terminal layer.

Can you give us a brief history of AI coding in general—starting with Copilot, tab-complete, and tools for writing larger chunks of code—and how those capabilities have evolved?

There are three phases to AI coding and AI development. The first is autocomplete. Copilot was there first. Cursor did it better than Copilot and has this very magical experience where you open a file, place your cursor, maybe type a few characters, and it reads your mind and completes the function. That found strong product-market fit about a year ago.

The second phase is interactive agents. This is Warp and Claude Code. Instead of doing development by hand, where you open an IDE, find the files, and type code, you prompt an agent to do the bulk of the coding. You provide context. The interface is not so much the IDE. It is the terminal, Warp, or a chat panel, a place where you tell the computer what to do. That is the second phase, agentic development. We are very much in that phase. For pro developers, it is still early. Most have not fully adopted a workflow that starts with a prompt.

The third phase is more automated development. You see this with Cognition, the Claude Code SDK, or Warp with Warp CLI. An agent can fully automate some aspect of the development process. You see it first on lower-complexity but impactful tasks that a good coding agent can one-shot: things like cleaning up dead code, automated linting, and removing outdated feature flags. All of that is slowly being automated. Those are the three phases.

Run us through a brief history of agentic AI development with Devin, Claude Code, Codex, and Warp. Devin started with a hands-off agent via Slack; Claude Code and Codex are more command-line based—how has this evolved? Bringing us to now, how do you slot Warp into that?

I break it into interactive and automated. Interactive means my workflow as a developer is to sit down at the tool, the terminal or Warp, type a prompt, provide context, kick off an agent, and steer it. I might go into my editor or use hand-editing features to get the change over the line.

The other kind of agent is more interesting to me. You are not babysitting it. There is a background process. It could be an agent going over your codebase looking for bugs and trying to fix them. It could be going through your backlog. It could be a crash where you have set something so that when a crash occurs, an agent does the initial investigation. That is the bifurcation. A system event triggers something one-shottable, or a developer sits there and does it.

For automated, non-interactive agents, there are two approaches. One is Devin’s positioning, an AI software engineer, another member of your team. The other is what Warp is doing, closer to the Claude Code SDK. We provide programming tools for developers to build agentic systems that do part of your development for you. You program something in your CI pipeline that runs an agent when something happens.

Can you say a little more about Warp CLI and how that works?

The way to think about Warp CLI is that it lets you run an agent without using Warp’s GUI app. Today, putting the CLI aside, Warp’s agents run through a desktop app on your computer. You open it, type something, and the agent starts. The limitation is you have to be at your machine.

With Warp CLI, you can ask an agent to do something and it can run anywhere. It can run in your CI pipeline, on a production machine, or from Slack through a webhook. It does not need any UI. It needs a prompt. It still has the same context as a Warp user. It has your MCP servers, for folks who know those. It has external context sources. It can make coding changes. It can search your codebase. That is the full power of the agent. The big difference is that it is programmable. Does that make sense?

And Claude Code SDK is similarly a way to abstract Claude Code off your single machine.

Exactly. It is the same relationship. Running Claude Code in your terminal and typing into it in a repo, versus the SDK where you program against it. I am bullish on programming against agents because that is how you automate. You have a program that triggers the agent when something interesting happens.
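The pattern described here, a program that triggers an agent when something interesting happens, can be sketched in a few lines. This is a hypothetical illustration, not the actual Warp CLI or Claude Code SDK API: the `run_agent` stub stands in for a real SDK or CLI invocation, and the event names and prompt templates are invented.

```python
# Hypothetical sketch: dispatch an agent when a system event occurs.
# run_agent is a stand-in for a real SDK or CLI call; event names and
# prompt templates are invented for illustration.

PROMPTS = {
    "ci_failure": "CI failed on {ref}. Investigate the failing job and propose a fix.",
    "crash": "A crash was reported: {ref}. Do an initial investigation and summarize.",
    "new_ticket": "A ticket was filed: {ref}. Draft an implementation plan.",
}

def run_agent(prompt: str) -> str:
    # Stub: a real integration would invoke the agent here
    # (e.g. shell out to a CLI or call an SDK) and return its output.
    return f"[agent ran with prompt: {prompt}]"

def on_event(kind: str, ref: str) -> str:
    """Map a system event to a prompt and kick off an agent, no human in the loop."""
    template = PROMPTS.get(kind)
    if template is None:
        raise ValueError(f"unhandled event kind: {kind}")
    return run_agent(template.format(ref=ref))
```

In practice, `on_event` would be wired to a webhook or CI hook so that, for example, `on_event("ci_failure", "main@abc123")` runs without anyone sitting at a terminal.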

Thinking about that vision, does it still make sense to compare Warp to terminals like Ghostty?

We do not call Warp a terminal anymore. Our product philosophy is to be the best place for developers to build software with agents. It looks like a terminal and it works as a terminal because the terminal has been the place where developers tell the computer what to do with commands. That is not our future.

We are building first class agent support from first principles. We are not trying to be in the terminal camp or the IDE camp. We are our own thing that supports agentic workflows.

GitHub Copilot started inside the IDE; Cursor and Windsurf forked the IDE and became the IDE; Devin brought a chat/Slack paradigm; now there’s the terminal/command-line paradigm. What do you see as the advantages of each, and do you expect them to converge or diverge?

I see convergence. In Warp’s approach, some IDE elements are useful for working with agents, but not the ones IDEs put front and center. The most useful view for coding with agents is a diff view, what you see in code review. Your primary job is to review the code the agent writes. You can find that in an IDE, but it is not the primary interface. IDEs are optimized for writing code by hand. It looks like Microsoft Word for code.

We also brought a file tree into Warp. It is not for lots of file operations. It is useful for a human coding with an agent to see all the files and the context, then give that context to the agent. We are inspired by the IDE in places, but we are not trying to rebuild it. A lot of things do not need to exist in an agent-first world, or they should be buried.

Who is the core Warp customer today? What are the core use cases? And how much of this is vibe coders vs. professional developers?

Our core user and customer is a pro developer building software at work. That is who we want to build for. It is a better business than vibe coding. People also self-select into Warp because the terminal is traditionally a pro tool. There are easier on-ramps like Lovable or Replit. The terminal is more intimidating, but it is the most powerful interface for agent work. We are focused on pros, but about 20 percent of paying customers are non-technical. If you build something great for pros, others will leverage that power. If I had to pick one, it is the pro developer using Warp at work.

Cursor has some of its own models; Claude Code is vertically integrated with Claude. How should we think about Warp’s approach here—closer to those paths or a different direction entirely?

We explored fine-tuning and did not find it more effective. Unlike Claude Code, we are multi-model. We are not locked into one provider. Right now we use Claude for latency-sensitive tasks and o1 for higher-reasoning tasks where you can wait. We try to use the best model for the task.

The harness is ours. How we prompt, how we get context in, when we summarize if we hit the context window, when we truncate. The most important thing is a measurement-first approach. Offline evals, or evals from real-world usage. These models are non-deterministic, so you can think something is better when it is not.

How do you think about gross margin—where’s the opportunity to drive more margin over time (e.g., pricing, guardrails, routing), and what lessons have you taken from how AI-native IDEs handle this?

It is a challenge for us, and for most AI coding companies. In the short term, we need margins at a reasonable spot so we can grow. We are growing very fast. We are adding about a million in net new revenue every week. If margins are not good, that gets very expensive. We have to keep them in range.

In the long term, how good a business we can build at our layer depends a lot on competition at the layer below. If one model provider runs away with it, for example Anthropic, with a clearly better coding model, they will have pricing power, which is not good for us. If the market stays competitive, which I think is likely, there will be price competition. We already see o1 at a third of Sonnet’s price with comparable performance. We hope competition continues at that layer.

We can also align pricing so that as developers use Warp more, we make more. That is hard with fixed-request SaaS. We would like more usage-based pricing. We can make model usage more efficient. You do not need the most expensive model for every task, and high latency hurts the user experience. There are a lot of levers. I am bullish on a very big market for great coding tools. It is not my top concern, but it is something to watch.

With IDEs, switching from Cursor to Windsurf is relatively easy. What are the defensible layers for a terminal—workflows, telemetry, data exhaust, distribution—that make it hard to copy?

I like Warp’s position. We are one of one. Competitors are VS Code forks or chat panels in VS Code, or they are text-based terminal apps. There is not another app like Warp with the terminal form factor and IDE-style features. It was hard to build. We spent five years and went deep into OS integration. I do not know if it is a forever moat, but it is not easy to replicate. Relative to other tools, we are in a good spot.

What are the design considerations for making Warp “social”—are you seeing pull toward shared workspaces, collaborative debugging, or is the core value still a highly individualized developer tool? To what extent is Warp being used in multiplayer/team contexts versus as a single-player terminal?

Our original business model before AI was teamwork in the command line, and there is traction. The most interesting part now is shared context for AI. Not just human-to-human collaboration. Sharing MCP configurations, shared notebooks, shared commands, shared environment variables, sharing rules. That is cool for humans, and the superpower is that the agent can access it.

If your team is onboarded into Warp, someone can ask the agent to set up a new engineer’s development environment. That is very magical. It is a place we can differentiate and build stickiness. The more the agent knows your organization, the more annoying it is to switch.

Software engineering has changed a lot in the last few years. Looking 1, 2, and 3 years ahead, how will it be different?

A few things. One, there is a shift from developing by hand to developing by prompt. We are still early, especially for pros. Over the next year it becomes more common to start by telling an agent to do part of the work, then iterate to completion.

At the same time, the more boring and one-shottable parts of development get automated, which engineers will like. People worry that AI will take jobs. In professional settings it takes away the annoying parts.

It will be harder for junior engineers to get jobs. If your only skill is building web apps or simple mobile apps, I would be concerned. I would upskill fast, so I am more experienced and more competent than an agent.

In three years, a larger share of code is written by AI. Developers act more as tech leads for the AI, managing it and making sure the product gets built correctly.

How do you think about bringing end-to-end testing, debugging inside the IDE/terminal, and automating PRs/code review? Are we heading towards all of this being integrated into an all-in-one engineering environment?

That is our vision. We want to be the layer where you orchestrate development agents. The terminal is the best place because it is the easiest place to integrate data from many systems and have agents react to events. That is what we are going for. It will take time, but that is the vision.

Warp is adjacent to tools like Sentry and other in-app instrumentation. How do you think about integrating with—or competing against—observability and experimentation stacks?

It will be a mix. For Sentry, we are not going to rebuild crash reporting. An integration makes sense. In Warp you can already do it through MCP. It would be cool to tag Warp on something. Some integrations will live in our app. Some will be outside.

For some things, we will pull functionality into Warp. For example, code review. We now have something that looks like GitHub’s code review interface right in Warp. It belongs in Warp because the agent builds the change in Warp. You should not have to leave the tool with all the context. We will keep investing there where it makes sense.

You mentioned MCP. I saw someone say the most popular use of MCP right now is building MCP servers. What does it take to get this protocol into broader use?

It is funny. We see real MCP usage in Warp. People use third party servers like GitHub’s and Figma’s. The predominant tool in Warp is still CLI apps, which makes sense. Our original take when we saw MCP was that if there is already a CLI, there is not a strong reason to use MCP. LLMs are great at CLIs. They are text based, configurable, well documented, and fast. They generally just work. You might install them with Homebrew, but they work. So most tool usage is CLIs.

I am still bullish on MCP. The idea is not that deep. It is an API an agent can use to get information from external systems. Long term, I do not know what else would make more sense.
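The core idea Lloyd describes, an API an agent can use to reach external systems, can be reduced to a small sketch. To be clear, this is not the real MCP protocol (which is JSON-RPC based with defined tool schemas); it is a minimal stand-in that only illustrates the concept of named tools an agent calls with structured arguments.

```python
# Conceptual sketch of what MCP provides: a registry of named tools an
# agent can invoke with structured arguments. NOT the real MCP protocol;
# illustration only.

from typing import Callable, Dict

class ToolRegistry:
    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., str]] = {}

    def register(self, name: str, fn: Callable[..., str]) -> None:
        self._tools[name] = fn

    def call(self, name: str, **kwargs) -> str:
        # The agent selects a tool by name and passes structured arguments,
        # analogous to calling a tool exposed by an MCP server.
        if name not in self._tools:
            raise KeyError(f"no such tool: {name}")
        return self._tools[name](**kwargs)

registry = ToolRegistry()
# Hypothetical tool, standing in for something like a GitHub issue lookup.
registry.register("get_issue", lambda issue_id: f"issue {issue_id}: open")
```

The contrast with CLIs that Lloyd draws is that a CLI already gives the agent a text-based, well-documented calling convention, so a registry like this only earns its keep when no CLI exists.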

Windsurf’s journey surfaced questions about independence in dev tools. What room is there for independent tools, and how does that help Warp’s positioning?

Our goal is to grow as an independent company. People think the market is more mature than it is. It is early. No one has won. Six months ago autocomplete was driving traction. That has changed in the last six months. I do not know who will win, but I like our chances based on product, team, investors, and a big user base. Our goal is to keep growing. It is an exciting, impactful space, and I want to keep going.

A lot of project/task management lives in Linear and is fed as context into development. Is there a world where Warp “eats” pieces of Linear so that project work and management live inside Warp?

It is interesting. We are not building that now, but it might make sense because Warp is where you do task execution. Warp can take a task description, a prompt plus context, and do it. Tools like Linear are houses for task descriptions. You place them and shuffle them, then a human does the work and reports back.

In the near term we will integrate. We will likely do an integration with Linear, which we use. Conceptually, a task system is another way to specify a prompt and put it into a priority queue.
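The framing of a task system as a priority queue of prompts can be made concrete with a short sketch. The structure here is hypothetical, invented only to illustrate the point, using Python's standard `heapq`:

```python
# Sketch of the observation that a task tracker is, conceptually, a
# priority queue of prompts waiting for an executor (human or agent).
# Hypothetical structure, illustration only.

import heapq
import itertools

class TaskQueue:
    def __init__(self) -> None:
        self._heap = []
        self._counter = itertools.count()  # tie-breaker for equal priorities

    def add(self, priority: int, prompt: str) -> None:
        # Lower number = higher priority, like P0/P1 labels in a tracker.
        heapq.heappush(self._heap, (priority, next(self._counter), prompt))

    def next_task(self) -> str:
        """Pop the highest-priority prompt; an agent would then execute it."""
        _, _, prompt = heapq.heappop(self._heap)
        return prompt

queue = TaskQueue()
queue.add(1, "Fix the flaky login test")
queue.add(0, "Production is down: investigate crash in checkout")
```

Popping from this queue always yields the P0 item first; in the integration Lloyd describes, the "pop" would hand the prompt plus context to an agent rather than to a person.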

Disclaimers

This transcript is for information purposes only and does not constitute advice of any type or trade recommendation and should not form the basis of any investment decision. Sacra accepts no liability for the transcript or for any errors, omissions or inaccuracies in respect of it. The views of the experts expressed in the transcript are those of the experts and they are not endorsed by, nor do they represent the opinion of Sacra. Sacra reserves all copyright, intellectual property rights in the transcript. Any modification, copying, displaying, distributing, transmitting, publishing, licensing, creating derivative works from, or selling any transcript is strictly prohibited.
