Background
Matt Sornson is the former CEO of Clearbit. We talked to Matt to better understand Clearbit's competitive positioning with companies like Apollo and Segment, Clearbit's 5-year vision for transitioning from data company to workflow company, and how Clearbit's trajectory has been altered with the rise over the past few years of product-led growth and the modern data stack.
Questions
- What was behind the initial product-market fit of Clearbit? What made it different or better compared to incumbents?
- Can you talk a little about the world pre-Clearbit and what it looked like for sales/marketing teams that wanted to enrich their customer data?
- Can you talk about how you were collecting all this data differently from a company like ZoomInfo?
- How important was the proprietariness of Clearbit’s firmographic data set to initial PMF? How true is that still today, and to what degree has the data aspect of it been commoditized or not, and why?
- Clearbit has gone all-in on the “MQL” model enriching prospects with information about company size, revenue, sector, etc.—that’s in contrast to the “PQL” model of more recent, trendy tools that look at a user’s engagement behavior to determine their readiness to be sold. How do you think about these two models—are they complementary, is one or the other better for companies at different stages, or are they for different internal teams? Does not touching PQLs limit the upside potential of Clearbit’s growth given the rise of PLG?
- Can you talk about Clearbit’s customers and how your revenue mix breaks down between startups/SMBs/enterprise?
- Where are you most looking to expand in terms of the types of companies that you serve?
- How much of what Clearbit is doing today with the platform is about trying to productize some of the use cases that companies have discovered using the API over the years?
- What do you see Clearbit becoming when it “grows up”—an API business or a platform business? If it’s both, then why—how might the API side be additive to the platform side and vice versa? What benefits do you get from vertical integration? What additional upside does the Platform provide?
- Can you walk us through your pricing, unit economics and margin profile for the API and the platform?
- What changed after the failure of Rapleaf that opened up the opportunity for Clearbit?
- What leads you to believe you have the core competency to transition from a data company to a workflow company?
- Do you see Clearbit potentially displacing any other tools as part of that—like Segment on the data side, or something like Hubspot on the application side?
- With the API product, Clearbit is Switzerland / neutral with respect to the CRM, outreach tool and sales & marketing tool generally that the customer uses. How do you think about navigating building both an API business that provides Clearbit data to other platforms and an in-house platform that might be competitive with some of those platforms? What are the risks and how do you mitigate them?
- Can you talk about the competitive set of companies like Apollo that are vertically integrated sales & marketing intelligence companies? How do you think about it being both data & this vs. a more pure vertically integrated play?
- How does the thesis that every B2B SaaS app will be built on the cloud data warehouse touch on Clearbit’s positioning in this ecosystem or expand Clearbit’s degrees of freedom, if at all?
- Fundamentally, the platform is a means of productizing the stuff that a sufficiently-motivated company could’ve built using Clearbit’s APIs and then making it possible to sync those into their various CRM and CDPs?
- How does the platform go beyond just product-izing the use cases that people came up with via the APIs to something that is generative of new value?
Interview
What was behind the initial product-market fit of Clearbit? What made it different or better compared to incumbents?
To be completely frank with you, what made us different was that we had an API that 1) you could sign up for with a credit card, and 2) was significantly easier to use than any of the other products out there.
REST APIs were a newish thing then, and we built five nice ones: a person API, a company API, a watchlist API, a risk API, and a logo API.
The initial idea for the company was data APIs for developers: so we’d get every different kind of company data, we'll build a really nice API on top of it, and sell that to developers.
Can you talk a little about the world pre-Clearbit and what it looked like for sales/marketing teams that wanted to enrich their customer data?
The joke we had is that the pre-Clearbit world of enrichment was really just a bunch of people selling CSVs.
At that time, for example, ZoomInfo sold a lot of CSVs—but they didn't have a Salesforce integration, didn't have a CRM integration, and didn't have a marketing automation integration. They sold PDFs, and that was what the ‘data’ business was.
You can even go as far back as Dun & Bradstreet—they started as a company of couriers and lawyers in different cities, writing notes down in a notebook, and copying those notebooks. That was the origin of the Dun & Bradstreet product.
When we started, there were a handful of people offering built-in integrations. There were the FullContacts of the world. There was a handful of companies in the B2C space that were a little bit more integrated in the DMPs. But on the B2B side, it was still a lot of manual entry, and a lot of CSV work: you had to buy a CSV, clean it, de-dupe it, and then upload it into Salesforce.
That led to the second phase of Clearbit’s product market fit, which was when we launched no-code or low-code data integrations.
We launched that because of the trend of people taking our APIs and then trying to pump that data into Salesforce, or into Marketo, or onto their website, wherever they were using it.
We followed that trend, and built integrations for those people. So, you had plug and play integrations, and the thing that was different is we had a really nice real time API. On the back end, we were able to enrich on the fly. If you added a new record to Salesforce, it was enriched immediately. The rest of the market would do that once a day via a cron job.
Can you talk about how you were collecting all this data differently from a company like ZoomInfo?
We did it very differently. We weren’t going to go hire researchers—we wanted to get as much of this as we could programmatically. The way we built that loop is we went out and built a system that could take a domain name, or an email address, and then go hit every single public API or public service we could find to pull data back.
So, everything from Google Search results to about.me, to GeoIP data. We just found public APIs or APIs that we could pay for, or places we could scrape. We could go in real time and build a profile about that person and company. At one point there were about 215 different data sources, and as the company got bigger, we built our own data assets and data sources that were a little bit more reliable, but also defensible, and used less and less of the smaller public ones.
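As a rough illustration of the loop described above (take a domain, hit every available source, merge what comes back into one profile), here is a minimal Python sketch. It is not Clearbit's code: the three source functions, their return values, and the "earlier source wins" merge rule are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Optional

Profile = Dict[str, str]
Source = Callable[[str], Optional[Profile]]


# Hypothetical stand-ins for public APIs, paid APIs, or scrapers.
def search_source(domain: str) -> Optional[Profile]:
    return {"name": domain.split(".")[0].title()}


def geoip_source(domain: str) -> Optional[Profile]:
    return {"country": "US"}


def flaky_source(domain: str) -> Optional[Profile]:
    raise TimeoutError("source did not respond")


def safe_call(source: Source, domain: str) -> Profile:
    """Call one source; a failing or empty source contributes nothing."""
    try:
        return source(domain) or {}
    except Exception:
        return {}


def enrich(domain: str, sources: List[Source]) -> Profile:
    """Query every source concurrently and merge the results.

    Assumed merge rule: when two sources report the same field,
    the one listed earlier wins.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(sources))) as pool:
        # pool.map returns results in the same order as `sources`.
        results = list(pool.map(lambda s: safe_call(s, domain), sources))
    profile: Profile = {"domain": domain}
    for result in results:
        for field, value in result.items():
            profile.setdefault(field, value)
    return profile


profile = enrich("example.com", [search_source, geoip_source, flaky_source])
```

Running the sources concurrently is what keeps the lookup to roughly the latency of the slowest source rather than the sum of all of them, and swallowing per-source failures means one unreliable source does not block the profile.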
How important was the proprietariness of Clearbit’s firmographic data set to initial PMF? How true is that still today, and to what degree has the data aspect of it been commoditized or not, and why?
I think it's important to know that person and contact data is not proprietary. It's email addresses, phone numbers, number of employees, location. Most of it, almost all of it, is public data or public facts. It really is a coverage question more than a proprietary question: which provider has the most of whatever field you care the most about. But there's no moat there except time. Finding this data is very, very easy, and hundreds of companies have done it.
Our differentiator in the beginning was that we could do it on any company, no matter the size, because we started with the website, then hit all these public sources, and could do it in a couple of seconds. You could give us a domain name we'd never seen before and, two or three seconds later, we'd have a profile about that company. That's how our network built out. If we hadn't seen something, we could go find it. That was the tech and the differentiator. That made us really good at long-tail and smaller companies when the incumbents weren't, because they were focused on the Fortune 500 or 5,000.
Over time, they got better at the long-tail. Everyone did. Web scraping became more of a thing, APIs evolved, the technology evolved. Salesforce made their libraries much easier to work with and integrate with. Today, I would say, in person and account data, there's very little differentiation left. The place where there's proprietary data today is around insights and signals. Not who is this, not the facts of what this account is and who this person is, but what have they done recently? Do I have any signal that they're in market? Do I have any signal that they're not in market? And that comes down to what's changing on their website? What's in the news? What websites have they visited? Have they visited my website? And the Reveal product is our biggest differentiator in that world today.
Clearbit has gone all-in on the “MQL” model enriching prospects with information about company size, revenue, sector, etc.—that’s in contrast to the “PQL” model of more recent, trendy tools that look at a user’s engagement behavior to determine their readiness to be sold. How do you think about these two models—are they complementary, is one or the other better for companies at different stages, or are they for different internal teams? Does not touching PQLs limit the upside potential of Clearbit’s growth given the rise of PLG?
It's not the way we pitch it, but it definitely can. The thing that we do for marketing teams more than for sales teams, is we help them generate demand. That comes from before signing up for the product. Everything until they fill out a form, or until they create that free trial, we don't target that many leads. We don't have that many conversion-based marketers that are fitting your PLG profile today. We have a lot of those teams that use this for demand, but we're not trying to build which person your sales rep should sell to right now.
Can you talk about Clearbit’s customers and how your revenue mix breaks down between startups/SMBs/enterprise?
Our biggest segment is our growth segment, I think it's like 40 or 50% of customers, and that's mid-market. We have about sub-10% enterprise, and the rest is startup. We don't break it into the four categories, we kind of have growth as our middle category. We have a lot of technology startups, as you would guess. A lot of B2B SaaS. Not that much outside of those worlds, other B2B, but B2B SaaS is the big one.
Where are you most looking to expand in terms of the types of companies that you serve?
More internet companies. Media consumer companies, absolutely. Those are all becoming use cases. As we build more of the front end marketing execution tooling we're seeing a lot more people come in. And we're being pulled up-market as well. I would say any enterprise or larger company that has a data science team, that's very much the center of gravity that's pulling us up, as data science teams become a bigger and bigger part of those organizations.
How much of what Clearbit is doing today with the platform is about trying to productize some of the use cases that companies have discovered using the API over the years?
I've used it on a few pitch decks, and that’s how we talk about the company at large. But we built APIs in the beginning for developers. We turned it to product people and growth people, and we were incredibly lucky in the people that found us and wanted that, at that moment, were the cutting edge of go-to-market marketing rev-ops. They built some incredible things. Every single product we've built has been inspired by watching people use the primitives that we previously built. It wasn't a conscious decision, but about two years in, we realized the type of business we were building, which honestly looks a lot like AWS in some ways, where you start with primitives, you see what value people get on top of those primitives, and you build your way out.
We started with the data points of the data APIs, the primitive, and it became integrations, and then it became extensions. We built extensions for sales and CS, they're Chrome extensions. It was like, ‘How do we get this data into more people's hands?’ And then the platform is like, ‘How do we help the companies that can't build it themselves get it everywhere they need it?’ And what that means is how do we make it one place where they can do all their configuration for sales automation, marketing automation, advertising automation, advertising audiences, and build things like Guillaume's Reveal loop as a product.
What do you see Clearbit becoming when it “grows up”—an API business or a platform business? If it’s both, then why—how might the API side be additive to the platform side and vice versa? What benefits do you get from vertical integration? What additional upside does the Platform provide?
I think it stays core. It's something we’ve thought a lot about, because there is some inherent tension between the two. If you have APIs that people can use, some of those use cases, they might be able to do them cheaper, directly, than we would charge them on the platform. There are ways to game the pricing mechanism, etc. But we decided we didn't want to lose the innovation edge, which was having so many amazing people building cool stuff on top of it. It gives us insight into what's possible, but also what's working.
And actually, the platform really opens up a whole new set of things you can build on top of the APIs, because there are platform APIs (I think they will come out officially in Q4 here) which let you pull data from the platform and use that in other places. What that really lets you do is pull your CRM data combined with Clearbit data and insight data, and now you can pull the computed or aggregate schema out to use it somewhere else. We're really just increasing the surface area of APIs, but helping do a little bit of data transformation along the way.
Can you walk us through your pricing, unit economics and margin profile for the API and the platform?
For margin, I don't have the exact percentage, but it's like 80 to 83%. The cost of data is relatively low.
What changed after the failure of Rapleaf that opened up the opportunity for Clearbit?
Honestly, very little. Rapleaf got in trouble for doing some shady shit. What changed was the proliferation of public data and the expectation that that data is fair game to be used. People's Twitter profiles and LinkedIn bio descriptions, all of that. The expectation today is that information, once it's out there, people have access to it, and can see it. That has changed a little bit over time. But I actually don't think there was any market shift between Rapleaf's failure and our coming to life. Actually, I think that that happened the year after we started.
What leads you to believe you have the core competency to transition from a data company to a workflow company?
The fundamental difference between building the application layer the way we’ve done it, versus the way that many others have done it, is we think about all the Clearbit data, and the universe of data, as existing before your data does.
You buy a HubSpot or you buy a CRM, you buy whatever, it's an empty database, right? You slowly start filling it up with your company data.
Ours is the opposite. You show up and integrate the platform, and you're plugging into the universe of available business data. It's not empty when you get in there. You could’ve not even plugged in your CRM yet, and it's not empty. You can create audiences, you can create workflow automation, you can connect your website and connect your CRM, and you have everything instantly. You're not waiting to integrate some other data set. You're not doing the clean and de-dupe. It's flipping the sales and marketing database model on its head. We're letting you integrate with our database, versus we're integrating with yours.
Do you see Clearbit potentially displacing any other tools as part of that—like Segment on the data side, or something like Hubspot on the application side?
It's a good question. The honest answer, and I don't know if it's appropriate for this, necessarily, but I see marketing and go-to-market as the next place where composable and headless become a thing. Where all of your marketing experiences, your marketing campaigns, your landing pages—which is, I feel like marketing sites are a core part of your marketing activity—all become these completely customizable experiences that you can build, that you can pull data from tons of different sources.
It doesn't really matter if your data is in HubSpot, or Salesforce, or Snowflake, or Clearbit, you're able to pull the bits of it you need from each place to run your campaign. GraphQL and headless is making that a technical possibility. I think that's the future, that's where Clearbit ends up is it's like, your composable logic layer. That's really what it is today, with a few marketing execution tools on the end of it. But I wouldn't be surprised if that's where we end up. Will that eat some stuff? Yes, I think that probably eats email, it probably eats website personalization tools, or could. I think it could eat the email marketing use cases, the web forms use case for sure. The website personalization use cases, because it just becomes a better system of record to do that. And I don't think those platforms can technically adapt to that new future.
With the API product, Clearbit is Switzerland / neutral with respect to the CRM, outreach tool and sales & marketing tool generally that the customer uses. How do you think about navigating building both an API business that provides Clearbit data to other platforms and an in-house platform that might be competitive with some of those platforms? What are the risks and how do you mitigate them?
It's one of those necessary evils/costs of success as you build an infrastructure tool and also application tools. Apple, Amazon, AWS, the Apple App Store: all of these are the consummate examples of that. Not putting us in the same echelon, but as you find out what people get a lot of value out of from the infrastructure side, you build applications to suit that. And part of the way you learn that is through people building on top of you.
Can you talk about the competitive set of companies like Apollo that are vertically integrated sales & marketing intelligence companies? How do you think about it being both data & this vs. a more pure vertically integrated play?
We think of that as farther up the stack. Maybe we'll eventually be there, but that's not where we play today. We started with APIs, the core fundamentals, and we built these no-code integrations, then we built these extensions, and we built this data management and automation platform for creating these data loops. But we're really still in growth marketing, data engineering, and demand marketing land, where they're building ongoing campaigns and loops. Things like Apollo, primarily, are for individual reps to go and build a list and email them. It's like being closer to the frontline salesperson, and we're still below them. Marketing teams don't use Apollo much. I'm sure there are use cases, but that's not the way that the market thinks about it today.
Part of the reason we have such a big sales mind-space is we got almost a million sales reps to install our extension, which was a way to get our data in front of people, but also a data acquisition tool for us. But that did make a lot of people think of us as a sales tool.
How does the thesis that every B2B SaaS app will be built on the cloud data warehouse touch on Clearbit’s positioning in this ecosystem or expand Clearbit’s degrees of freedom, if at all?
No one cares about where the data is; everyone wants to drive outcomes. The outcome they want to drive, on the marketing side, is they need audiences and targeting that works. And then, maybe it's differentiated from their competitors. Picking who you talk to is probably the biggest thing you can do, in terms of the differentiation on the marketing side. They want to drive good go-to-market front-end decisions all the way down.
That's what the company cares about more than anything. They don't really care about your data warehouse, your data orchestration, your ETL. If you look at the fastest-growing companies, their ETL, reverse ETL, is a fucking mess, right? They're using Census, they're using Airbyte. They're moving things around. They're using dbt to try to stay clean in the data warehouse, but struggling to do so, because the dbt model hasn't been updated, or someone put it in a different table, or a different warehouse.
The number of steps today between the person running campaigns and the data entering the system is just way too high. The data warehouse has opened an opportunity for a simpler world, and we're slowly moving towards it. It's better than the old world where we had no idea who was in our advertising audiences, and how they related to who was in our CRM. And that was two years ago. We're slowly getting better. But it's still a bit of a mess. The go-to-market, especially smaller company go-to-market, has not solved the ‘we have too much data’ problem yet. We very much still live in that ‘we have too much data and don't know what to do with it’ world.
The one future-looking thing is, the rate of change, the rate of introduction of new channels, and of new types of marketing and sales activity, continues to increase. The dominant systems today are not built to be all that flexible for the inputs and outputs. Think of, like, a HubSpot—if now, 30% of a company’s demand budget is going into influencer campaigns, HubSpot's not really equipped to do anything for you there. As the inputs to your data management system change more and more rapidly, the systems become more and more brittle, or less able to keep up. This has very little to do with work there is today about that, this is just my future market thoughts. A lot of the tools are band-aids on top of that.
Zapier is probably the best example out there. Yes, Census, Hightouch, RudderStack—like all of these guys that are building newer ways to move data around. At some point, it's going to be abstracted even more so, where it's not like, ‘if this thing happens in this system, do this in the other system.’ It's going to be, when anything happens in the map of my business, like a new record is added anywhere, pull that data through, and I can decide. New piece of data comes in to my middle layer, the rules that happen there have nothing to do with the previous tool, they just have to do with the new data. It's not like a new form submit, it's like a new lead record. And a new lead record could be made by six or seven other different workflows.
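To make the contrast above concrete, here is a small Python sketch of a middle layer where rules are attached to a record type ("a new lead record") rather than to the tool that produced it ("a new form submit"). The class, method, and source names are hypothetical and only illustrate the idea; they do not describe any real product's API.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Record = Dict[str, str]
Rule = Callable[[Record], None]


class MiddleLayer:
    """Hypothetical rule layer keyed on record type, not on source tool."""

    def __init__(self) -> None:
        self._rules: Dict[str, List[Rule]] = defaultdict(list)

    def on(self, record_type: str, rule: Rule) -> None:
        """Register a rule that runs for every new record of this type."""
        self._rules[record_type].append(rule)

    def ingest(self, record_type: str, record: Record, source: str) -> None:
        """Accept a new record from any upstream workflow.

        The source is kept on the record for auditing only;
        it plays no part in deciding which rules run.
        """
        tagged = {**record, "_source": source}
        for rule in self._rules[record_type]:
            rule(tagged)


routed: List[str] = []
layer = MiddleLayer()
layer.on("lead", lambda r: routed.append(r["email"]))

# Two different upstream workflows create the same kind of record.
layer.ingest("lead", {"email": "a@example.com"}, source="web_form")
layer.ingest("lead", {"email": "b@example.com"}, source="chat_widget")
# A different record type does not trigger the lead rule.
layer.ingest("account", {"domain": "example.com"}, source="crm_sync")
```

Adding a seventh workflow that creates leads would not require a seventh rule here; it would only need to call ingest with the "lead" type.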
Fundamentally, the platform is a means of productizing the stuff that a sufficiently-motivated company could’ve built using Clearbit’s APIs and then making it possible to sync those into their various CRM and CDPs?
It's product-izing the things that people were previously doing with the APIs. ICP reporting generation. One, creating ad audiences by combining Clearbit data with your personal data. Two, doing sales alerting: a high-value customer or lead hits your website, it alerts sales and automates the email. Guillaume's post, which was the Reveal loop, I think is what he called it, is a product called Capture. What Capture does is take everyone that visits your website and automatically create accounts and contacts for you in Salesforce. It auto de-dupes against your data set, because we have a copy of your database, and it can do that for you. It lets you set the rules on what roles and seniorities you want for your contacts, all that. I think you hit it right on the head: everything people could do with APIs, we just slowly built out those use cases for more people to be able to do.
How does the platform go beyond just product-izing the use cases that people came up with via the APIs to something that is generative of new value?
The platform pulls in all of Salesforce, all of Marketo, in our case, all of Snowflake, and lets you see every single person across all those systems and every single account across all those systems. For us, there's half a million companies that have interacted with us. It's using our Reveal product to log all of the website data, so, when these companies visit, if their JavaScript is in product, all of their product page URLs as well. So it starts to build that full customer picture.
Then it lets you create these audiences, which our segmentation tool lets you see ‘this is a lost lead that is back on the website,’ so like, ‘let's re-engage them.’ What this lets you do is create these little loops, or create these little fragments of logic, or full audiences and audience definitions, that then can be used in other places. For something like this, this might be something that you wanted to trigger an outreach campaign based on. The API lets you basically say, ‘is it in segment? Trigger this campaign.’ Those platform APIs let companies do that, and marketers do that, but also let other companies do that. So like a Drift or a Qualified, or whatever, can integrate Clearbit, and can use that segmentation that you created, and those audiences you created to run marketing plays, or run sales plays.
The other piece here is advertising. This lets you create audiences based on all of the data in your system of record, but also the entire Clearbit universe. So, you can do every marketer at a company over 50 people in the United States and Western Europe. That becomes an audience that's deduped against your data set, and then turned into an advertising audience. The platform lets you do a lot of interesting things with that. It could be a person we track here as someone converts from that audience and becomes a lead in your system. That's an event that you could be pulling from the API to trigger things. It's really opening up APIs so you can start taking action based on changes within your data loop, all the data within your system.
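The audience example above (every marketer at a company over 50 people in the United States and Western Europe, deduped against your own data) can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the Person fields, the country list standing in for "Western Europe", and the choice to dedupe by lowercased email with the CRM record winning.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class Person:
    email: str
    role: str
    company_employees: int
    country: str


# Assumed, partial stand-in for "United States and Western Europe".
TARGET_COUNTRIES = {"US", "GB", "FR", "DE", "NL"}


def in_audience(p: Person) -> bool:
    """Marketer, at a company over 50 people, in a target country."""
    return (
        p.role == "marketing"
        and p.company_employees > 50
        and p.country in TARGET_COUNTRIES
    )


def build_audience(crm: Iterable[Person], universe: Iterable[Person]) -> List[Person]:
    """Combine CRM records with the wider data set, one entry per email.

    CRM records are visited first, so they win over a duplicate
    record for the same email in the wider data set.
    """
    seen: Set[str] = set()
    audience: List[Person] = []
    for person in [*crm, *universe]:
        key = person.email.strip().lower()
        if key in seen:
            continue
        seen.add(key)
        if in_audience(person):
            audience.append(person)
    return audience


crm = [Person("ann@acme.com", "marketing", 200, "US")]
universe = [
    Person("ANN@acme.com", "marketing", 200, "US"),    # duplicate of CRM record
    Person("bob@tiny.io", "marketing", 30, "US"),      # company too small
    Person("cara@globex.de", "sales", 500, "DE"),      # not a marketer
    Person("dev@globex.fr", "marketing", 80, "FR"),    # qualifies
]
audience = build_audience(crm, universe)
```

The resulting list is what would be handed to an ad platform as an audience; a person moving from the wider data set into the CRM is the kind of change the passage above describes exposing as an event.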
Disclaimers
This transcript is for information purposes only and does not constitute advice of any type or trade recommendation and should not form the basis of any investment decision. Sacra accepts no liability for the transcript or for any errors, omissions or inaccuracies in respect of it. The views of the experts expressed in the transcript are those of the experts and they are not endorsed by, nor do they represent the opinion of Sacra. Sacra reserves all copyright, intellectual property rights in the transcript. Any modification, copying, displaying, distributing, transmitting, publishing, licensing, creating derivative works from, or selling any transcript is strictly prohibited.