Operations at Whop on using Claude to ship product & automate ops

Background

We spoke with a head of ops at Whop about how a non-engineer can ship production code daily using Claude and Cursor, how they run automated partner-tracking and onboarding workflows through Claude Cowork, and where human accountability still has to own the final decision.

Key points via Sacra AI

A non-engineer who runs trust, safety, and support ships production code multiple times a day through Claude on Cursor, and the thing that unlocks it is deep knowledge of the company's data tables and models rather than coding ability, collapsing a multi-week roadmap process into about an hour for changes like KYC modals. "While I don't have a formal engineering background, Claude has enabled me to code properly. I can ship product just as any other engineer, because if you know the data tables and models, know the names of certain fields, and know exactly what to ask and where to ask it, you can ship most code. I've gotten feedback from our lead and founding engineers that some of my code is truly indistinguishable from that of classically trained engineers with many years of experience."
Cowork runs as an autonomous EA that consolidates Gmail and Slack into daily reports and action items, with workflows like a payments-partner request tracker that updates a shared spreadsheet automatically at 8 AM every day, after a one-week testing period confirmed it was catching everything. "That spreadsheet is now updated daily at 8 AM automatically by Cowork without any intervention on my end. It executed a hundred percent. I had to iterate a few times to clearly define the parameters. What I did was have it test initially for about a week, and then I went in manually to make sure it was capturing a hundred percent of the emails or Slack messages I knew had come in. Aside from that one small test, I am a hundred percent hands off."
The autonomy boundary is drawn by reversibility and accountability: internal, low-stakes actions like posting Slack summaries and updating trackers run automatically, but external partner emails and anything touching compliance, security, or money movement always keep a human in the loop because you can't hold an AI accountable. "Anything around security, compliance, or money movements — those are my big ones where I don't trust handing off fully to an AI, because the consequences are very serious. You want some level of human accountability, because at the end of the day you need someone to point a finger at, and you can't point a finger at an AI. Those more sensitive areas need to be fully owned by a human. Everything up until the final push or send, it can be the assist."

Questions

Which tools do you use regularly, and which have you tried between Claude Cowork and OpenAI's Codex app?
When did you first try Claude Cowork, how often are you using it now, and what kind of work were you hoping it would take over for you?
Could you walk me through the last few things you delegated to Claude Cowork end to end—like, what you asked it to do, what inputs it used, and what came back?
On the payments-partner spreadsheet, can you walk me through the setup a little more concretely—like, did you connect Gmail and Slack directly to Cowork, specify the partner domains and channels in a prompt, and then ask it to create or update a Google Sheet? How much of that workflow did it actually execute versus you copying things between tools?
When you say it created the spreadsheet, did Cowork decide the columns and trend summaries itself, or did you give it a pretty explicit schema? And how do you review whether those trends are actually right?
Can you walk me through the last time you used Claude to make an actual product or code change—what were you trying to change, how did you work with Claude, and how did it get reviewed or shipped?
For that KYC modal change, can you walk me through the workflow step by step—from noticing the modal needs changing, to Claude in Cursor making edits, to tests and checks, to PR review?
When you open the PR for something like that, what does the code owner actually review—are they treating it like any engineer's PR, or are there extra guardrails because you're a non-engineer using Claude?
When the code owner reviews one of your Claude/Cursor PRs, what kinds of issues do they most often catch—is it logic, edge cases, security/compliance, product wording, or just style?
Can you give me a concrete example of an edge case an engineer caught on one of your PRs—what did Claude miss, and how did that change how you prompt or review the next time?
When you're working on changes like payout or KYC flows, how do you decide what's safe for you to ship with Claude versus what needs an engineer to own from the start?
Can you walk me through how you learned enough of the data models and codebase to do this safely—was that mostly pre-existing company knowledge, or did Claude help you build that understanding over time?
For someone at Whop who doesn't have that same technical fluency, what would need to change in Claude Cowork or the surrounding process for them to safely contribute to product or ops workflows the way you do?
For your ops workflows, like the partner-requests spreadsheet or onboarding summaries, what's the review process after Cowork hands work back? What do you check before you trust it?
Once that spot check passes and the workflow is running daily, have you had any cases where Cowork later drifted or silently missed something—like a new partner domain, a weird Slack thread, or a message format it didn't classify correctly?
Can you walk me through one time Cowork did drift or disappoint you? What was the task, what did it get wrong, and how did you correct it?
When that happens, do you usually continue in the same Cowork thread and correct it, or do you restart with a cleaner prompt? And what signals tell you "okay, this task is no longer trustworthy"?
When you use Claude Cowork versus regular Claude in the browser or Claude inside Cursor, what feels different? What kinds of tasks feel natural to hand to Cowork specifically?
Can you walk me through what one of those Cowork "projects" looks like in practice? Like, what connectors does it have, what prompts or recurring tasks live inside it, and who besides you ever sees the outputs?
For those instructions and context you've loaded in, what do you find yourself repeatedly teaching Cowork about Whop, your teams, or your workflows? And what context do you wish it just remembered or understood better across projects?
Can you give me a concrete example of those standing instructions? Like, what are the metrics or "this is good / this is bad" guidance you include before asking Cowork to do an ops task?
When Cowork produces something that other people at Whop will use—like the partner trends sheet or an onboarding summary—how does it move through the team? Do people know it came from Cowork, and do they edit, challenge, or rely on it differently because of that?
Where does the handoff still feel clunky? Are you ever copying, reformatting, or fixing something manually after Cowork creates it?
What's the clearest value you'd point to from Cowork today: hours saved, fewer dropped balls, faster response to partners, better team visibility, or something else?
Walk me through one tedious email workflow that Cowork now handles almost end to end. What triggers it, what does Cowork draft or send, and where do you still stay in the loop?
When you review those drafted emails, what are you mostly checking for: factual accuracy, tone, policy/compliance wording, whether it understood the partner's request, or something else?
Can you give me one recent example of a drafted email where you changed something before sending—what did Cowork get slightly wrong, and what did you edit?
How have you tried to teach Cowork your voice? Like, do you give it examples of past emails, explicit style rules, or do you mostly correct drafts as you go?
For the scheduled workflows, what would make you comfortable letting Cowork actually send emails or take actions without your final review? Or is there a category where you'd always want to stay in the loop?
What about internal actions? Are there workflows where you'd let Cowork take the final step automatically—like posting a Slack summary, updating a tracker, assigning follow-ups, or creating tickets?
Can you walk me through one workflow where Cowork does take the final action automatically—say, posting a Slack summary or updating a tracker? What's the trigger, what exactly does it post or update, and who relies on it?
For those metric posts, is Cowork mostly pulling from existing dashboards and summarizing, or is it doing any analysis—like calling out anomalies, explaining why a metric moved, or recommending follow-ups?
When Cowork flags an anomaly or recommends a follow-up, how do people verify it? Like, does someone click through to the underlying dashboard, ask Cowork for sources, or just rely on the Slack summary unless something looks off?
When those Slack threads take off, does Cowork ever participate again—like answering follow-up questions, pulling more context, or creating action items—or is its role mostly to generate the initial post?
How much of this is your own power-user setup versus a shared Whop operating practice? Like, are teams standardizing around Cowork workflows, or is everyone building their own agents and automations?
Can you walk me through one of those sharing rituals? Like, when someone builds a useful Cowork workflow, how do they show it to others, and what does it take for another person to actually adopt it?
What would need to change for Cowork to move from "everyone has their own powerful setup" to more of a shared team workflow—like templates, permissions, audit trails, shared context, or something else?
Looking forward a year, what kinds of work do you expect you'll be delegating to Cowork or Claude that you still handle manually today?
What's one repeatable workflow you still do manually today, where Cowork is almost good enough but not quite? What's the missing piece that keeps you from handing it off?
For those sensitive workflows, what role would you trust Cowork to play today? Like, gathering evidence, drafting a decision memo, flagging anomalies, preparing a checklist?
On willingness to pay, how do you think about the value of Cowork today? Like, if you had to justify it internally, what would you point to as the business case?

Interview

Which tools do you use regularly, and which have you tried between Claude Cowork and OpenAI's Codex app?

I've tried both, but I'm pretty religiously a user of Claude Cowork.

When did you first try Claude Cowork, how often are you using it now, and what kind of work were you hoping it would take over for you?

I've been using Claude for a few months now. I started seriously using Cowork about two months ago. For context on my role: I run our trust and safety and customer support teams. As our company moves into the payment space, we have a ton of different payment partners. The original hope was that any sort of tedious work—anything that would be done more than once—could just be taken over entirely by Claude Cowork. That was the initial intention. I started using it to eliminate email work and consolidate communications coming in from our many partners across external communication platforms—whether that's Gmail or Slack—using external integrations to pull in that context and respond accordingly.

It's grown from consolidating communication channels into something more. Now, especially in managing multiple teams across multiple parts of our organization, I use it essentially as a personal assistant. It's like a little EA that scans all of my communication channels—primarily Slack and email—and gives me reports throughout the day, along with daily action items. It serves as a personal EA to manage all of those different channels.

Could you walk me through the last few things you delegated to Claude Cowork end to end—like, what you asked it to do, what inputs it used, and what came back?

Sure. Thinking about our many external partners on the payment side, I wanted to know, for the past month, what types of requests we were getting across all of our different partners. Those requests come in from Slack and via email. I specified that any emails with a certain subject line, subject matter, or coming from specific email domains—x.com, et cetera—should be captured. I defined who would be sending the messages I cared about. Then I asked Cowork to create a spreadsheet that not only gave me quick summaries, dates, times, and other important information about each communication in list format, but also gave me high-level trends: what were we seeing across all those communications, what issues were coming up more frequently this past month versus other months. That spreadsheet is now updated daily at 8 AM automatically by Cowork without any intervention on my end.

The workflow before that one was specifically around new onboards who are in training. I don't want to manage Slack channels and read all the comms every day about what's going on in their onboarding channels. So I have Cowork give me a little announcement at the end of the day with what I need to know about what went on in those channels regarding their onboarding process.

On the payments-partner spreadsheet, can you walk me through the setup a little more concretely—like, did you connect Gmail and Slack directly to Cowork, specify the partner domains and channels in a prompt, and then ask it to create or update a Google Sheet? How much of that workflow did it actually execute versus you copying things between tools?

It executed a hundred percent. I had to iterate a few times to clearly define the parameters. What I did was have it test initially for about a week, and then I went in manually to make sure it was capturing a hundred percent of the emails or Slack messages I knew had come in during that week, just to make sure nothing was missing. I did have to iterate a bit to specify the exact keywords and sender email domains that signaled importance. But aside from that one small test to confirm my prompting was specific enough, I am a hundred percent hands off. I open up the completed spreadsheet whenever I feel like looking at it, and it's been updated once a day at 8 AM.
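
The capture rule and the one-week spot check described above can be sketched in a few lines of Python. This is a minimal illustration, not how Cowork works internally: the domains, keywords, message IDs, and the names `Message`, `is_partner_request`, and `missed_ids` are all hypothetical, since the interview does not disclose the real values.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical filter parameters; the real domains and keywords are not given.
PARTNER_DOMAINS = {"partner-a.example", "partner-b.example"}
SUBJECT_KEYWORDS = {"payout", "compliance", "chargeback"}


@dataclass(frozen=True)
class Message:
    msg_id: str
    sender: str
    subject: str


def is_partner_request(msg: Message) -> bool:
    """Capture a message if the sender domain or the subject line matches."""
    domain = msg.sender.rsplit("@", 1)[-1].lower()
    subject = msg.subject.lower()
    return domain in PARTNER_DOMAINS or any(k in subject for k in SUBJECT_KEYWORDS)


def missed_ids(captured: set[str], known: set[str]) -> set[str]:
    """Spot check: IDs the reviewer knows arrived but the workflow did not capture."""
    return known - captured


# One week of sample traffic (made up for illustration).
week = [
    Message("m1", "ops@partner-a.example", "Question about settlement timing"),
    Message("m2", "someone@other.example", "Payout delay for merchant"),
    Message("m3", "news@other.example", "Weekly newsletter"),
]
captured = {m.msg_id for m in week if is_partner_request(m)}
known = {"m1", "m2"}  # messages the reviewer manually confirmed came in

print(missed_ids(captured, known))  # an empty set means nothing was missed
```

If `missed_ids` returns anything, that is the signal to tighten the keywords or domains and rerun the one-week test before widening the window to a full month.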

When you say it created the spreadsheet, did Cowork decide the columns and trend summaries itself, or did you give it a pretty explicit schema? And how do you review whether those trends are actually right?

I always talk to my AI agents like I'm talking to a fourth grader—specific, clear, and concise. Every time I launch a prompt, I do a micro version first so I can, at a small scale, make sure it's acting correctly and picking up what I need it to pick up. That's why in the example I gave, I tested a week before having it pull back a full month.

In terms of the schema, I was very specific—I knew what I wanted. I said explicitly: I want these columns and rows. I gave it some artistic flexibility in terms of aesthetics, saying I wanted it to look beautiful and be something I could share with other members of the organization—not a messy, sloppy sheet. I wanted beautiful headers. At the top, I wanted a counter to track the number of operational emails, compliance-related emails, and so on. So it not only extracts the information but keeps a running counter at the top. Again, I asked for that very explicitly, but then it had the artistic flexibility to define what "beautiful" looked like.
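
As a rough sketch of the explicit schema described above, the running counter at the top of the sheet and the month-over-month trend view could be computed as follows. The column names, categories, and sample rows are assumptions made for illustration; the interview does not list the actual columns.

```python
from collections import Counter

# Hypothetical tracker rows; column names and values are illustrative only.
this_month = [
    {"date": "2025-02-03", "partner": "Partner A", "category": "operational"},
    {"date": "2025-02-10", "partner": "Partner B", "category": "operational"},
    {"date": "2025-02-12", "partner": "Partner A", "category": "compliance"},
]
last_month = [
    {"date": "2025-01-08", "partner": "Partner A", "category": "operational"},
    {"date": "2025-01-20", "partner": "Partner B", "category": "compliance"},
]


def category_counts(rows):
    """Running counter shown at the top of the sheet, one count per category."""
    return Counter(row["category"] for row in rows)


def rising_categories(current, previous):
    """Categories that came up more often this month than last month."""
    return {
        category: current[category] - previous[category]
        for category in current
        if current[category] > previous[category]
    }


current = category_counts(this_month)
previous = category_counts(last_month)
print(dict(current))
print(rising_categories(current, previous))
```

Keeping the counters as a plain function of the rows means the header can be recomputed on every daily update rather than edited by hand.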

Can you walk me through the last time you used Claude to make an actual product or code change—what were you trying to change, how did you work with Claude, and how did it get reviewed or shipped?

There was a before and after. I'm not technical traditionally—not an engineer by background. But I'm very familiar with my organization's data tables and data models. I know where our code lives, and I know the tables where certain pieces of information exist. While I don't have a formal engineering background, Claude—and this is Claude on top of Cursor more than Cowork—has enabled me to code properly. I can ship product just as any other engineer, because if you know the data tables and models, know the names of certain fields, and know exactly what to ask and where to ask it, you can ship most code. I've gotten feedback from our lead and founding engineers that some of my code is truly indistinguishable from that of classically trained engineers with many years of experience.

The last thing I shipped: I push to prod constantly—multiple PRs a day. The most recent was around a new KYC process. I didn't like any of the modals where users were notified if their KYC required an additional step or an additional upload, so I updated all the modals and warning modals for cases where more information was needed. I don't know how to code, but I know the name of the table that spits out someone's KYB or KYC status. I know how to ask for that information, and I can test using Claude—I can tell Claude where to look, what general formatting I need, and ask it to run certain checks automatically. Then Claude pushes it for review. The way we're set up, there are a few code owners who have to review the code. The code owner reviews it, and then it is shipped live.

For that KYC modal change, can you walk me through the workflow step by step—from noticing the modal needs changing, to Claude in Cursor making edits, to tests and checks, to PR review?

The time to ship is just unbelievable. I cannot imagine a world before Claude.

Since I run our trust and safety and customer support teams, I was seeing multiple complaints around people not understanding that they had an additional step still missing, or that they had uploaded an incorrect document and we were not notifying them properly that there was still a step to complete.

We're a very data-backed organization, so first I gathered the data and looked at a ton of different examples of where our product was going wrong. I'll do a quick, informal consult with designers on our team or other teams that interact with that part of the product, just to make sure everyone is aligned on the change I'm going to make.

Then it's really as simple as this: I work in Cursor using Claude on top of it. It's a couple-line prompt. I know our KYB information and KYB status for users is stored on a certain table, and there's currently a pop-up that says a certain thing. I ask Claude to point me to where that modal lives in the codebase. Once Claude takes me to the exact part of the codebase, I'm a hundred percent sure I'm looking at the existing modal for the KYB and KYC process. Then I prompt it: I don't like this about that part of the product—please change it to this. From there, I ask Claude to merge, push, and run the automatic checks we have in place. Whenever I push a PR and the automatic checks return some sort of issue or conflict, I just copy-paste that conflict back into Claude, and Claude is able to fix it for me almost every single time. Then the code owner reviews it, and as soon as it's approved, I merge. It can be done in an hour—something that before would have required adding it to the roadmap, getting someone from product, getting a designer involved, and then finding an engineer to spare time. The fact that I—the person most intimately connected to a certain problem—can now own that end-to-end process through Claude is genuinely transformative. There is a before and after.

When you open the PR for something like that, what does the code owner actually review—are they treating it like any engineer's PR, or are there extra guardrails because you're a non-engineer using Claude?

We have pretty conservative guardrails for everyone. Humans are as prone to errors as AI can be. One of the biggest problems I see with sloppy code is that you're just in the wrong place, asking the wrong question, or using the wrong data tables. We have a ton of checks and guardrails in place across the board. The code owner might review with a somewhat more conservative eye given that the PR will say it was made by Cursor, so it's known it wasn't a pure human engineer. But across the board, our engineers are very heavily relying on AI right now. So: some additional scrutiny, but no additional guardrails.

When the code owner reviews one of your Claude/Cursor PRs, what kinds of issues do they most often catch—is it logic, edge cases, security/compliance, product wording, or just style?

Edge cases. Someone who has lived in the codebase longer has probably seen or can think about those edge cases in a way that I can't, or Claude can't.

Can you give me a concrete example of an edge case an engineer caught on one of your PRs—what did Claude miss, and how did that change how you prompt or review the next time?

This is very niche, but we have a few payout statuses. The change was turning on or off one specific payout status but not the others—very particular to that case.

When you're working on changes like payout or KYC flows, how do you decide what's safe for you to ship with Claude versus what needs an engineer to own from the start?

It's a matter of understanding the data tables on the back end. If I have a very strong grasp on the actual models that are going to be impacted by a change, I'm very confident actioning that myself. If you're dealing with complex payment questions, security changes, compliance changes, money movements—anything more sensitive—I would not want to own that. But anything where I fully understand the data model and the tables behind the question, and I fully understand how the product appears to the user on the front end, I don't necessarily need an intricate understanding of the back-end code, and Claude can make it work.

Can you walk me through how you learned enough of the data models and codebase to do this safely—was that mostly pre-existing company knowledge, or did Claude help you build that understanding over time?

This is pre-existing company knowledge. We're a very engineering-focused organization. Anytime I would interact with engineers, I learned a lot. I was working for many years before AI became as significant as it is now, so I had time on my side to really drill into this. Anytime I would bring a problem to an engineer, they would say: well, what model are we talking about? What table is that on? It was almost like a parent training a child, but it really helped me understand how the product works. At any tech organization, it's so important for the whole organization to be technical in the sense that you understand the model that actually makes the organization work.

For someone at Whop who doesn't have that same technical fluency, what would need to change in Claude Cowork or the surrounding process for them to safely contribute to product or ops workflows the way you do?

Claude would have to mirror how humans interacted with me. If I asked a question and an engineer said, well, what table is that on?—I would have to go find out. So if Claude could nudge someone and ask: what are you asking about? What is the service or part of the product where it appears? Claude is capable of finding that code and that table and model on the back end. So it absolutely could help. I've just been lucky enough to skip that whole learning part.

For your ops workflows, like the partner-requests spreadsheet or onboarding summaries, what's the review process after Cowork hands work back? What do you check before you trust it?

I always do a spot check—a micro version of the macro problem to thoroughly verify the question was answered correctly. In that particular case, I'll manually and quickly go and make sure that a hundred percent of what I know exists was caught and added to the spreadsheet. I do a small spot check, and then iterate on the prompt if anything is incorrect or missing.

Once that spot check passes and the workflow is running daily, have you had any cases where Cowork later drifted or silently missed something—like a new partner domain, a weird Slack thread, or a message format it didn't classify correctly?

Not yet, but of course there are cases where that does happen. It's very important to have iterative loops—your AI is as good as the feedback you give it and as good as the prompt that initially created it. People who set and forget and never look back or never iterate are probably going to get into trouble. You can't turn off your brain. You need to think critically. I see Claude as an assist, not a replacement for my brain, my critical thinking, and my assessment of the quality of work it produces. The same way you would correct and guide a human assistant back on track, you adjust the prompt.

Can you walk me through one time Cowork did drift or disappoint you? What was the task, what did it get wrong, and how did you correct it?

In broad strokes, because I don't want to reveal any confidential details—this was a research project on certain businesses. I had one prompt where I was plugging in a ton of businesses and wanted Cowork to scour the internet for certain pieces of information. Over time, it would spit out dead links or inaccurate links—not what I was looking for. At the beginning, when I was literally prompting it business by business, a hundred percent of the links were accurate and exactly what I needed. But over time it would spiral into dead links and weird links. I had to refocus it: this is what we're actually trying to find—find me the link associated with this. And then it went back on track.

When that happens, do you usually continue in the same Cowork thread and correct it, or do you restart with a cleaner prompt? And what signals tell you "okay, this task is no longer trustworthy"?

I always restart—a new question with the original prompt. Because the original prompt was working at the beginning, and I had tested and iterated on it to make sure it was in a good place, I'll open a new window and run the same prompt fresh.

As for the signals: you need to think critically. In the AI age, it is not a replacement for human intelligence. You can tell very easily when someone is using AI in a way that reflects a lack of critical thinking on their part. AI used properly is just a supercharge of the human brain—it's literally as good as your initial prompting and reconfiguring of it. When people say AI is going to take jobs—well, not if you can harness it for your own benefit. That's why it becomes so critical to constantly quality-check every piece of work you produce with AI.

When you use Claude Cowork versus regular Claude in the browser or Claude inside Cursor, what feels different? What kinds of tasks feel natural to hand to Cowork specifically?

When I'm not directly operating in a codebase—any sort of operational work—they feel pretty distinct. I use Claude in Cursor purely for code changes. Cowork is everything else: operational. Cursor is my engineer assistant, and Cowork is my everything-else EA.

The Cowork interface is so well-organized. You can set a project and have multiple prompts within the same project, keeping everything organized. The data connectors are very clean—it's very clear what Cowork has access to. You can segment different projects in different spaces. It feels very clean.

Can you walk me through what one of those Cowork "projects" looks like in practice? Like, what connectors does it have, what prompts or recurring tasks live inside it, and who besides you ever sees the outputs?

The connectors I use most seriously are my Gmail connector and my Slack connector—really for those communication channels. I also have web search connected. Honestly, I have a lot pulled in. The ones I use most are web search and Slack. I have a lot of context and standing instructions loaded in as well.

For those instructions and context you've loaded in, what do you find yourself repeatedly teaching Cowork about Whop, your teams, or your workflows? And what context do you wish it just remembered or understood better across projects?

I always start with a prompt covering what the goal is and what primary metrics I'm driving. Those metrics aren't changing, so it would be great if Cowork could remember them—maybe it does. I still re-state them every time, hammering down: this is what I'm doing, these are my team's focus areas, these are the metrics I'm trying to hit and drive, this is what's good, this is what's bad. I do it every time, probably to reinforce that context layer. Maybe it already has it memorized, but I like to do it again because I talk to my AI like I'm talking to a fourth grader—instructions are exceptionally clear. Honestly, I don't mind if it retains every single piece of information, because every piece of information just creates a stronger, broader context layer that can be used for everything else.

Can you give me a concrete example of those standing instructions? Like, what are the metrics or "this is good / this is bad" guidance you include before asking Cowork to do an ops task?

Without sharing any confidential metrics—it's something like: our goal is to hit a certain target GMV in a certain category. That kind of framing. What's important is contextualizing a small task within why we care about it and what we're ultimately trying to achieve. A task doesn't exist in a vacuum. There's a reason you're doing it and an ultimate aim you're trying to reach. Contextualizing a small task within that larger purpose makes my prompts stronger.

When Cowork produces something that other people at Whop will use—like the partner trends sheet or an onboarding summary—how does it move through the team? Do people know it came from Cowork, and do they edit, challenge, or rely on it differently because of that?

We're very AI-friendly. Nobody is scared. Everybody challenges everyone's work constantly—that's just part of our ethos and culture. The output will live in a spreadsheet that gets shared, an email that gets sent, or a Slack message. Any of the connector channels mean that information coming from Claude or Cowork ultimately lives in those channels, which are then shared. Comms get sent via email. Trackers go via Google Sheets. Comms go via Slack.

Where does the handoff still feel clunky? Are you ever copying, reformatting, or fixing something manually after Cowork creates it?

No, because I prompt it until it is correct. Then I never have to do it again. Maybe I have to reprompt once, but going forward I'm not copying, pasting, or reformatting any of that.

What's the clearest value you'd point to from Cowork today: hours saved, fewer dropped balls, faster response to partners, better team visibility, or something else?

Hours saved—but the reason hours saved matters is that it allows me to do higher-leverage work. The amount of time spent responding to a tedious email is something I never want to deal with again if it can be handled automatically. Hours saved matters for the leverage it provides in terms of what you're doing instead. No more tedious email work. Anything that has to be repeated as a process is just gone.

Walk me through one tedious email workflow that Cowork now handles almost end to end. What triggers it, what does Cowork draft or send, and where do you still stay in the loop?

When an email comes in, it gets pulled into a tracker, a draft response gets created, and that draft gets saved. I go in to review it, make sure it's correct, and then send it.

When you review those drafted emails, what are you mostly checking for: factual accuracy, tone, policy/compliance wording, whether it understood the partner's request, or something else?

All of the above.

Can you give me one recent example of a drafted email where you changed something before sending—what did Cowork get slightly wrong, and what did you edit?

Tone, mostly. I'm very particular about my voice, and I've tried to prompt my AIs to sound more like me, but it's tricky. I have a very distinct writing style. It's very easy to tell when an AI has tried to autocorrect—words like "make sure to ensure," or just small semantic changes that don't sound like me. You're going to have everybody typing and sounding the same if you completely lose the human voice. Voice probably matters least in the grand scheme, but it matters to the integrity of what I'm sending.

How have you tried to teach Cowork your voice? Like, do you give it examples of past emails, explicit style rules, or do you mostly correct drafts as you go?

I correct in real time. I haven't fed it a ton of my writing—that's interesting, and I'll consider it. I just haven't had the time to sit down and be intentional about it. But I correct in real time. If an emoji gets used—I hate emojis, my style does not involve them—or if wording gets used that I would never use, I'm pretty critical in my feedback: never use that word again, never use that emoji, I would never structure a sentence like this. I give that feedback in real time.

For the scheduled workflows, what would make you comfortable letting Cowork actually send emails or take actions without your final review? Or is there a category where you'd always want to stay in the loop?

For any external emails going to an external partner, the stakes are inherently higher. I would always want to be in the loop.

What about internal actions? Are there workflows where you'd let Cowork take the final step automatically—like posting a Slack summary, updating a tracker, assigning follow-ups, or creating tickets?

Absolutely, and it already does all of that—aside from maybe creating tickets. We don't have full AI ticket creation; that gets into the customer support side of things, which uses a different toolset.

Can you walk me through one workflow where Cowork does take the final action automatically—say, posting a Slack summary or updating a tracker? What's the trigger, what exactly does it post or update, and who relies on it?

We have a lot of what we call threads in our streams: Slack channels that receive automatic posts every day. We'll have a thread around metrics, covering the metrics we care about, posted every single day so we know where they stand. A lot of people across the organization use Claude this way, so we definitely have those posted automatically in our Slack streams.

For those metric posts, is Cowork mostly pulling from existing dashboards and summarizing, or is it doing any analysis—like calling out anomalies, explaining why a metric moved, or recommending follow-ups?

All of the above. It's pulling the numbers, but also giving summaries, analysis, calling out anomalies, spikes, and anything else worth noting—in a sentence, what are we looking at and what should we care about.
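The interview does not say how anomalies are detected. Purely as an illustration of a one-line metric summary with an anomaly callout, the sketch below compares today's value with a trailing average and flags large deviations. The function name `summarize_metric`, the 25% threshold, and the wording of the output line are all assumptions made for this example.

```python
from statistics import mean


def summarize_metric(name: str, history: list[float], today: float,
                     threshold: float = 0.25) -> tuple[str, bool]:
    """Return a one-line summary and whether it should be flagged.

    Assumption: a value counts as an "anomaly" when it differs from the
    trailing average by at least `threshold` (25% by default).
    """
    baseline = mean(history)
    change = (today - baseline) / baseline if baseline else 0.0
    flagged = abs(change) >= threshold
    line = f"{name}: {today:g} ({change:+.0%} vs. trailing average)"
    if flagged:
        line += " [anomaly: worth a look]"
    return line, flagged
```

A daily post could run this for each tracked metric and put the flagged lines first, leaving the interpretation to the discussion in the thread.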

When Cowork flags an anomaly or recommends a follow-up, how do people verify it? Like, does someone click through to the underlying dashboard, ask Cowork for sources, or just rely on the Slack summary unless something looks off?

The post is just an entry point, and we're looking at these metrics every single day. From there, it lives in a Slack channel, so the thread takes off—there's tons of lively discussion and debate around those automatic posts.

When those Slack threads take off, does Cowork ever participate again—like answering follow-up questions, pulling more context, or creating action items—or is its role mostly to generate the initial post?

Mostly the initial post, with the caveat that there's some analysis on top of it—"look at this, do this"—but it's not a definitive action item that gets posted somewhere else.

How much of this is your own power-user setup versus a shared Whop operating practice? Like, are teams standardizing around Cowork workflows, or is everyone building their own agents and automations?

AI is changing so much—every week there's a new model or something exciting that drops. Everyone is racing to keep up, and we have sharing rituals and practices, but there's no standardized SOP for how to use it across the organization. That's actually good right now, because so much is changing, and so much learning comes from that collaborative but individually curious building process. Then you say: I did this, it's cool and it's working for me—let me help you set it up the same way. There's no standard procedure, but everyone is using it in their own way, and there's a lot of knowledge sharing around what everyone is doing.

Can you walk me through one of those sharing rituals? Like, when someone builds a useful Cowork workflow, how do they show it to others, and what does it take for another person to actually adopt it?

We move very quickly. Our rituals include standard meeting cadences—a morning standup and an end-of-week all-hands. Those are the spaces where we share what we're doing. Everyone is working toward the same metrics, so if someone has a tool, setup, or Cowork prompt that's working very well, it gets shared in those spaces.

What would need to change for Cowork to move from "everyone has their own powerful setup" to more of a shared team workflow—like templates, permissions, audit trails, shared context, or something else?

We'd just need to feel confident that we have the right way to do it. Before you can make something the universal setup, you need to make sure it is the best universal setup, and we're just not there yet. Honestly, no company should be there yet, because so much is changing. Unless you have a full-time team dedicated to creating that perfect optimal setup, you should let people iterate.

Looking forward a year, what kinds of work do you expect you'll be delegating to Cowork or Claude that you still handle manually today?

I'm almost scared to predict, because thinking about where we were a year ago versus now, we're just going to be in an infinitely more exciting place. Anything that has to be repeated—the same process done more than once—give it to Cowork.

What's one repeatable workflow you still do manually today, where Cowork is almost good enough but not quite? What's the missing piece that keeps you from handing it off?

On the compliance side—more sensitive workflows. Anything around security, compliance, or money movements. Those are my big ones where I don't trust handing off fully to an AI, because the consequences are very serious. You want some level of human accountability, because at the end of the day you need someone to point a finger at, and you can't point a finger at an AI. Those more sensitive areas need to be fully owned by a human.

For those sensitive workflows, what role would you trust Cowork to play today? Like, gathering evidence, drafting a decision memo, flagging anomalies, preparing a checklist?

Everything up until the final push or send. Up until that final decision point, it can be the assist. But before whatever the finality is for that specific situation, a human has to be in the loop.
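The boundary described here (internal, low-stakes actions run automatically; external emails and anything touching security, compliance, or money movement wait for a human) can be sketched as a simple gate. This is an illustrative sketch only: `Action`, `requires_human`, `run`, and the area labels are hypothetical names, not part of Cowork or any Whop system.

```python
from dataclasses import dataclass
from typing import Callable

# Areas the interviewee says must stay fully owned by a human.
SENSITIVE_AREAS = frozenset({"security", "compliance", "money_movement"})


@dataclass(frozen=True)
class Action:
    name: str
    area: str        # e.g. "reporting", "compliance" (labels are assumptions)
    external: bool   # True if it reaches someone outside the company


def requires_human(action: Action) -> bool:
    """External sends and sensitive areas always need a human decision."""
    return action.external or action.area in SENSITIVE_AREAS


def run(action: Action, execute: Callable[[Action], None],
        human_approved: bool = False) -> str:
    """Execute automatically when allowed; otherwise hold for review."""
    if requires_human(action) and not human_approved:
        return "held_for_review"
    execute(action)
    return "executed"
```

Under this sketch, the assistant can prepare everything up to the final step, but the "final push or send" for a gated action only happens once `human_approved` is set by a person.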

On willingness to pay, how do you think about the value of Cowork today? Like, if you had to justify it internally, what would you point to as the business case?

The ROI is very easy to illustrate. You would pay a human to do this work. Hours saved is a very justifiable, easy ROI.

Disclaimers

This transcript is for information purposes only and does not constitute advice of any type or trade recommendation and should not form the basis of any investment decision. Sacra accepts no liability for the transcript or for any errors, omissions or inaccuracies in respect of it. The views of the experts expressed in the transcript are those of the experts and they are not endorsed by, nor do they represent the opinion of Sacra. Sacra reserves all copyright and intellectual property rights in the transcript. Any modification, copying, displaying, distributing, transmitting, publishing, licensing, creating derivative works from, or selling any transcript is strictly prohibited.
