Voltage Park customer at robotics company on GPU pricing and robotics computing needs

Background
We spoke with a customer from a robotics company who uses Voltage Park alongside their own GPU infrastructure for quantum mechanics and density functional theory workloads.
The conversation explores how they evaluate GPU cloud providers primarily on price and reliability, with insights on why specialized providers often outperform hyperscalers for GPU availability despite minimal differentiation in the market.
Key points via Sacra AI:
- Voltage Park vs. hyperscalers: GPU cloud market is commoditized and price-driven with low switching costs. "I think there's barely any differentiation. In a past job, we saw a particular company we were working with literally shop around to different providers based on the best discount they could get. As soon as it ended, they moved to the next place. I don't think there's really a lot of differentiation, to be honest."
- For specialized robotics workloads running quantum mechanics models, infrastructure-as-a-service beats higher-level platforms that lack flexibility. "We're using Voltage Park in a very infrastructure-as-a-service way. We provision our own clusters. Everything that we're doing with GPUs—we don't really want the high-level support that something like Fireworks is doing because we have pretty unique needs in the robotics space, and those companies don't really have solutions for us."
- Cloud-to-owned infrastructure calculation tips at high monthly spend, but maintenance burden remains significant. "I think when the budget is being exceeded—when you're hitting essentially hundreds of thousands or millions of dollars a month. At that point, you should do the calculation of having your own GPUs as it's going to be more cost-effective. However, there is a pretty large maintenance burden for that, and energy bill."
Questions
- Can you tell me which GPU providers you primarily work with today?
- What kinds of workloads are you typically running across Voltage Park and your own infrastructure? And roughly, what's your monthly GPU spend or usage footprint?
- When it comes to cloud GPU usage, do you think about training versus inference? Are you primarily training models or using cloud GPUs for inference too? And how does that affect your decision on provider?
- When you're thinking about a provider like Voltage Park versus inference-first platforms like Fireworks AI or Together AI, do you see those as competitive or complementary? Could you imagine them converging over time?
- Can you talk a little more about what made Voltage Park stand out for your needs? What were the key factors that differentiated them from other options?
- Which other providers did you consider, and what made Voltage Park come out on top in the decision?
- Do you see this market as commoditized and driven primarily by price, or do you think there's more differentiation happening in practice?
- In that context, what do you think makes a provider like Voltage Park successful at retaining customers? If GPU hours are essentially fungible, are there forms of value they could provide beyond raw compute that might reduce churn or improve retention?
- How much does ease of use—things like dashboard, provisioning UX, or API integration—actually factor into your choice of provider?
- How easy or difficult is it for you to switch between different GPU platforms? Once you're already set up on one like Voltage Park, what's the switching cost like in practice for your team?
- Do you think there's anything Voltage Park or other providers could do on the software or managed services side to make switching less attractive or improve lock-in?
- There's a perception that Voltage Park tends to focus more on long-tail developers and early-stage companies, where someone like Lambda gears more toward established enterprise customers. Do you think that distinction holds up based on your experience?
- Have you ever encountered any capacity issues with Voltage Park?
- What would you say is the biggest reason someone would choose a specialized GPU provider over a hyperscaler?
- What about pricing flexibility? Like not having to commit to long-term contracts or being able to scale usage up and down. Does that factor in for you?
- Are there specific types of companies or teams that benefit most from specialized GPU cloud providers like Voltage Park, and who might be better off sticking with a big platform like AWS?
- How do you personally think about categorizing or segmenting the different players in this GPU cloud landscape?
- From your vantage point, what's one big misconception you've come across in the cloud GPU market?
- What do you see as the biggest risk or challenge facing cloud GPU providers as a category right now?
- Looking ahead 3 to 5 years, how do you see this space evolving? Are there any key shifts in hardware, pricing, or usage models that you expect to see?
- Is there anything we haven't touched on that you think is important to understand about how teams like yours evaluate and use cloud GPU infrastructure? Or anything you'd tell a founder building in the space?
- Have you ever hit any unexpected challenges or limitations using cloud GPU services?
- Do you or teams you've worked with ever follow a multi-provider strategy intentionally, or are you generally focused on one platform at a time?
- If you ever hit a scale point where you're considering building your own infrastructure again, is there a threshold at which it makes sense to move off cloud GPUs entirely? What might trigger that?
- Is there any player or emerging company in the GPU cloud space that you're particularly watching or find intriguing right now?
- If Voltage Park or a similar provider did want to move further upmarket and start courting larger enterprise customers, is there anything specific you think they'd need to offer to really win over security-minded or traditional IT buyers?
- You mentioned earlier that your team is primarily focused on infrastructure as a service and that higher-level platforms aren't a good fit due to your custom setups and robotics workloads. If you were speaking directly with the product team at Voltage Park, and they asked what one improvement would create more value for a team like yours, what would you tell them?
- On the downtime front, have you experienced much with Voltage Park?
- Based on your experience with reserved pricing, is there anything GPU cloud providers could do to make long-term reservations more attractive or less risky for customers?
- If you had to explain to someone new to GPU cloud infrastructure what the single biggest driver of growth in this market is right now, what would you say it is?
- You mentioned you're running inference and training on your own infrastructure and through Voltage Park. Have you done any kind of GPU utilization benchmarking or performance tuning across different environments?
- Based on that, is your team generally more focused on developer velocity than pure efficiency per GPU hour?
- Are there any adjacent services or tools you wish existed that would make that developer velocity easier?
- How would you ideally want to see such a tool integrated?
- Are you training mostly vision models, reinforcement learning, SLAM systems, or what categories dominate your GPU usage?
- Are there any special GPU needs or constraints that DFT imposes? And how well do GPU cloud providers support those today?
- Have you found that most GPU cloud providers, like Voltage Park, offer GPUs that meet those FP64 performance needs reliably?
Interview
Can you tell me which GPU providers you primarily work with today?
We are working with Voltage Park but we also maintain our own infrastructure of GPUs.
What kinds of workloads are you typically running across Voltage Park and your own infrastructure? And roughly, what's your monthly GPU spend or usage footprint?
In terms of scale, we're using several A100s in a cluster to do high-performance computing within the robotics space.
When it comes to cloud GPU usage, do you think about training versus inference? Are you primarily training models or using cloud GPUs for inference too? And how does that affect your decision on provider?
We are doing both. We use our GPUs for training and inference. It's easier just to use one provider.
When you're thinking about a provider like Voltage Park versus inference-first platforms like Fireworks AI or Together AI, do you see those as competitive or complementary? Could you imagine them converging over time?
I imagine them converging. However, we're using Voltage Park in a very infrastructure-as-a-service way. We provision our own clusters. Everything that we're doing with GPUs—we don't really want the high-level support that something like Fireworks is doing because we have pretty unique needs in the robotics space, and those companies don't really have solutions for us.
Can you talk a little more about what made Voltage Park stand out for your needs? What were the key factors that differentiated them from other options?
We purely went into this based on price and reliability.
Which other providers did you consider, and what made Voltage Park come out on top in the decision?
I don't remember the other vendors, but Voltage Park had the best deal.
Do you see this market as commoditized and driven primarily by price, or do you think there's more differentiation happening in practice?
I think there's barely any differentiation. In a past job, we saw a particular company we were working with literally shop around to different providers based on the best discount they could get. As soon as it ended, they moved to the next place. I don't think there's really a lot of differentiation, to be honest.
In that context, what do you think makes a provider like Voltage Park successful at retaining customers? If GPU hours are essentially fungible, are there forms of value they could provide beyond raw compute that might reduce churn or improve retention?
The only thing I can think of is making sure that they have newer GPU models and architecture designs. Beyond that, really having reservations of GPUs for a dedicated amount of time and guaranteeing those prices. And then if those prices go down, pass those savings along to the customers.
How much does ease of use—things like dashboard, provisioning UX, or API integration—actually factor into your choice of provider?
It's definitely a factor in terms of being able to just get machines up and running. But once we have machines running, we're installing all of our own software on there. So it's not really much of a problem. There needs to be some sort of API, especially when we have to scale out. But beyond that, it doesn't have to be that great. We're not doing anything special.
How easy or difficult is it for you to switch between different GPU platforms? Once you're already set up on one like Voltage Park, what's the switching cost like in practice for your team?
The switching cost is low. It would take us a matter of a day or two.
Do you think there's anything Voltage Park or other providers could do on the software or managed services side to make switching less attractive or improve lock-in?
I'm not sure why anyone would ever want to have lock-in be improved. We want less lock-in.
There's a perception that Voltage Park tends to focus more on long-tail developers and early-stage companies, where someone like Lambda gears more toward established enterprise customers. Do you think that distinction holds up based on your experience?
No. I don't think so. It's just another company that has infrastructure. As long as they have inventory available, you could use it. I was at a company that considered it, and I believe someone I know at a larger company used it too. It's just a matter of what kind of cost you can get out of them. People shop around. There's not a lot of loyalty to something like Lambda Labs. They're all pretty much fungible.
Have you ever encountered any capacity issues with Voltage Park?
No. We've never had that.
What would you say is the biggest reason someone would choose a specialized GPU provider over a hyperscaler?
Price.
What about pricing flexibility? Like not having to commit to long-term contracts or being able to scale usage up and down. Does that factor in for you?
The contracts matter. If we can get a lower cost over the year, we will definitely accept some performance loss from staying on an earlier-generation GPU when newer ones come out.
Are there specific types of companies or teams that benefit most from specialized GPU cloud providers like Voltage Park, and who might be better off sticking with a big platform like AWS?
My experience with at least GCP, not AWS, is that they have plenty of capacity issues. The region that you're in is a problem. In fact, being on a hyperscaler actually increases your chances of not having a GPU available because of how much popularity there is around wanting to get these GPUs. Voltage Park is solely focused on the problem of GPU availability. This isn't a core focus for the hyperscalers right now, in my experience.
How do you personally think about categorizing or segmenting the different players in this GPU cloud landscape?
There are the ones that are doing things like the normal cloud providers—IaaS, PaaS, and SaaS as well, like all the different Azure services. But as far as just providing us the infrastructure, that's all we need. Lambda Labs, Fireworks—those companies go up to a much higher level, and we don't want that. It costs so much money and provides no flexibility for us. We're not a typical company running LLMs.
From your vantage point, what's one big misconception you've come across in the cloud GPU market?
A misconception that comes up a lot is that you need the newest and greatest all the time. In reality, you can use some of the older GPUs that keep getting cheaper, and you can get just fine performance, especially for large-scale batch tests.
What do you see as the biggest risk or challenge facing cloud GPU providers as a category right now?
Energy usage and rising energy prices. It's a looming concern: with the multiple data centers being created and the switch to non-renewable energy, prices will increase over the next five years.
Looking ahead 3 to 5 years, how do you see this space evolving? Are there any key shifts in hardware, pricing, or usage models that you expect to see?
I think certain types of models will become closed source and only run on certain GPUs. And these GPUs are always going to be NVIDIA. No matter who the provider is, I think they're going to try to build lock-in such that only certain GPUs are available on certain cloud providers, and you have to use their highest-level services to get the GPUs so that you're no longer running your own thing.
Is there anything we haven't touched on that you think is important to understand about how teams like yours evaluate and use cloud GPU infrastructure? Or anything you'd tell a founder building in the space?
I would tell a founder building in this space that if they can purchase older GPUs and host them themselves, they probably won't use as much energy and will be able to get a lot of performance per dollar out of it. And they can also book it as a depreciating asset, which helps on the CapEx side.
Have you ever hit any unexpected challenges or limitations using cloud GPU services?
It's been capacity issues on a cloud provider—one of the big hyperscalers. We had to use older GPUs or reservations to secure more capacity, and we also undertook a project to see if we could use GPUs in multiple regions, which required rearchitecting our software.
Do you or teams you've worked with ever follow a multi-provider strategy intentionally, or are you generally focused on one platform at a time?
Generally, one platform. Going with multiple platforms is a pretty huge cost.
If you ever hit a scale point where you're considering building your own infrastructure again, is there a threshold at which it makes sense to move off cloud GPUs entirely? What might trigger that?
I think when the budget is being exceeded—when you're hitting essentially hundreds of thousands or millions of dollars a month. At that point, you should do the calculation of having your own GPUs as it's going to be more cost-effective. However, there is a pretty large maintenance burden for that, and energy bill.
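The break-even calculation described above can be sketched in a few lines. All figures here—GPU price, amortization period, power draw, electricity rate, maintenance cost, cloud hourly rate—are illustrative placeholders, not numbers from the interview:

```python
# Hypothetical cloud-vs-owned break-even sketch. Every number below is an
# illustrative assumption, not a quoted figure.

def monthly_owned_cost(num_gpus, gpu_price=15_000, amort_months=36,
                       power_kw_per_gpu=0.7, price_per_kwh=0.15,
                       monthly_maintenance=20_000):
    """Amortized hardware + energy + fixed maintenance burden per month."""
    hardware = num_gpus * gpu_price / amort_months
    energy = num_gpus * power_kw_per_gpu * 24 * 30 * price_per_kwh
    return hardware + energy + monthly_maintenance

def monthly_cloud_cost(num_gpus, rate_per_gpu_hour=2.00, utilization=1.0):
    """On-demand cloud cost at an assumed per-GPU-hour rate."""
    return num_gpus * rate_per_gpu_hour * 24 * 30 * utilization

for n in (8, 64, 256):
    owned = monthly_owned_cost(n)
    cloud = monthly_cloud_cost(n)
    print(f"{n:>4} GPUs: owned ${owned:>10,.0f}/mo  vs  cloud ${cloud:>10,.0f}/mo")
```

With these placeholder numbers, the fixed maintenance burden makes cloud cheaper for a small fleet, while ownership wins once monthly cloud spend reaches the hundreds of thousands—consistent with the threshold described in the answer.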
Is there any player or emerging company in the GPU cloud space that you're particularly watching or find intriguing right now?
No, there's no one I can think of. I don't really keep track of the rest of the space. Using Voltage Park has been pretty great and we'll keep using it until we hit too much expenditure.
If Voltage Park or a similar provider did want to move further upmarket and start courting larger enterprise customers, is there anything specific you think they'd need to offer to really win over security-minded or traditional IT buyers?
I don't think managed services, but things around SLAs and reliability—what happens when there's an outage, continuity plans, all that kind of stuff. Especially knowing what happens around security: who has access to that physical GPU. There are probably a lot more of the typical certifications you'd expect from enterprise providers, but I don't think that'd be too hard for them to do.
You mentioned earlier that your team is primarily focused on infrastructure as a service and that higher-level platforms aren't a good fit due to your custom setups and robotics workloads. If you were speaking directly with the product team at Voltage Park, and they asked what one improvement would create more value for a team like yours, what would you tell them?
Cheaper prices. Less downtime.
On the downtime front, have you experienced much with Voltage Park?
We've only had one or two instances, but nothing serious.
Based on your experience with reserved pricing, is there anything GPU cloud providers could do to make long-term reservations more attractive or less risky for customers?
I think they could offer ways out of the reserve pricing such that you can get it prorated if you want to switch to a higher-class GPU. That would be nice.
If you had to explain to someone new to GPU cloud infrastructure what the single biggest driver of growth in this market is right now, what would you say it is?
Probably the hype around AI and maybe VC money. That's by far the biggest.
You mentioned you're running inference and training on your own infrastructure and through Voltage Park. Have you done any kind of GPU utilization benchmarking or performance tuning across different environments?
No, I haven't really hit that level of utilization that would make that important. Those seem like optimizations that aren't really worth it versus spending a little bit more money for more capacity.
Based on that, is your team generally more focused on developer velocity than pure efficiency per GPU hour?
We're focused on developer velocity. We're a pretty small company, so we need to actually get this done so we can move on as a company. We can't spend that much time on optimization when the models we're using change almost daily.
Are there any adjacent services or tools you wish existed that would make that developer velocity easier?
I think better understanding of utilization of GPU and what's running on top of the GPU so we can optimize would be extremely helpful. In the same way that Intel's profiling tool for CPUs works on various architectures, it'd be really nice if you could have something like that for the GPU. That would really help profile and understand what's going slowly and what you're not utilizing well. It exists, but I haven't seen anyone talk about those tools.
How would you ideally want to see such a tool integrated?
I would see it integrated into my platform in a continuous way. I want to be able to see profiling over time, and I want it to be super low overhead.
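The tool described in these two answers—continuous, low-overhead utilization profiling—can be sketched generically. This is a hypothetical illustration, not an existing product feature: a background thread polls a probe callable on a fixed interval and keeps a time series; on real hardware the probe would query something like NVML's per-device utilization counters instead of the stand-in used here.

```python
# Minimal sketch of a continuous, low-overhead utilization sampler.
# `probe` is any zero-argument callable returning a utilization number;
# on a real GPU node it would wrap an NVML query (hypothetical swap-in).

import threading
import time

class UtilizationSampler:
    def __init__(self, probe, interval_s=1.0):
        self.probe = probe            # callable returning a utilization value
        self.interval_s = interval_s
        self.samples = []             # list of (timestamp, value) pairs
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        # Poll until stopped; Event.wait doubles as an interruptible sleep.
        while not self._stop.is_set():
            self.samples.append((time.time(), self.probe()))
            self._stop.wait(self.interval_s)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()

    def mean(self):
        return sum(v for _, v in self.samples) / len(self.samples)
```

Usage would be `s = UtilizationSampler(probe=my_gpu_probe, interval_s=1.0); s.start()`, with `s.samples` giving the profile over time. Keeping the probe on a background thread with a coarse interval is what makes the overhead negligible relative to the workload.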
Are you training mostly vision models, reinforcement learning, SLAM systems, or what categories dominate your GPU usage?
Density functional theory related to quantum mechanics.
Are there any special GPU needs or constraints that DFT imposes? And how well do GPU cloud providers support those today?
Floating point precision is extremely important.
Have you found that most GPU cloud providers, like Voltage Park, offer GPUs that meet those FP64 performance needs reliably?
It's been pretty reliable. We haven't really had many problems.
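The FP64 point above can be made concrete with a small, DFT-agnostic illustration. Python floats are IEEE double precision; round-tripping through `struct` rounds a value to single precision, letting the two resolutions be compared directly:

```python
# Why FP64 matters for numerically sensitive work: single precision
# resolves far fewer digits than double. Illustrative only.

import struct

def to_fp32(x):
    """Round a Python float (FP64) to the nearest FP32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def machine_eps(round_fn):
    """Smallest power-of-two eps such that round_fn(1 + eps) differs from 1."""
    eps = 1.0
    while round_fn(1.0 + eps / 2) != 1.0:
        eps /= 2
    return eps

print(machine_eps(to_fp32))         # 2**-23, about 1.19e-07: FP32
print(machine_eps(lambda x: x))     # 2**-52, about 2.22e-16: FP64
print(to_fp32(1.0 + 1e-8) == 1.0)   # True: a 1e-8 perturbation vanishes in FP32
```

A perturbation of 1e-8 survives in double precision but rounds away entirely in single precision—the kind of loss that compounds across the long accumulations typical of scientific workloads, which is why FP64 throughput (not just raw FLOPS) matters when picking GPUs for this class of problem.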
Disclaimers
This transcript is for information purposes only and does not constitute advice of any type or trade recommendation and should not form the basis of any investment decision. Sacra accepts no liability for the transcript or for any errors, omissions or inaccuracies in respect of it. The views of the experts expressed in the transcript are those of the experts and they are not endorsed by, nor do they represent the opinion of Sacra. Sacra reserves all copyright, intellectual property rights in the transcript. Any modification, copying, displaying, distributing, transmitting, publishing, licensing, creating derivative works from, or selling any transcript is strictly prohibited.