Revenue: $3.40M (2023)
Valuation: $3.60B (2023)
Funding: $901.13M (2024)
Revenue
Groq has a significant cost advantage in silicon/chip manufacturing compared to Nvidia.
Groq's wafers are estimated to cost under $6,000 on a mature 14nm process, while Nvidia's H100 wafers on 5nm run roughly $16,000, and each H100 additionally requires expensive HBM memory.
Groq's simpler architecture and lack of external memory give it a raw silicon BOM cost per token roughly 70% lower than that of a latency-optimized Nvidia system.
However, Groq requires vastly more chips (576 vs. 8) to serve a single model and incurs much higher system-level costs (personnel, power, rack space) to deliver a complete inference deployment. Once these full costs are accounted for, Groq's TCO advantage for latency-optimized inference narrows to roughly 40%, as the sketch below illustrates.
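As a rough sketch of how those two numbers fit together, the back-of-the-envelope arithmetic below shows how a ~70% silicon-only advantage can compress to ~40% once system-level costs are layered in. Every dollar figure and throughput number is a hypothetical placeholder chosen to land near the ratios above; none are actual Groq or Nvidia prices.

```python
# Hypothetical TCO-per-token comparison. All "assumed" figures are
# illustrative placeholders, not real Groq or Nvidia pricing.
SECONDS_PER_YEAR = 365 * 24 * 3600
AMORTIZATION_YEARS = 4  # assumed useful life of the hardware

def cost_per_million_tokens(silicon_capex, annual_opex, tokens_per_sec):
    """Amortized silicon capex plus opex, spread over lifetime token output."""
    lifetime_tokens = tokens_per_sec * SECONDS_PER_YEAR * AMORTIZATION_YEARS
    lifetime_cost = silicon_capex + annual_opex * AMORTIZATION_YEARS
    return lifetime_cost / lifetime_tokens * 1e6

# Groq: 576 cheap 14nm chips with no HBM, but a much larger system to run.
groq_capex = 576 * 500            # assumed $500 silicon BOM per chip
groq = cost_per_million_tokens(
    silicon_capex=groq_capex,
    annual_opex=260_000,          # assumed power/rack/personnel for 576 chips
    tokens_per_sec=2_500,         # assumed latency-optimized throughput
)

# Nvidia: 8 expensive 5nm H100s with HBM, in a compact server.
nvidia_capex = 8 * 25_000         # assumed $25k silicon + HBM BOM per GPU
nvidia = cost_per_million_tokens(
    silicon_capex=nvidia_capex,
    annual_opex=60_000,           # assumed power/rack/personnel for 8 GPUs
    tokens_per_sec=500,           # assumed latency-optimized throughput
)

silicon_only = 1 - (groq_capex / 2_500) / (nvidia_capex / 500)
print(f"silicon-only cost advantage: {silicon_only:.0%}")       # ~71%
print(f"full-TCO cost advantage:     {1 - groq / nvidia:.0%}")  # ~40%
```

The point of the exercise is the shape, not the specific numbers: Groq's silicon capex per token is far lower, but amortizing the operating burden of a 576-chip deployment eats more than half of that lead.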
As Groq moves to 4nm chips in 2025, their cost advantage may widen further. But they will need significant capital to fund that development and production, while Nvidia is moving quickly on Blackwell, the H100's successor.
Valuation
Groq is valued at $2.8B as of its $640M funding round in August 2024, led by funds and accounts managed by BlackRock Inc., with participation from Cisco Systems Inc.'s investment arm.
Groq has raised over $1B in total funding. The August 2024 round also drew major investors including Samsung Electronics' venture arm, while earlier rounds were backed by Tiger Global Management and D1 Capital Partners.
Product

Groq is a silicon company founded in 2016 by Jonathan Ross, the lead architect behind Google's Tensor Processing Unit (TPU).
Their core product is the Groq Tensor Streaming Processor (TSP), now marketed as the Language Processing Unit (LPU): a custom AI accelerator chip designed specifically for high-performance, low-latency inference on large language models and other AI workloads.
Groq's chips deliver market-leading inference performance of 500-700 tokens per second on large language models, roughly a 5-10x improvement over Nvidia's latest data center GPUs.
Groq provides access to its TSP infrastructure through GroqCloud, a cloud service that allows developers to run large language models like Llama at unprecedented speeds via an API.
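For a sense of what this looks like from the developer side, the snippet below queries a Llama model on GroqCloud through Groq's Python SDK (which follows the OpenAI chat-completions shape) and estimates tokens per second from the response. The model ID is an assumption that may not match GroqCloud's current catalog, and the wall-clock timing includes network round-trip, so it understates the raw decode speed the figures above refer to.

```python
# Minimal sketch of a GroqCloud chat completion with a rough throughput
# estimate. Assumes `pip install groq` and a GROQ_API_KEY environment
# variable; the model ID is illustrative and may differ from what
# GroqCloud currently serves.
import time
from groq import Groq

client = Groq()  # reads GROQ_API_KEY from the environment

start = time.perf_counter()
response = client.chat.completions.create(
    model="llama3-70b-8192",  # assumed model ID; check Groq's docs
    messages=[
        {"role": "user", "content": "Explain low-latency inference in two sentences."}
    ],
)
elapsed = time.perf_counter() - start

print(response.choices[0].message.content)
tokens = response.usage.completion_tokens
print(f"~{tokens / elapsed:.0f} tokens/sec (wall-clock, including network)")
```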
The company has also announced plans for on-prem enterprise deployments.
By optimizing for the inference workload through a ground-up hardware/software co-design, Groq aims to become the premier platform for real-time, low-latency AI use cases that are challenging for current GPU-based solutions.