SPEED AND SCALE, FROM PROTOTYPE TO PRODUCTION

GroqCloud

The AI inference platform built for developers. Fast responses, scalable performance, and costs you can plan for. Available in public, private, or co-cloud instances.

Start for Free

Built for speed and precision

Groq runs the models you care about.

Take advantage of fast AI inference performance, powered by our purpose-built LPU, for leading GenAI models across text, audio, and vision modalities.

Support for LLMs, STT, TTS, and image-to-text models
Popular models on demand
Industry standard frameworks and integrations

Start Building

Build now and scale as your needs grow

GroqCloud Plans

Free
Great for anyone to get started with our APIs.
- Build and Test on Groq
- Community Support
- Zero-data Retention Available
Price
$0
Start for Free
Developer
Great for developers and startups to scale up and pay as you go.
Everything on the Free Plan, plus:
- Higher Token Limits
- Chat Support
- Flex Service Tier
- Batch Processing
- Spend Limits
- Prompt Caching
Price
Pay Per Token
Get Started
Enterprise
Great for businesses that require custom solutions for large-scale needs.
Everything on the Developer Plan, plus:
- Custom Models
- Regional Endpoint Selection
- Performance Tier
- Scalable Capacity
- Dedicated Support
- LoRA Fine-Tunes
Price
Contact Us
Get Started

Consistent Performance, Predictable Spend

Lower latency means less compute time, with no batching required. Record-setting performance. Usage-based pricing.

Try GroqCloud Now

What inference provider are you using or considering using to access models?

Source: Artificial Analysis AI Adoption Survey 2025

Designed for inference. Not adapted for it.

Established in 2016 for inference, Groq is literally built different. Its LPU is the only custom-built inference chip that fuels developers with the performance they need at a cost that doesn’t hold them back.

Learn More About the LPU

On-Prem Optionality

GroqRack

Available by request, the LPU powering GroqCloud can be deployed on-prem with GroqRack. Ideal for regulated industries or air-gapped environments. Seamless transition between cloud and local deployment.

Inquire Now

Made to scale. Deployed globally.

Groq Data Center Deployments

Online now in four regions globally. Regional availability zones for minimal latency. Auto-scaling without overhead.

Read Data Center News

Secure by Default

Enterprise-grade data encryption. SOC 2, GDPR, and HIPAA compliant. Optional private tenancy for sensitive workloads.

Groq Trust Center

Featured

Introducing the Next Generation of Compound on GroqCloud

Introducing Kimi K2‑0905 on GroqCloud

Inside the LPU: Deconstructing Groq’s Speed

Fuel for Developers

Start Building For Free

Start Building