On-demand Pricing for Tokens-as-a-Service

Groq powers leading openly available AI models.

Other models, including fine-tuned models, are available for specific customer requests. Send us your inquiries here.

Large Language Models (LLMs)

| AI Model | Current Speed (Tokens per Second) | Input Token Price (Per Million Tokens) | Output Token Price (Per Million Tokens) |
|---|---|---|---|
| Llama 3.2 1B (Preview) 8k | 3100 | $0.04 (25M / $1)* | $0.04 (25M / $1)* |
| Llama 3.2 3B (Preview) 8k | 1600 | $0.06 (17M / $1)* | $0.06 (17M / $1)* |
| Llama 3.3 70B Versatile 128k | 275 | $0.59 (1.69M / $1)* | $0.79 (1.27M / $1)* |
| Llama 3.1 8B Instant 128k | 750 | $0.05 (20M / $1)* | $0.08 (12.5M / $1)* |
| Llama 3 70B 8k | 330 | $0.59 (1.69M / $1)* | $0.79 (1.27M / $1)* |
| Llama 3 8B 8k | 1250 | $0.05 (20M / $1)* | $0.08 (12.5M / $1)* |
| Mixtral 8x7B Instruct 32k | 575 | $0.24 (4.17M / $1)* | $0.24 (4.17M / $1)* |
| Gemma 7B 8k Instruct | 950 | $0.07 (14.29M / $1)* | $0.07 (14.29M / $1)* |
| Gemma 2 9B 8k | 500 | $0.20 (5M / $1)* | $0.20 (5M / $1)* |
| Llama 3 Groq 70B Tool Use Preview 8k | 335 | $0.89 (1.12M / $1)* | $0.89 (1.12M / $1)* |
| Llama 3 Groq 8B Tool Use Preview 8k | 1250 | $0.19 (5.26M / $1)* | $0.19 (5.26M / $1)* |
| Llama Guard 3 8B 8k | 765 | $0.20 (5M / $1)* | $0.20 (5M / $1)* |
| Llama 3.3 70B SpecDec 8k | 1600 | $0.59 (1.69M / $1)* | $0.99 (1.01M / $1)* |

*Approximate number of tokens per $1.
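
As a rough illustration of how the per-million-token prices above translate into the cost of a single request, the following Python sketch multiplies input and output token counts by their respective prices. The function names and the dictionary keys are hypothetical labels chosen for this example (they are not official API model identifiers), and only two rows of the table are copied in.

```python
from decimal import Decimal

# (input price, output price) in USD per million tokens, copied from the table.
# Keys are illustrative labels, not official API model IDs.
PRICES_PER_MILLION = {
    "Llama 3.3 70B Versatile 128k": (Decimal("0.59"), Decimal("0.79")),
    "Llama 3.1 8B Instant 128k": (Decimal("0.05"), Decimal("0.08")),
}

MILLION = Decimal(1_000_000)


def request_cost(model, input_tokens, output_tokens):
    """Estimate the USD cost of one request from its token counts."""
    input_price, output_price = PRICES_PER_MILLION[model]
    return (input_tokens * input_price + output_tokens * output_price) / MILLION


def tokens_per_dollar(price_per_million):
    """Convert a per-million-token price into tokens bought by $1."""
    return MILLION / price_per_million


if __name__ == "__main__":
    # 10,000 input tokens and 2,000 output tokens on Llama 3.3 70B Versatile.
    print(request_cost("Llama 3.3 70B Versatile 128k", 10_000, 2_000))
    # Tokens per dollar at the $0.59 input price (about 1.69M, as in the table).
    print(round(tokens_per_dollar(Decimal("0.59"))))
```

Using `Decimal` rather than floats keeps small per-request amounts exact, which matters when many requests are summed.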

Automatic Speech Recognition (ASR) Models

| AI Model | Speed Factor | Price (Per Hour Transcribed) |
|---|---|---|
| Whisper V3 Large | 189x | $0.111* |
| Whisper Large v3 Turbo | 216x | $0.04* |
| Distil-Whisper | 250x | $0.02* |

*For ASR models above, Groq charges a minimum of 10 seconds per request.
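
The effect of the 10-second minimum can be sketched in Python as follows. This is an estimate under stated assumptions: the function name is hypothetical, and the sketch assumes that audio longer than 10 seconds is billed in proportion to its exact duration, which the pricing table does not spell out.

```python
from decimal import Decimal

# USD per hour of audio transcribed, copied from the ASR table.
ASR_PRICE_PER_HOUR = {
    "Whisper V3 Large": Decimal("0.111"),
    "Whisper Large v3 Turbo": Decimal("0.04"),
    "Distil-Whisper": Decimal("0.02"),
}

MIN_BILLED_SECONDS = Decimal(10)  # minimum charge per request, per the note above
SECONDS_PER_HOUR = Decimal(3600)


def transcription_cost(model, audio_seconds):
    """Estimate the USD cost of transcribing one audio clip.

    Assumption: beyond the 10-second minimum, billing is proportional
    to the clip's duration in seconds.
    """
    duration = Decimal(str(audio_seconds))
    billed_seconds = max(duration, MIN_BILLED_SECONDS)
    return ASR_PRICE_PER_HOUR[model] * billed_seconds / SECONDS_PER_HOUR


if __name__ == "__main__":
    # A 3-second clip is billed as if it were 10 seconds long.
    print(transcription_cost("Whisper V3 Large", 3))
    # A one-hour recording costs the listed hourly price.
    print(transcription_cost("Distil-Whisper", 3600))
```

Under this assumption, many very short clips cost more per second of audio than a few longer ones, so batching short audio into fewer requests may reduce cost.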

Vision Models

| AI Model | Input Token Price (Per Million Tokens) | Output Token Price (Per Million Tokens) |
|---|---|---|
| Llama 3.2 11B Vision 8k (Preview) | $0.18* | $0.18* |
| Llama 3.2 90B Vision 8k (Preview) | $0.90* | $0.90* |

*For vision models, images are billed at 6,400 tokens per image.
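
To show how the per-image rule feeds into a request's cost, here is a minimal Python sketch. It assumes that the 6,400 tokens per image are counted as input tokens and added to any text prompt tokens; the note above does not state this explicitly. The function name and dictionary keys are hypothetical.

```python
from decimal import Decimal

# (input price, output price) in USD per million tokens, from the vision table.
VISION_PRICES_PER_MILLION = {
    "Llama 3.2 11B Vision 8k (Preview)": (Decimal("0.18"), Decimal("0.18")),
    "Llama 3.2 90B Vision 8k (Preview)": (Decimal("0.90"), Decimal("0.90")),
}

TOKENS_PER_IMAGE = 6_400  # billing rule stated in the note above
MILLION = Decimal(1_000_000)


def vision_request_cost(model, num_images, text_input_tokens, output_tokens):
    """Estimate the USD cost of one vision request.

    Assumption: image tokens are billed at the input token price.
    """
    input_price, output_price = VISION_PRICES_PER_MILLION[model]
    billed_input = text_input_tokens + num_images * TOKENS_PER_IMAGE
    return (billed_input * input_price + output_tokens * output_price) / MILLION


if __name__ == "__main__":
    # One image, a 100-token prompt, and a 300-token answer on the 11B model.
    print(vision_request_cost("Llama 3.2 11B Vision 8k (Preview)", 1, 100, 300))
```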

For enterprise API solutions or on-prem deployments, please fill out the form on our Enterprise Access Page.
