On-demand Pricing for Tokens-as-a-Service

Groq powers leading openly-available AI models.

Other models, including fine-tuned models, are available on request. Send us your inquiries here.

Large Language Models (LLMs)

| AI Model | Current Speed (Tokens per Second) | Input Token Price (Per Million Tokens) | Output Token Price (Per Million Tokens) |
|---|---|---|---|
| Llama 3.2 1B (Preview) 8k | 3100 | $0.04 (25M / $1)* | $0.04 (25M / $1)* |
| Llama 3.2 3B (Preview) 8k | 1600 | $0.06 (17M / $1)* | $0.06 (17M / $1)* |
| Llama 3.1 70B Versatile 128k | 250 | $0.59 (1.69M / $1)* | $0.79 (1.27M / $1)* |
| Llama 3.1 8B Instant 128k | 750 | $0.05 (20M / $1)* | $0.08 (12.5M / $1)* |
| Llama 3 70B 8k | 330 | $0.59 (1.69M / $1)* | $0.79 (1.27M / $1)* |
| Llama 3 8B 8k | 1250 | $0.05 (20M / $1)* | $0.08 (12.5M / $1)* |
| Mixtral 8x7B Instruct 32k | 575 | $0.24 (4.17M / $1)* | $0.24 (4.17M / $1)* |
| Gemma 7B 8k Instruct | 950 | $0.07 (14.29M / $1)* | $0.07 (14.29M / $1)* |
| Gemma 2 9B 8k | 500 | $0.20 (5M / $1)* | $0.20 (5M / $1)* |
| Llama 3 Groq 70B Tool Use Preview 8k | 335 | $0.89 (1.12M / $1)* | $0.89 (1.12M / $1)* |
| Llama 3 Groq 8B Tool Use Preview 8k | 1250 | $0.19 (5.26M / $1)* | $0.19 (5.26M / $1)* |
| Llama Guard 3 8B 8k | 765 | $0.20 (5M / $1)* | $0.20 (5M / $1)* |

*Approximate number of tokens per $
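
As a rough illustration of how the per-million-token prices above translate into the cost of a single request, here is a minimal Python sketch. The three price pairs are copied from the table; the names `PRICES_PER_MILLION`, `request_cost`, and `tokens_per_dollar` are hypothetical helpers for this example, not part of any Groq SDK, and the sketch assumes input and output tokens are billed independently at their listed rates.

```python
from decimal import Decimal

# (input price, output price) in USD per million tokens, copied from the table.
# Only a few rows are included to keep the example short.
PRICES_PER_MILLION = {
    "Llama 3.1 70B Versatile 128k": (Decimal("0.59"), Decimal("0.79")),
    "Llama 3.1 8B Instant 128k": (Decimal("0.05"), Decimal("0.08")),
    "Mixtral 8x7B Instruct 32k": (Decimal("0.24"), Decimal("0.24")),
}

MILLION = Decimal(1_000_000)


def request_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimated USD cost of one request (assumes simple linear billing)."""
    input_price, output_price = PRICES_PER_MILLION[model]
    return (input_tokens * input_price + output_tokens * output_price) / MILLION


def tokens_per_dollar(price_per_million: Decimal) -> Decimal:
    """Tokens bought by $1 at a given per-million price (the '*' footnote figure)."""
    return MILLION / price_per_million


if __name__ == "__main__":
    # 2,000 prompt tokens and 500 completion tokens on Llama 3.1 70B Versatile.
    print(request_cost("Llama 3.1 70B Versatile 128k", 2000, 500))
    # Tokens per dollar at $0.59 per million, in millions, rounded to 2 places.
    print(round(tokens_per_dollar(Decimal("0.59")) / MILLION, 2))
```

`Decimal` is used instead of `float` so that small per-request amounts do not pick up binary rounding noise when summed over many requests.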

Automatic Speech Recognition (ASR) Models

For ASR models below, Groq charges a minimum of 10 seconds per request.

| AI Model | Speed Factor | Price (Per Hour Transcribed) |
|---|---|---|
| Whisper V3 Large | 189x | $0.111 |
| Whisper Large v3 Turbo | 216x | $0.04 |
| Distil-Whisper | 250x | $0.02 |
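
The following Python sketch shows one way to estimate ASR charges from the hourly prices and the 10-second minimum stated above. It rests on two assumptions that this page does not confirm: that the hourly price is prorated linearly per second of audio, and that the speed factor means audio is processed that many times faster than real time. The names `ASR_MODELS`, `transcription_cost`, and `estimated_processing_seconds` are hypothetical helpers for this example.

```python
from decimal import Decimal

# (speed factor, USD per hour transcribed), copied from the ASR table.
ASR_MODELS = {
    "Whisper V3 Large": (189, Decimal("0.111")),
    "Whisper Large v3 Turbo": (216, Decimal("0.04")),
    "Distil-Whisper": (250, Decimal("0.02")),
}

MIN_BILLED_SECONDS = 10  # stated minimum charge per request
SECONDS_PER_HOUR = 3600


def transcription_cost(model: str, audio_seconds: int) -> Decimal:
    """Estimated USD cost of one request.

    Assumption: the hourly price is prorated linearly per second of audio.
    """
    _, price_per_hour = ASR_MODELS[model]
    billed_seconds = max(audio_seconds, MIN_BILLED_SECONDS)
    return price_per_hour * billed_seconds / SECONDS_PER_HOUR


def estimated_processing_seconds(model: str, audio_seconds: int) -> float:
    """Rough processing time.

    Assumption: a speed factor of N means N times faster than real time.
    """
    speed_factor, _ = ASR_MODELS[model]
    return audio_seconds / speed_factor


if __name__ == "__main__":
    # A 3-second clip is billed as 10 seconds under the stated minimum.
    print(transcription_cost("Distil-Whisper", 3))
    # One hour of audio on Whisper V3 Large.
    print(transcription_cost("Whisper V3 Large", 3600))
    print(estimated_processing_seconds("Whisper V3 Large", 3600))
```

Under these assumptions, many very short clips cost more per second of audio than fewer, longer requests, because each request is billed for at least 10 seconds.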

For enterprise API solutions or on-prem deployments, please fill out the form on our Enterprise Access Page.
