On-demand Pricing for Tokens-as-a-Service
Groq powers leading openly available AI models.
Other models, including fine-tuned models, are available for specific customer requests. Send us your inquiries here.
Large Language Models (LLMs)
AI Model | Current Speed (Tokens per Second) | Input Token Price (Per Million Tokens) | Output Token Price (Per Million Tokens) |
---|---|---|---|
Llama 3.2 1B (Preview) 8k | 3100 | $0.04 (25M / $1)* | $0.04 (25M / $1)* |
Llama 3.2 3B (Preview) 8k | 1600 | $0.06 (17M / $1)* | $0.06 (17M / $1)* |
Llama 3.1 70B Versatile 128k | 250 | $0.59 (1.69M / $1)* | $0.79 (1.27M / $1)* |
Llama 3.1 8B Instant 128k | 750 | $0.05 (20M / $1)* | $0.08 (12.5M / $1)* |
Llama 3 70B 8k | 330 | $0.59 (1.69M / $1)* | $0.79 (1.27M / $1)* |
Llama 3 8B 8k | 1250 | $0.05 (20M / $1)* | $0.08 (12.5M / $1)* |
Mixtral 8x7B Instruct 32k | 575 | $0.24 (4.17M / $1)* | $0.24 (4.17M / $1)* |
Gemma 7B 8k Instruct | 950 | $0.07 (14.29M / $1)* | $0.07 (14.29M / $1)* |
Gemma 2 9B 8k | 500 | $0.20 (5M / $1)* | $0.20 (5M / $1)* |
Llama 3 Groq 70B Tool Use Preview 8k | 335 | $0.89 (1.12M / $1)* | $0.89 (1.12M / $1)* |
Llama 3 Groq 8B Tool Use Preview 8k | 1250 | $0.19 (5.26M / $1)* | $0.19 (5.26M / $1)* |
Llama Guard 3 8B 8k | 765 | $0.20 (5M / $1)* | $0.20 (5M / $1)* |
*Approximate number of tokens per $1
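As a rough illustration of how the per-million-token prices translate into the cost of a single request, the sketch below copies a few rows from the table above. The function names `request_cost` and `tokens_per_dollar` are hypothetical helpers, not part of any Groq SDK, and the sketch assumes cost scales linearly with token count with no rounding or minimum charge.

```python
# Prices in USD per million tokens, copied from the table above
# as (input price, output price). Only a few rows are included.
PRICES_PER_MILLION = {
    "Llama 3.1 70B Versatile 128k": (0.59, 0.79),
    "Llama 3.1 8B Instant 128k": (0.05, 0.08),
    "Mixtral 8x7B Instruct 32k": (0.24, 0.24),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request (assumes linear per-token billing)."""
    input_price, output_price = PRICES_PER_MILLION[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def tokens_per_dollar(price_per_million: float) -> float:
    """Approximate number of tokens that $1 buys at the given price."""
    return 1_000_000 / price_per_million
```

For example, `request_cost("Llama 3.1 70B Versatile 128k", 2000, 500)` comes to roughly $0.0016, and `tokens_per_dollar(0.59)` is about 1.69 million, matching the footnoted figure in the table.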
Automatic Speech Recognition (ASR) Models
For the ASR models below, Groq charges a minimum of 10 seconds per request.
AI Model | Speed Factor | Price (Per Hour Transcribed) |
---|---|---|
Whisper V3 Large | 189x | $0.111 |
Whisper Large v3 Turbo | 216x | $0.04 |
Distil-Whisper | 250x | $0.02 |
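The effect of the 10-second minimum on short clips can be sketched as follows. The prices are copied from the table above; `transcription_cost` is a hypothetical helper, and the sketch assumes billing is proportional to audio duration above the minimum, with no rounding to whole seconds.

```python
# Prices in USD per hour of audio transcribed, copied from the table above.
ASR_PRICE_PER_HOUR = {
    "Whisper V3 Large": 0.111,
    "Whisper Large v3 Turbo": 0.04,
    "Distil-Whisper": 0.02,
}

# Minimum billed duration per request, as stated above the table.
MIN_BILLED_SECONDS = 10


def transcription_cost(model: str, audio_seconds: float) -> float:
    """Estimated USD cost of one transcription request.

    Assumes linear billing by duration; requests shorter than the
    minimum are billed as if they were MIN_BILLED_SECONDS long.
    """
    billed_seconds = max(audio_seconds, MIN_BILLED_SECONDS)
    return billed_seconds / 3600 * ASR_PRICE_PER_HOUR[model]
```

Under these assumptions, a 3-second clip costs the same as a 10-second clip, so batching very short utterances into fewer requests may reduce cost.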
For enterprise API solutions or on-prem deployments, please fill out the form on our Enterprise Access Page.