On-demand Pricing for Tokens-as-a-Service
Groq powers leading openly available AI models.
Other models, including fine-tuned models, are available on request for specific customer needs. Send us your inquiries here.
Large Language Models (LLMs)
AI Model | Current Speed (Tokens per Second) | Input Token Price (per Million Tokens) | Output Token Price (per Million Tokens) |
---|---|---|---|
Llama 3.2 1B (Preview) 8k | 3100 | $0.04 (25M / $1)* | $0.04 (25M / $1)* |
Llama 3.2 3B (Preview) 8k | 1600 | $0.06 (17M / $1)* | $0.06 (17M / $1)* |
Llama 3.3 70B Versatile 128k | 275 | $0.59 (1.69M / $1)* | $0.79 (1.27M / $1)* |
Llama 3.1 8B Instant 128k | 750 | $0.05 (20M / $1)* | $0.08 (12.5M / $1)* |
Llama 3 70B 8k | 330 | $0.59 (1.69M / $1)* | $0.79 (1.27M / $1)* |
Llama 3 8B 8k | 1250 | $0.05 (20M / $1)* | $0.08 (12.5M / $1)* |
Mixtral 8x7B Instruct 32k | 575 | $0.24 (4.17M / $1)* | $0.24 (4.17M / $1)* |
Gemma 2 9B 8k | 500 | $0.20 (5M / $1)* | $0.20 (5M / $1)* |
Llama Guard 3 8B 8k | 765 | $0.20 (5M / $1)* | $0.20 (5M / $1)* |
Llama 3.3 70B SpecDec 8k | 1600 | $0.59 (1.69M / $1)* | $0.99 (1.01M / $1)* |
*Approximate number of tokens per $1.
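The per-million-token prices above translate into a request cost with simple arithmetic. The following is a minimal sketch, not an official Groq calculator: the function names `llm_cost` and `tokens_per_dollar` and the `PRICES` dictionary are hypothetical, and only two rows of the table are copied in for illustration.

```python
from decimal import Decimal

# USD per million tokens as (input, output), copied from the table above.
# Only two models are listed here; the dictionary layout is an assumption.
PRICES = {
    "Llama 3.3 70B Versatile": (Decimal("0.59"), Decimal("0.79")),
    "Llama 3.1 8B Instant": (Decimal("0.05"), Decimal("0.08")),
}
MILLION = Decimal(1_000_000)


def llm_cost(model, input_tokens, output_tokens):
    """Estimated USD cost of one request at on-demand prices."""
    in_price, out_price = PRICES[model]
    return (in_price * input_tokens + out_price * output_tokens) / MILLION


def tokens_per_dollar(price_per_million):
    """Tokens bought by $1, the figure shown in parentheses in the table."""
    return MILLION / price_per_million


if __name__ == "__main__":
    # A request with 2,000 input tokens and 500 output tokens.
    print(llm_cost("Llama 3.3 70B Versatile", 2_000, 500))
    print(tokens_per_dollar(Decimal("0.59")))
```

`Decimal` is used instead of `float` so that small per-request amounts are not distorted by binary rounding.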
Automatic Speech Recognition (ASR) Models
AI Model | Speed Factor | Price (per Hour Transcribed) |
---|---|---|
Whisper V3 Large | 189x | $0.111* |
Whisper Large v3 Turbo | 216x | $0.04* |
Distil-Whisper | 250x | $0.02* |
*For ASR models above, Groq charges a minimum of 10 seconds per request.
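The 10-second minimum matters mostly for very short clips. The sketch below estimates a transcription charge under one assumption that the page does not state: that audio longer than the minimum is billed in proportion to its length in seconds. The names `asr_cost` and `ASR_PRICE_PER_HOUR` are hypothetical.

```python
from decimal import Decimal

# USD per hour of audio transcribed, copied from the table above.
ASR_PRICE_PER_HOUR = {
    "Whisper V3 Large": Decimal("0.111"),
    "Whisper Large v3 Turbo": Decimal("0.04"),
    "Distil-Whisper": Decimal("0.02"),
}
MIN_BILLED_SECONDS = Decimal(10)  # minimum charge per request, per the footnote
SECONDS_PER_HOUR = Decimal(3600)


def asr_cost(model, audio_seconds):
    """Estimated USD cost of one transcription request.

    Assumes per-second proration above the 10-second minimum; the exact
    rounding Groq applies is not specified in the source.
    """
    billed = max(Decimal(str(audio_seconds)), MIN_BILLED_SECONDS)
    return ASR_PRICE_PER_HOUR[model] * billed / SECONDS_PER_HOUR


if __name__ == "__main__":
    # A 3-second clip is billed as 10 seconds.
    print(asr_cost("Whisper V3 Large", 3))
    print(asr_cost("Whisper V3 Large", 10))
```

Under this assumption, batching many sub-10-second clips into fewer, longer requests would lower the total charge.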
Vision Models
AI Model | Input Token Price (per Million Tokens) | Output Token Price (per Million Tokens) |
---|---|---|
Llama 3.2 11B Vision 8k (Preview) | $0.18* | $0.18* |
Llama 3.2 90B Vision 8k (Preview) | $0.90* | $0.90* |
*For vision models, images are billed at 6,400 tokens per image.
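Because each image counts as 6,400 tokens, a vision request can be costed the same way as a text request. This is a rough sketch that assumes image tokens are charged at the input-token price, which the page implies but does not state outright; `vision_cost` and `VISION_PRICES` are hypothetical names.

```python
from decimal import Decimal

# USD per million tokens as (input, output), copied from the table above.
VISION_PRICES = {
    "Llama 3.2 11B Vision": (Decimal("0.18"), Decimal("0.18")),
    "Llama 3.2 90B Vision": (Decimal("0.90"), Decimal("0.90")),
}
TOKENS_PER_IMAGE = 6_400  # per the footnote above
MILLION = Decimal(1_000_000)


def vision_cost(model, text_input_tokens, images, output_tokens):
    """Estimated USD cost of one vision request.

    Assumes image tokens are billed at the input-token price.
    """
    in_price, out_price = VISION_PRICES[model]
    billed_input = text_input_tokens + images * TOKENS_PER_IMAGE
    return (in_price * billed_input + out_price * output_tokens) / MILLION


if __name__ == "__main__":
    # One image, a 100-token prompt, and a 300-token answer.
    print(vision_cost("Llama 3.2 11B Vision", 100, 1, 300))
```

For short prompts the image dominates the bill: one image costs as much as 6,400 tokens of text.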
For enterprise API solutions or on-prem deployments, please fill out the form on our Enterprise Access Page.