
06/16/2025 · Graham Steele
Build Faster with Groq + Hugging Face
Simplicity of Hugging Face + Efficiency of Groq
Exciting news for developers and AI enthusiasts! Hugging Face is making it easier than ever to access Groq’s lightning-fast, efficient inference with the direct integration of Groq as a provider in the Hugging Face Playground and API. Developers can now tap into Groq’s strengths in speed, cost, and production-scale context windows, with unified access and billing through the Hugging Face platform.
Easy Access from Hugging Face Playground
Simply select "Groq" as your provider, and your requests will be billed directly to your Hugging Face account at Groq's competitive pricing. For those who prefer to manage their Groq usage directly, you also have the option to add your own Groq API key.
Seamless Integration with the Hugging Face API
You can also use Groq as a provider via the Hugging Face API. To get started, create a Hugging Face API key. If you prefer to use a Groq API key directly, change the routing mode in your inference provider settings to “custom key,” then add your Groq API key.
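Here is a minimal sketch of what that looks like with the `huggingface_hub` Python client. The model name, the `HF_TOKEN` environment variable, and the example prompt are assumptions for illustration; billing flows through your Hugging Face account when you authenticate with a Hugging Face token.

```python
import os
from huggingface_hub import InferenceClient

# Route requests through Groq as the inference provider.
# In "custom key" mode, you would pass your Groq API key here instead.
client = InferenceClient(
    provider="groq",
    api_key=os.environ["HF_TOKEN"],
)

completion = client.chat.completions.create(
    model="meta-llama/Llama-3.3-70B-Instruct",
    messages=[{"role": "user", "content": "Why does inference speed matter?"}],
)
print(completion.choices[0].message.content)
```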
Note: To send requests to a third-party provider like Groq, you have to pass the provider parameter to the inference function. The default value is "auto", which selects the first provider available for the model, following your preference order at https://hf.co/settings/inference-providers.
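To make the two routing modes concrete, here is a short sketch contrasting them (again assuming the `huggingface_hub` client):

```python
from huggingface_hub import InferenceClient

# "auto" (the default): pick the first available provider for the model,
# ordered by your preferences at https://hf.co/settings/inference-providers.
auto_client = InferenceClient(provider="auto")

# Explicit: pin every request from this client to Groq.
groq_client = InferenceClient(provider="groq")
```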
A Powerful Lineup of Models
This integration brings a fantastic array of leading open-source models powered by Groq, including:
- meta-llama/Llama-3.3-70B-Instruct
- google/gemma-2-9b-it
- meta-llama/Llama-Guard-3-8B
- meta-llama/Meta-Llama-3-70B-Instruct
- meta-llama/Meta-Llama-3-8B-Instruct
- deepseek-ai/DeepSeek-R1-Distill-Llama-70B
- meta-llama/Llama-4-Scout-17B-16E-Instruct
- meta-llama/Llama-4-Maverick-17B-128E-Instruct
- Qwen/QwQ-32B
- Qwen/Qwen3-32B
Build Fast
This collaboration between Hugging Face and Groq is a significant step forward in making high-performance AI inference more accessible and efficient for every AI builder.
Check out the documentation for more details.