04/15/2025

Now in Preview: Groq’s First Compound AI System

Build with access to the internet and the ability to run code with a one line change to your model string.

Compound Beta is Groq’s first compound AI system, released under preview on GroqCloud™. It combines openly available models already supported on our platform with built-in tool use, starting with web search and code execution, so developers can handle real-world queries in a single high-performance, low-cost API call.

What Can This Do That an LLM Can’t?

While LLMs are great at generating text, they are limited to what they were trained on. Compound Beta takes the next step. It is designed to solve problems by taking action, using tools like web search and code execution alongside powerful models.

This allows the system to access real-time information, perform live computations, and interact with external data. As a result, it delivers more accurate, current, and useful responses than a language model on its own.

How Compound Beta Works

Compound Beta uses iterative, server-side tool execution to answer complex queries. It can autonomously decide when and how to use tools, such as web search and code execution, and run them multiple times before returning a response.

All tool use happens server-side. This keeps latency low, avoids the need for client orchestration, and allows Groq to optimize performance end-to-end.

Fueled by Leading Openly Available Models, Including Llama 4

Compound Beta is powered by multiple openly available models already supported on GroqCloud, including the latest Llama 4 models. It uses Llama 4 Scout for core reasoning, with Llama 3.3 70B assisting with routing and tool selection.

These models work together as a system, enabling a more capable and flexible way to respond to real-world prompts that require more than language prediction alone.

Builders are charged per token used by each of the system's underlying models. For more details, click here.

What You Can Build

Compound Beta is ideal for developers building AI agents, assistants, and research tools that need to:

Search the web for current data
Run and validate code
Return grounded answers using live information or logical reasoning

Example prompts:

"What are the most recent models released by Groq?"
"What is the current value of 0.38474 Bitcoin based on CoinGecko?"
"What is the top trending news over the last 24 hours?”

Two Versions to Start

Compound Beta is available in two variants:

compound-beta: Uses Llama 4 Scout for reasoning and Llama 3.3 70B for tool use. Supports multiple tool calls per query.
compound-beta-mini: A faster version that supports one tool call per request. Ideal for lightweight or low-latency tasks.

While Compound Beta is a compound AI system, it uses the same API as models supported on GroqCloud. To get started, just change your model name:

model = "compound-beta"

model = "compound-beta-mini"

Check the compound-beta and compound-beta-mini docs for usage examples, supported tools, and integration guidance.

Performance

Compound Beta shows what’s possible when openly available models are combined with efficient infrastructure and real-time tools. With a saturation of existing benchmarks but a lack of one that can be automatically updated with information from the past 24 hours, Groq developed a new evaluation benchmark for measuring search capabilities called RealtimeEval.

This benchmark is designed to evaluate tool-using systems on current events and live data, plus we are also open sourcing this eval. On the benchmark, Compound Beta outperformed GPT-4o-search-preview and GPT-4o-mini-search-preview significantly.

We also found performance to be on-par with Perplexity Sonar.

Compound AI System	F1 Score
Perplexity Sonar Reasoning Pro	0.600
Groq Compound-beta	0.555
Perplexity Sonar	0.546
Groq Compound-beta-mini	0.478
OpenAI GPT-4o-search-preview	0.393
Open AI GPT-4o-mini-search-preview	0.382

compound-beta: ~350 tokens per second
compound-beta-mini: ~275 tokens per second

Try Compound Beta With Groq Desktop Beta

To try the Compound Beta without writing any code you can download the Groq Desktop Beta and add our Compound Beta MCP Server. It wraps the compound-beta and compound-beta-mini endpoints to delegate user questions that benefit from access to real-time information (web search) and/or Python code execution.

Help Shape What Comes Next

Compound Beta is available now under preview. We’re continuing to improve how models route requests, how tools are selected, and how the system adapts to real-world use cases.

Your feedback is key. Try it out and let us know what you build here.