Now Available on Groq: The Largest and Most Capable Openly Available Foundation Model to Date, Llama 3.1 405B

Written by:
Groq
Llama 3.1 models are available via GroqChat and Groq Dev Console

The largest openly available foundation model to date, Llama 3.1 405B, is now available on Groq. Groq is proud to partner on this key industry launch, making the latest Llama 3.1 models – including 405B Instruct, 70B Instruct, and 8B Instruct – available to the community at Groq speed. All three models are accessible on the GroqCloud Dev Console, home to a community of over 300K developers already building on Groq® systems, and on GroqChat for the general public.

Llama 3.1 405B, The Largest Openly Available Model to Date

The Llama 3.1 models are a significant step forward in terms of capabilities and functionality. As the largest and most capable openly available Large Language Model (LLM) to date, Llama 3.1 405B rivals industry-leading closed-source models. For the first time, enterprises, startups, researchers, and developers can access a model of this scale and capability without proprietary restrictions, enabling unprecedented collaboration and innovation. With Groq, AI innovators can now tap into the immense potential of Llama 3.1 405B, running at record speeds, on GroqCloud to build more sophisticated and powerful applications.

“Meta is creating the equivalent of Linux, an open operating system, for AI – not only for the Groq LPU which provides fast AI inference, but for the entire ecosystem. In technology, open always wins, and with this release of Llama 3.1 Meta has caught up to the best proprietary models. At this rate, it’s only a matter of time before they’ll pull ahead of the closed models,” said Jonathan Ross, CEO and Founder of Groq. “With every new release from Meta, we see a significant surge of developers joining our platform. In the last five months we’ve grown from just a handful of developers to over 300,000, attracted by the quality and openness of Llama, as well as its incredible speed on the Groq LPU.”

With LPU AI inference technology powering GroqCloud, Groq delivers unparalleled speed, enabling the AI community to build highly responsive applications to unlock new use cases such as:

  • Agentic Workflows: Supporting real-time decision-making and task automation to provide a seamless, yet personalized, human-like response for use cases such as healthcare patient coordination and care; dynamic pricing that analyzes market demand and adjusts prices in real time; predictive maintenance using real-time sensor data; and customer service that responds to inquiries and resolves issues in seconds.
  • Live Content Generation: Enabling the creation of dynamic, personalized content in real-time, such as customized product recommendations, news summaries, and social media responses.
  • Personalized Learning Pathways: Providing adaptive learning experiences that dynamically adjust to individual students’ needs, abilities, and learning styles, enabling seamless navigation through complex educational content, such as interactive simulations, virtual labs, and AI-powered tutoring.

 

Llama 3.1 Model Capabilities

So, what makes the Llama 3.1 models so special? First off, Llama 3.1 405B is in a class of its own, with unmatched flexibility, control, and state-of-the-art capabilities in general knowledge, steerability, math, tool use, and multilingual translation. Here are some of the key updates and features for all of the Llama 3.1 models:

  • Increased Context Length: Up to 128K context length, enabling the model to process longer input sequences and respond more accurately.
  • Built-in Custom Tool Calling: The models can emit custom function calls from a single user message, making tool integration more straightforward.
  • Robust System-level Safety Support: Meta implemented new cybersecurity evaluations and several new or updated inference-time guardrails in Llama Guard 3, helping developers use these powerful models safely and build responsibly.
  • Llama Stack API: The initial Meta release of Llama Stack brings all these components together around a set of reference use cases, making it easier for developers to get started.
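To make the tool-calling bullet concrete, here is a minimal sketch of how a custom tool might be declared for a chat-completions request. It assumes the OpenAI-style function-calling schema that Groq's API follows; the tool name (`get_order_status`), its parameters, and the model ID are illustrative assumptions, not details from this announcement.

```python
import json

# Hypothetical tool definition in the OpenAI-compatible function-calling schema.
# The name, description, and parameters below are illustrative only.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_order_status",  # assumed example tool, not a real API
            "description": "Look up the shipping status of a customer order.",
            "parameters": {
                "type": "object",
                "properties": {
                    "order_id": {
                        "type": "string",
                        "description": "The customer's order identifier.",
                    }
                },
                "required": ["order_id"],
            },
        },
    }
]

payload = {
    "model": "llama-3.1-70b-versatile",  # model ID is an assumption; check the console
    "messages": [{"role": "user", "content": "Where is order 8123?"}],
    "tools": tools,
}

# A single user message like this can produce a tool call in the response,
# which your application executes before handing the result back to the model.
print(json.dumps(payload, indent=2))
```

In this flow the model decides when to call the tool; your code runs the function and returns its output in a follow-up message.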
 
Groq’s Journey with Llama

This isn’t our first rodeo with Llama models – we have a track record of setting AI inference speed records. In March 2023, we announced speed records with the Llama 2 models, demonstrating our commitment to pushing the boundaries of LLM performance. With the Llama 3.1 models, we’re taking it to the next level.

“Groq provides rapid AI inference using publicly accessible models, empowering everyone to harness AI’s full potential. We created the LPU to offer fast, scalable AI inference at a much lower cost than GPUs,” said Sunny Madra, General Manager of GroqCloud. “We believe that the pace of iteration drives innovation, a belief shared by our developer community of over 300,000 members. By openly and responsibly releasing Llama 3.1, Meta is demonstrating its dedication to advancing and expanding an open AI ecosystem.”

Start Building Today

For developers, getting started with Groq is an easy three-step process: replace your existing API key with a free Groq API key, set the base URL, and run. Head over to the GroqCloud Dev Console today and start building with the latest Llama 3.1 models running at Groq speed!
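The three steps above can be sketched in a few lines. This is a minimal illustration using only the Python standard library; it assumes Groq's OpenAI-compatible endpoint at `api.groq.com/openai/v1`, and the model ID shown is an assumption to verify against the Dev Console. The placeholder key stays a placeholder, so the request is built but not sent.

```python
import json
import urllib.request

# Step 1: swap in a free Groq API key (placeholder shown; get yours from the console).
API_KEY = "YOUR_GROQ_API_KEY"
# Step 2: point the base URL at Groq's OpenAI-compatible endpoint.
GROQ_BASE_URL = "https://api.groq.com/openai/v1"

# Step 3: run your existing chat-completions request unchanged.
payload = {
    "model": "llama-3.1-8b-instant",  # model ID is an assumption; check the console
    "messages": [{"role": "user", "content": "Hello from Llama 3.1 on Groq!"}],
}

request = urllib.request.Request(
    f"{GROQ_BASE_URL}/chat/completions",
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
)
# response = urllib.request.urlopen(request)  # uncomment with a real key; needs network
```

Because the endpoint mirrors the OpenAI schema, existing client libraries generally work as-is once the base URL and key are swapped.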

Up Next

Early API access to Llama 3.1 405B is currently available to select Groq customers only – stay tuned for general availability. We will also share independent third-party benchmarks demonstrating Groq speed across the Llama 3.1 models very soon.
