Now Available on Groq: The Largest and Most Capable Openly Available Foundation Model to Date, Llama 3.1 405B

Written by:
Groq
Llama 3.1 models are available via GroqChat and Groq Dev Console

The largest openly available foundation model to date, Llama 3.1 405B, is now available on Groq. Groq is proud to partner on this key industry launch, making the latest Llama 3.1 models – including 405B Instruct, 70B Instruct, and 8B Instruct – available to the community at Groq speed. All three models are accessible on the GroqCloud Dev Console, home to a community of over 300K developers already building on Groq® systems, and on GroqChat for the general public.

Llama 3.1 405B, The Largest Openly Available Model to Date

The Llama 3.1 models are a significant step forward in terms of capabilities and functionality. As the largest and most capable openly available Large Language Model (LLM) to date, Llama 3.1 405B rivals industry-leading closed-source models. For the first time, enterprises, startups, researchers, and developers can access a model of this scale and capability without proprietary restrictions, enabling unprecedented collaboration and innovation. With Groq, AI innovators can now tap into the immense potential of Llama 3.1 405B, running at record speeds, on GroqCloud to build more sophisticated and powerful applications.

“Meta is creating the equivalent of Linux, an open operating system, for AI – not only for the Groq LPU which provides fast AI inference, but for the entire ecosystem. In technology, open always wins, and with this release of Llama 3.1 Meta has caught up to the best proprietary models. At this rate, it’s only a matter of time before they’ll pull ahead of the closed models,” said Jonathan Ross, CEO and Founder of Groq. “With every new release from Meta, we see a significant surge of developers joining our platform. In the last five months we’ve grown from just a handful of developers to over 300,000, attracted by the quality and openness of Llama, as well as its incredible speed on the Groq LPU.”

With LPU AI inference technology powering GroqCloud, Groq delivers unparalleled speed, enabling the AI community to build highly responsive applications to unlock new use cases such as:

  • Agentic Workflows: Supporting real-time decision-making and task automation to provide a seamless, yet personalized, human-like response for use cases such as healthcare patient coordination and care; dynamic pricing that analyzes market demand and adjusts prices in real time; predictive maintenance using real-time sensor data; and customer service that responds to inquiries and resolves issues in seconds.
  • Live Content Generation: Enabling the creation of dynamic, personalized content in real-time, such as customized product recommendations, news summaries, and social media responses.
  • Personalized Learning Pathways: Providing adaptive learning experiences that dynamically adjust to individual students’ needs, abilities, and learning styles, enabling seamless navigation through complex educational content, such as interactive simulations, virtual labs, and AI-powered tutoring.

 

Llama 3.1 Model Capabilities

So, what makes the Llama 3.1 models so special? First off, Llama 3.1 405B is in a class of its own, with unmatched flexibility, control, and state-of-the-art capabilities in general knowledge, steerability, math, tool use, and multilingual translation. Here are some of the key updates and features for all of the Llama 3.1 models:

  • Increased Context Length: Up to 128K context length, enabling the model to process longer input sequences and respond more accurately.
  • Built-in Custom Tool Calling: The models can emit custom function calls from a single user message, making tool integration more straightforward.
  • Robust System-level Safety Support: Meta implemented new cybersecurity evaluations and several new or updated inference-time guardrails in Llama Guard 3, helping developers use these powerful models safely and build responsibly.
  • Llama Stack API: The initial Meta release of Llama Stack brings all these components together around a set of reference use cases, making it easier for developers to get started.
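To make the tool-calling bullet concrete, here is a minimal sketch of how a custom tool might be declared for a chat-completions request. It assumes the OpenAI-style function-calling schema that Groq's API follows; the tool name (`get_order_status`), its parameters, and the model ID are illustrative assumptions, not details from this announcement.

```python
import json

# Hypothetical tool definition in the OpenAI-compatible function-calling schema.
# The name, description, and parameters below are illustrative only.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_order_status",  # assumed example tool, not a real API
            "description": "Look up the shipping status of a customer order.",
            "parameters": {
                "type": "object",
                "properties": {
                    "order_id": {
                        "type": "string",
                        "description": "The customer's order identifier.",
                    }
                },
                "required": ["order_id"],
            },
        },
    }
]

payload = {
    "model": "llama-3.1-70b-versatile",  # model ID is an assumption; check the console
    "messages": [{"role": "user", "content": "Where is order 8123?"}],
    "tools": tools,
}

# A single user message like this can produce a tool call in the response,
# which your application executes before handing the result back to the model.
print(json.dumps(payload, indent=2))
```

In this flow the model decides when to call the tool; your code runs the function and returns its output in a follow-up message.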
 
Groq’s Journey with Llama

This isn’t our first rodeo with Llama models – we have a track record of setting AI inference speed records. In March 2023, we announced speed records with the Llama 2 models, demonstrating our commitment to pushing the boundaries of LLM performance. With the Llama 3.1 models, we’re taking it to the next level.

“Groq provides rapid AI inference using publicly accessible models, empowering everyone to harness AI’s full potential. We created the LPU to offer fast, scalable AI inference at a much lower cost than GPUs,” said Sunny Madra, General Manager of GroqCloud. “We believe that the pace of iteration drives innovation, a belief shared by our developer community of over 300,000 members. By openly and responsibly releasing Llama 3.1, Meta is demonstrating its dedication to advancing and expanding an open AI ecosystem.”

Start Building Today

For developers, getting started with Groq is an easy three-step process: replace your existing API key with a free Groq API key, set the base URL, and run. Head over to the GroqCloud Dev Console today and start building with the latest Llama 3.1 models running at Groq speed!
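The three steps above can be sketched in a few lines. This is a minimal illustration using only the Python standard library; it assumes Groq's OpenAI-compatible endpoint at `api.groq.com/openai/v1`, and the model ID shown is an assumption to verify against the Dev Console. The placeholder key stays a placeholder, so the request is built but not sent.

```python
import json
import urllib.request

# Step 1: swap in a free Groq API key (placeholder shown; get yours from the console).
API_KEY = "YOUR_GROQ_API_KEY"
# Step 2: point the base URL at Groq's OpenAI-compatible endpoint.
GROQ_BASE_URL = "https://api.groq.com/openai/v1"

# Step 3: run your existing chat-completions request unchanged.
payload = {
    "model": "llama-3.1-8b-instant",  # model ID is an assumption; check the console
    "messages": [{"role": "user", "content": "Hello from Llama 3.1 on Groq!"}],
}

request = urllib.request.Request(
    f"{GROQ_BASE_URL}/chat/completions",
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
)
# response = urllib.request.urlopen(request)  # uncomment with a real key; needs network
```

Because the endpoint mirrors the OpenAI schema, existing client libraries generally work as-is once the base URL and key are swapped.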

Up Next

Early API access to Llama 3.1 405B is currently available to select Groq customers only – stay tuned for general availability. We will also share independent third-party benchmarks demonstrating Groq speed across the Llama 3.1 models very soon.
