Insights from VB Transform 2024 with Jonathan Ross, Groq CEO & Founder

Written by:
Groq
Revolutionizing AI Inference: Efficiency and Scale with Groq

VentureBeat Transform 2024, hosted in July 2024 in San Francisco, CA, drew more than 1,000 AI enthusiasts passionate about the conference’s theme, “Putting AI to Work at Scale.” Matt Marshall, CEO and Editor-in-Chief of VentureBeat, conducted a fireside chat with Jonathan Ross, CEO and founder of Groq, on why Groq matters, the problem Groq is solving, and the fastest hardware adoption in history.

Ross took the stage and opened with a demonstration of Groq Speed – an interactive voice-to-text dialogue running on GroqChat.

During the demo, Ross asked GroqChat to create a two-day travel itinerary for Oslo, Norway; convert the itinerary into a tabular format; add a duration column between the time and the activity; accommodate a request for lunch and dinner at a Michelin-starred restaurant each day; add travel times; and – oh wait – change the destination altogether: we’re going to Paris. The result? An instantly updated itinerary and the reaction Groq has become known for garnering – “Wow!”
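For developers who want to recreate that flow, here is a minimal sketch of the same iterative refinement against the GroqCloud API – assuming the groq Python SDK, a GROQ_API_KEY environment variable, and an illustrative model ID:

```python
# A minimal sketch of the onstage demo's iterative refinement.
# Assumes: `pip install groq`, GROQ_API_KEY set in the environment,
# and an illustrative model ID (pick any model listed on GroqCloud).
from groq import Groq

client = Groq()  # reads GROQ_API_KEY from the environment


def ask(history: list[dict], prompt: str) -> str:
    """Append a user turn, fetch the model's reply, and keep the history."""
    history.append({"role": "user", "content": prompt})
    reply = client.chat.completions.create(
        model="llama-3.1-8b-instant",  # illustrative model ID
        messages=history,
    )
    answer = reply.choices[0].message.content
    history.append({"role": "assistant", "content": answer})
    return answer


history: list[dict] = []
ask(history, "Create a two-day travel itinerary for Oslo, Norway.")
ask(history, "Convert the itinerary into a table, with a duration column "
             "between the time and the activity.")
ask(history, "Include lunch and dinner at a Michelin-starred restaurant each day.")
print(ask(history, "Actually, change the destination to Paris."))
```

Because every turn carries the full conversation history, the final request revises everything that came before – the same back-and-forth Ross ran onstage.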

So let’s start at the beginning – Groq was conceived, Ross said, when he realized there was not enough compute for AI. He reached this conclusion through his work at Google, where he learned that it costs ten times more to run an AI model in production (inference) than to train it. Ross’ mission for Groq became making AI available to everyone.

Drawing on Ross’ experience at Google, Groq® LPU™ (Language Processing Unit) AI inference technology is built on an architecture entirely distinct from existing AI chips (GPUs and CPUs). Groq has described the architecture in two technical papers, presented at ISCA 2020 and ISCA 2022 respectively, and recently released a white paper, What is a Language Processing Unit?, explaining the four key design principles of the LPU.

The Groq architecture powers fast AI inference, and, perhaps surprisingly to some, it is not built upon the most coveted and difficult-to-source hardware, such as HBM (High Bandwidth Memory). In fact, Ross pointed out that external memory is what slows down AI inference, and that without it, Groq delivers speed and efficiency beyond that of any GPU.

Groq’s software-first architecture strategy enables developers to easily transition from other ecosystems such as Nvidia or OpenAI (see the sketch below). Ross also noted that while it may seem that Groq is competing with Nvidia, that is not the case: GPUs are well suited to training AI models, while Groq LPUs are optimized to run those models in production – inference – more cheaply, faster, and more efficiently.
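What that transition can look like in practice: GroqCloud exposes an OpenAI-compatible endpoint, so a minimal sketch of porting an existing OpenAI-SDK application (model ID illustrative) amounts to little more than changing the base URL and API key:

```python
# A minimal sketch of pointing an existing OpenAI-SDK app at Groq.
# Only the base URL, API key, and model name change; the calling code
# stays the same. The model ID below is illustrative.
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.groq.com/openai/v1",  # Groq's OpenAI-compatible endpoint
    api_key=os.environ["GROQ_API_KEY"],
)

response = client.chat.completions.create(
    model="llama-3.1-8b-instant",  # illustrative Groq-hosted model
    messages=[{"role": "user", "content": "In one sentence, what is an LPU?"}],
)
print(response.choices[0].message.content)
```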

As new models become publicly available, Groq aims to run them more efficiently than anyone else. “We’re aiming to capture half of the global AI inference market by the end of next year,” shared Ross. To achieve this, Groq plans to deploy a staggering 1.7 million of its AI processors. “That would be the equivalent of 3x of what Nvidia deployed last year.”

This has led to unprecedented growth for Groq. At the time of the VentureBeat Transform event in July 2024, there were 280K developers on the GroqCloud™ Developer Console platform – growth from zero to 280K in just four months, the fastest developer adoption any company has ever seen. Ross attributes this to the software-first architecture: because models are not tied to hand-written kernels, they can be compiled automatically and put into production at lightning speed.

“As far as we know, in terms of any developer takeoff, any new hardware platform adoption, this is about as fast as it gets,” said Ross. He added, “We actually didn’t expect to go this viral this quickly.” Following the interview, VentureBeat published the article “Groq claims fastest hardware adoption in history at VB Transform.”

Another factor in the developer growth has been the release of openly available models into the AI community by Meta, Mistral, Google, and others. At VB Transform, Ross told the story of how the Meta–Groq relationship started, like all great relationships, at a party last October, where he met the head of the Meta AI team. Ross showed him Groq Speed, and after he “stopped cursing,” the Meta lead asked Ross to present to his team. When Llama became available on Groq, Groq went viral.

Fast forward two weeks from the interview: Groq announced its partnership with Meta to launch Llama 3.1. As of this blog’s publication, the Groq developer count is over 374K and growing daily.

If you’re interested in more on the topics covered here, watch the full interview below, read the original VentureBeat article, get your free API key via the GroqCloud Developer Console to start building on Groq, or try instant AI inference now by logging into GroqChat – it’s free for everyone.
