Empowering AI Innovation
Groq builds fast AI inference technology. The Groq® LPU™ (Language Processing Unit) is a hardware and software platform that delivers exceptional AI compute speed, quality, and energy efficiency. Groq, headquartered in Silicon Valley, provides cloud and on-prem solutions at scale for AI applications. The LPU and related systems are designed and manufactured in North America.
Together with Cisco, we are delivering unparalleled AI readiness through exclusive enterprise access to GroqCloud™, purpose-built for AI performance at scale.
Interested in learning more about Groq? Fill out our form to be connected with a member of our team.
About Our Partnership
Overview
Cisco, a leader in network and security innovation, and Groq, a leader in AI inference technology, have partnered to provide exclusive access to dedicated compute resources for AI initiatives, addressing growing demand for high-performance AI solutions. The partnership combines Groq LPU AI inference technology with Cisco’s trusted infrastructure to help Cisco customers and partners seamlessly scale their AI initiatives, accelerate AI innovation, and drive business success.
Exclusive Benefits
As part of this strategic partnership, Cisco customers and partners gain access to dedicated GroqCloud capacity, an increasingly scarce and valuable resource for AI workloads. AI adoption often comes with challenges like limited resources, scalability concerns, and security requirements. Our partnership removes these barriers with guaranteed availability, so you avoid delays and secure the compute power you need for AI innovation.
Easy Access to Fast AI Inference
Unlock a new set of use cases with AI applications running at Groq speed. Powered by Groq LPU™ AI inference technology and available as public, private, and co-cloud instances, GroqCloud™ redefines real-time.
Get started for free by visiting the GroqCloud Developer Console and join the more than half a million developers already building on GroqCloud.
Leading Openly-available Models
Take advantage of fast AI inference performance for leading openly-available Large Language Models and Automatic Speech Recognition models, including: Llama 3.1 8B, Llama 3 8B, Llama 3 70B, Llama 3.3 70B, Mixtral 8x7B, Gemma 2 9B, and Whisper Large V3.

No-code Developer Playground
Industry Standard Frameworks
Build cutting-edge applications leveraging industry-leading frameworks like LangChain, LlamaIndex, and the Vercel AI SDK. Create context-aware apps with real-time streamed UIs for dynamic, responsive applications that adapt to user needs.
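As a minimal sketch, assuming the langchain-groq integration package is installed and a GROQ_API_KEY environment variable is set (the model ID is illustrative), a streamed LangChain call on Groq might look like this:

```python
# Minimal sketch: a streamed chat completion via LangChain's Groq integration.
# Assumes `pip install langchain-groq` and GROQ_API_KEY in the environment;
# the model ID is illustrative only.
from langchain_groq import ChatGroq

llm = ChatGroq(model="llama-3.3-70b-versatile")  # picks up GROQ_API_KEY automatically

# Stream tokens as they arrive, the pattern behind real-time streamed UIs.
for chunk in llm.stream("Explain why low-latency inference matters, in one paragraph."):
    print(chunk.content, end="", flush=True)
print()
```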
Use Groq, Fast.
Move seamlessly to Groq from other providers like OpenAI by changing three lines of code (see the sketch after these steps):
- With our OpenAI endpoint compatibility, simply set OPENAI_API_KEY to your Groq API key.
- Set the base URL.
- Choose your model and run!
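As a minimal sketch of those three steps, assuming the OpenAI Python SDK and Groq's documented OpenAI-compatible endpoint (the model ID is illustrative):

```python
# Minimal sketch of the three-line switch using the OpenAI Python SDK.
# Assumes OPENAI_API_KEY holds your Groq API key (step 1) and that the
# base URL below matches Groq's current OpenAI-compatible endpoint.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.groq.com/openai/v1",  # step 2: set the base URL
)

# Step 3: choose your model and run.
response = client.chat.completions.create(
    model="llama-3.3-70b-versatile",  # illustrative model ID
    messages=[{"role": "user", "content": "Hello from GroqCloud!"}],
)
print(response.choices[0].message.content)
```

Everything else in your existing OpenAI client code can stay the same.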

Cost
You shouldn’t have to pay large upfront costs to start generating tokens. The Groq on-demand tokens-as-a-service model is simple: you pay as you go for the tokens you consume, with no upfront commitment. Explore our package and pricing options here.
Scalability
Groq systems are designed to scale in a way that complements the modularity of AI workloads. Innovations like our chip-to-chip (C2C) networking and large-scale distributed systems make Groq a leading provider of fast AI inference at scale.
Reliability
Groq operates data centers in multiple locations, increasing the reliability and accessibility of our systems for customers across the globe.
Groq Speed Is Instant
Don’t just take our word for it: independent benchmarks from Artificial Analysis show Groq speed is instant for foundational openly-available models.
Groq Customer Use Cases
Never miss a Groq update! Sign up below for our latest news.