
Year of the Compiler

Written by:
Jeremy Fowers

I’ve had an eye-opening month and I have three stories to share. Groq recently kicked off the release of our “early adopter” SDK, marking Groq Compiler as the primary means of programming GroqChip™ accelerators.

Now, I’ve built my career on kernel optimization – meticulously coding important workloads specifically for targeted hardware. Sometimes Verilog has felt a little too high level for my needs. While I’m happy to attend events here like Groq-a-thon – an all-day Groq hackathon – to try out the early adopter SDK, I’m also confident my habits and experience will send me back to my kernel optimizing ways the next morning.


But change is in the air. First, we get a drop of over 100 LSTM-based models from a customer. Before I can so much as load up VS Code, my teammate Lev has kicked off a distributed cluster to run Groq Compiler on the entire drop.

Not only do some compiled programs beat my hand-coded benchmark, but the whole set comes in at an average of 16x speedup over Nvidia A100 and redefines what is possible within the customer’s latency requirement. No kernel engineering needed here.

The next week, another customer sends us a model that mixes LSTM and Transformer layers. I could code this by hand, but I give Groq Compiler first crack at the problem. After a little slicing and dicing of the ONNX file – this compiler is good but not perfect yet – I am looking at a result that offers over 100x speedup compared to the reference implementation. 


Finally, my teammate Chetan asks me if we should try supporting his favorite Transformer, ELECTRA. I'd never heard of it before, and nobody at Groq had worked on it yet, but I may as well take a look. ELECTRA is an improved version of BERT that changes up the hyperparameters and adds a projection layer at the top. These differences sound benign, but it could still take considerable effort to adapt an optimized handwritten BERT into an optimized ELECTRA.

So I put it through Groq Compiler, and it just worked. Same performance advantage as our BERT.

This week a friend asked me if I’m worried about my skill set being obsolete. I told him no, hung up the phone, and went merrily back to learning PyTorch.

Interested in seeing the Groq Compiler in action? Reach out to [email protected] to learn how you can participate in the early adopter program.

Header image credit: Photo by Behnam Norouzi on Unsplash

Image one credit: Photo by Yuichi Kageyama on Unsplash 

Image two credit: Photo by charlesdeluvio on Unsplash
