Groq TSP Leads in Inference Performance

Written by
John Barrus

Today the Linley Group released its latest Microprocessor Report titled “Groq Rocks Neural Networks”, which concludes that Groq’s “TSP stands out in both peak performance and ResNet-50 throughput,” and that “Groq’s [deep-learning] accelerator is the fastest available on the merchant market.” The Linley Group’s report provides the most detailed overview of the novel Groq architecture available to date. You can download a copy of the Microprocessor Report below.

In the few weeks since our interview with the Linley Group, we’ve been able to improve the performance of our ResNet-50 v2 implementation. The TSP can now reach 21,700 IPS (core compute) for ResNet-50 running at 900 MHz. Groq hardware clocks in at 18,900 IPS on real data, including I/O, with a latency of 0.05 ms at batch size 1. This level of inference performance exceeds that of other commercially available neural network architectures, with throughput more than double the ResNet-50 score of the incumbent GPU-based architecture. For real-time workloads that are sensitive to response time and rely on small batches, the TSP’s batch-size-1 performance is up to 17x faster than competing architectures.
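
As a rough sanity check on these figures, the following Python sketch relates batch-size-1 latency to throughput and compares the core-compute number with the real-data number. It assumes inferences run strictly one after another with no overlap, which is a simplification and not a description of how the TSP schedules work. The function names are illustrative only.

```python
def sequential_throughput_ips(latency_ms: float, batch_size: int = 1) -> float:
    """Inferences per second if batches run back to back with no overlap."""
    return batch_size / (latency_ms / 1000.0)


def io_efficiency(real_data_ips: float, core_compute_ips: float) -> float:
    """Fraction of core-compute throughput retained once I/O is included."""
    return real_data_ips / core_compute_ips


# Figures quoted in the article.
CORE_COMPUTE_IPS = 21_700
REAL_DATA_IPS = 18_900
LATENCY_MS = 0.05

implied_ips = sequential_throughput_ips(LATENCY_MS)
retained = io_efficiency(REAL_DATA_IPS, CORE_COMPUTE_IPS)

print(f"Implied by latency: {implied_ips:,.0f} IPS")
print(f"Retained with I/O: {retained:.1%}")
```

Under that back-to-back assumption, a 0.05 ms latency works out to roughly 20,000 IPS, and the real-data figure is about 87% of the core-compute figure. Because the quoted latency is rounded, treat both as approximations.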

With the Groq architecture providing a substantial performance advantage over GPU-based solutions, engineering managers can deploy machine learning platforms that offer twice the inference performance without doubling infrastructure costs. Reducing the number of deployed systems will lower power usage, save datacenter space, and significantly decrease system complexity.
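
To make the consolidation argument concrete, here is a minimal sizing sketch in Python. The target load of 100,000 IPS is a hypothetical number chosen for illustration, and the baseline per-device throughput is simply assumed to be half the Groq real-data figure, echoing the "more than double" claim. Neither value is a measurement.

```python
import math


def devices_needed(target_ips: float, per_device_ips: float) -> int:
    """Smallest number of devices whose combined throughput meets the target."""
    if target_ips <= 0 or per_device_ips <= 0:
        raise ValueError("throughput values must be positive")
    return math.ceil(target_ips / per_device_ips)


TARGET_IPS = 100_000          # hypothetical service load
GROQ_IPS = 18_900             # real-data figure quoted in the article
BASELINE_IPS = GROQ_IPS / 2   # assumption: half the Groq figure

groq_count = devices_needed(TARGET_IPS, GROQ_IPS)
baseline_count = devices_needed(TARGET_IPS, BASELINE_IPS)

print(f"Groq devices: {groq_count}")
print(f"Baseline devices: {baseline_count}")
```

With these assumed inputs the sketch gives 6 devices versus 11. A real deployment would also need to account for redundancy, peak load, and per-system cost, none of which is modeled here.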

ResNet-50 is a convolutional neural network for image classification that is widely used as an inference benchmark. ResNet-50 v1.5 is part of the MLPerf suite of benchmarks for measuring the performance of machine learning accelerators.