Cerebras Systems unveils world's fastest AI chip with whopping 4 trillion transistors
Published on: Saturday 16-03-2024
Third generation 5nm Wafer Scale Engine (WSE-3) powers industry’s most scalable AI supercomputers, up to 256 exaFLOPs via 2048 nodes.
Bangalore, March 15, 2024 – Cerebras Systems, the pioneer in accelerating generative AI, has doubled its own world record for the fastest AI chip with the introduction of the Wafer Scale Engine 3. The WSE-3 delivers twice the performance of the previous record-holder, the Cerebras WSE-2, at the same power draw and for the same price. Purpose-built for training the industry's largest AI models, the 5nm-based, 4 trillion transistor WSE-3 powers the Cerebras CS-3 AI supercomputer, delivering 125 petaflops of peak AI performance through 900,000 AI-optimised compute cores.
Key Specs:
· 4 trillion transistors
· 900,000 AI cores
· 125 petaflops of peak AI performance
· 44GB on-chip SRAM
· 5nm TSMC process
· External memory: 1.5TB, 12TB, or 1.2PB
· Trains AI models of up to 24 trillion parameters
· Cluster size of up to 2048 CS-3 systems
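The 256 exaFLOPs cluster figure in the subhead follows directly from these specs. A quick sanity check in plain Python (illustrative arithmetic only, using the numbers stated above):

```python
# Peak AI performance of a single CS-3 system, per the spec list.
per_system_petaflops = 125
max_cluster_systems = 2048

# Aggregate peak across a maximal 2048-system cluster.
total_petaflops = per_system_petaflops * max_cluster_systems
total_exaflops = total_petaflops / 1000  # 1 exaFLOP = 1000 petaflops

print(total_exaflops)  # 256.0 -- matches the "up to 256 exaFLOPs" claim
```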
With a huge memory system of up to 1.2 petabytes, the CS-3 is designed to train next-generation frontier models 10x larger than GPT-4 and Gemini. Models of 24 trillion parameters can be stored in a single logical memory space without partitioning or refactoring, dramatically simplifying training workflows and accelerating developer productivity. Training a one-trillion-parameter model on the CS-3 is as straightforward as training a one-billion-parameter model on GPUs.
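A back-of-envelope estimate shows why 1.2 PB comfortably holds a 24-trillion-parameter model's weights in one logical space. This assumes FP16 weights (2 bytes per parameter), an assumption of ours rather than a published Cerebras figure:

```python
params = 24e12        # 24 trillion parameters
bytes_per_param = 2   # FP16 weights (assumed precision, not a Cerebras spec)

weight_bytes = params * bytes_per_param
weight_tb = weight_bytes / 1e12
print(weight_tb)      # 48.0 TB of raw weights

external_memory_tb = 1.2e15 / 1e12  # the 1.2 PB memory option, in TB
print(weight_tb < external_memory_tb)  # True -- fits with ample headroom
```

Even allowing several times more memory per parameter for optimizer state and activations, the total stays well under the 1.2 PB ceiling.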
The CS-3 is built for both enterprise and hyperscale needs. Compact four-system configurations can fine-tune 70B models in a day, while at full scale, using 2048 systems, Llama 70B can be trained from scratch in a single day – an unprecedented feat for generative AI.
The latest Cerebras Software Framework provides native support for PyTorch 2.0 and the latest AI models and techniques such as multi-modal models, vision transformers, mixture of experts, and diffusion. Cerebras remains the only platform that provides native hardware acceleration for dynamic and unstructured sparsity, speeding up training by up to 8x.
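The intuition behind the sparsity speedup: a zero weight contributes nothing to a dot product, so hardware that skips zeros at the multiply-accumulate level avoids that work entirely. The sketch below counts those savings in pure Python; it is a generic illustration of unstructured sparsity, not the Cerebras API, and `sparse_matvec` is a hypothetical helper of ours:

```python
import random

def sparse_matvec(weights, x):
    """Matrix-vector product that skips zero weights, counting the
    multiply-accumulates (MACs) actually performed. Hardware with native
    unstructured-sparsity support does this skipping in silicon."""
    out, macs = [], 0
    for row in weights:
        acc = 0.0
        for w, xi in zip(row, x):
            if w != 0.0:      # zero weight: no MAC needed
                acc += w * xi
                macs += 1
        out.append(acc)
    return out, macs

random.seed(0)
n = 64
# Zero out ~87.5% of weights, so roughly 1 in 8 survive.
weights = [[random.gauss(0, 1) if random.random() < 0.125 else 0.0
            for _ in range(n)] for _ in range(n)]
x = [random.gauss(0, 1) for _ in range(n)]

_, macs = sparse_matvec(weights, x)
dense_macs = n * n
print(dense_macs / macs)  # roughly 8x fewer MACs than the dense product
```

The measured ratio mirrors the "up to 8x" figure: the speedup scales with the fraction of weights that are zero, which is why dynamic sparsity support pays off most for heavily sparsified models.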
“When we started on this journey eight years ago, everyone said wafer-scale processors were a pipe dream. We could not be more proud to be introducing the third generation of our groundbreaking wafer-scale AI chip,” said Andrew Feldman, CEO and co-founder of Cerebras.
“WSE-3 is the fastest AI chip in the world, purpose-built for the latest cutting-edge AI work, from mixture-of-experts to 24-trillion-parameter models. We are thrilled to bring WSE-3 and CS-3 to market to help solve today’s biggest AI challenges.”
With every component optimised for AI work, the CS-3 delivers more compute performance in less space and with less power than any other system. While GPU power consumption is doubling generation to generation, the CS-3 doubles performance while staying within the same power envelope. The CS-3 offers superior ease of use, requiring 97% less code than GPUs for LLMs, and can train models ranging from 1B to 24T parameters in purely data-parallel mode. A standard implementation of a GPT-3 sized model requires just 565 lines of code on Cerebras – an industry record.
______________________________________________________________________________________________
For a deeper dive into the dynamic world of Industrial Automation and Robotic Process Automation (RPA), explore our comprehensive collection of articles and news covering cutting-edge technologies, robotics, PLC programming, SCADA systems, and the latest advancements in the Industrial Automation realm. Uncover valuable insights and stay abreast of industry trends by delving into the rest of our articles on Industrial Automation and RPA at www.industrialautomationindia.in