Tuesday, April 23, 2024

Facebook parent Meta announces the world’s fastest AI supercomputer


Meta began designing the computing infrastructure “from a clean slate” in early 2020.

[Image: Facebook AI Research SuperCluster]

What you need to know

  • Meta has announced the AI Research SuperCluster (RSC), which it claims is among the fastest AI supercomputers in the world.
  • The RSC will help accelerate the company’s AI research and help build better AI models for the metaverse.
  • It is expected to be fully built out in mid-2022.

Facebook parent Meta on January 24 unveiled the AI Research SuperCluster (RSC), which it says is among the fastest AI supercomputers on the planet right now. Once it is fully built out in mid-2022, Meta claims it will be the fastest in the world.

RSC will help AI researchers at Meta develop better AI models capable of learning from trillions of examples, working across hundreds of languages, creating new AR tools, and more. More importantly, RSC will allow the company to make significant strides in AI-driven applications for building the metaverse.

Meta said in its announcement: "With RSC, we can more quickly train models that use multimodal signals to determine whether an action, sound or image is harmful or benign. This research will not only help keep people safe on our services today, but also in the future, as we build for the metaverse."

RSC uses a total of 760 NVIDIA DGX A100 systems as compute nodes, for a total of 6,080 GPUs. RSC's storage tier features 175 petabytes of Pure Storage FlashArray, 46 petabytes of cache storage, and 10 petabytes of Pure Storage FlashBlade.
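The node and GPU counts are consistent: each DGX A100 system contains eight A100 GPUs (a spec of the DGX A100 product, not stated in the article), so a quick sanity check:

```python
# Sanity check: 760 DGX A100 nodes at 8 GPUs per node (DGX A100 spec)
# should account for the quoted 6,080-GPU total.
nodes = 760
gpus_per_node = 8  # A100 GPUs per DGX A100 system
total_gpus = nodes * gpus_per_node
print(total_gpus)  # 6080
```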

[Image: Meta AI Research SuperCluster, phase 1]

Meta plans to increase the number of GPUs to 16,000 by the end of the year, which should deliver a more than 2.5x improvement in AI training performance. The storage system, meanwhile, is targeting a delivery bandwidth of 16 TB/s and exabyte-scale capacity.
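The planned GPU count lines up with the performance claim. A back-of-envelope check, assuming (as a simplification) that training throughput scales roughly linearly with GPU count:

```python
# Ratio of planned to current GPU count, as a rough proxy for
# the expected training-performance improvement.
current_gpus = 6_080   # phase 1
planned_gpus = 16_000  # end-of-year target
scale_factor = planned_gpus / current_gpus
print(round(scale_factor, 2))  # 2.63 -- consistent with "over 2.5x"
```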

Early benchmarks suggest RSC can run computer vision workflows up to 20 times faster than Meta's previous systems. A model with tens of billions of parameters that took nine weeks to train on the company's previous system now finishes training in three weeks on RSC.
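The quoted training times imply a roughly 3x speedup for that particular model (the 20x figure applies specifically to computer vision workflows):

```python
# Speedup implied by the quoted training times for the
# tens-of-billions-of-parameters model.
weeks_before = 9  # previous system
weeks_after = 3   # RSC
speedup = weeks_before / weeks_after
print(speedup)  # 3.0
```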

Meta says training “increasingly large and complex models” is required to fully unlock the benefits of advanced AI for use cases such as identifying harmful content on various social media platforms that it owns. RSC is claimed to be capable of training models with data sets as large as an exabyte, which is the equivalent of 36,000 years of high-quality video.
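Meta's exabyte-to-video conversion can be back-checked. The implied average bitrate (about 7 Mbit/s, a reasonable rate for high-quality HD video) is an inference, not a figure from the article:

```python
# What average video bitrate makes 1 exabyte equal 36,000 years of footage?
EXABYTE = 10**18                     # bytes (decimal definition)
seconds = 36_000 * 365.25 * 86_400   # 36,000 years in seconds
bitrate_mbps = EXABYTE / seconds * 8 / 1e6
print(round(bitrate_mbps, 1))  # 7.0 -- plausible for high-quality video
```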
