Demystifying Meta's bet on new artificial intelligence weapons: two self-developed chips + supercomputing

Original: Tencent Technology

Over the past few years, Facebook parent company Meta has invested heavily in the metaverse, pouring resources into related hardware and software, arguably at the expense of keeping up with the latest trends in artificial intelligence. But as generative AI exploded, Meta appears to have reoriented itself and begun pushing hard into the field. On Thursday local time in the United States, Meta unveiled two self-developed AI chips and revealed the latest progress on its artificial intelligence supercomputer.

At a virtual event on Thursday, Meta showed off the internal infrastructure it has built for AI workloads, including support for running generative AI, the technology the company has integrated into its newly launched ad design and creation tools. The event was an attempt by Meta to project strength: the company had previously been slow to adopt AI-friendly hardware systems, undermining its ability to keep pace with rivals such as Google and Microsoft.

"Building our own hardware capabilities allows us to control every layer of the stack, from data center design to training frameworks," said Alexis Bjorling, Meta's vice president of infrastructure. "This level of vertical integration is necessary to move AI research forward."

Over the past decade or so, Meta has spent billions recruiting top data scientists and building new kinds of artificial intelligence, including the AI that now powers the discovery engines, moderation filters, and ad recommendations across its apps and services. But the company has struggled to turn many of its ambitious AI research innovations into products, especially when it comes to generative AI.

Until 2022, Meta ran its AI workloads on a combination of CPUs and custom chips designed to accelerate AI algorithms. But the company canceled the custom chip it had planned to roll out at scale in 2022, because the chip would have required a major redesign of several of its data centers; instead, it placed multibillion-dollar orders for Nvidia GPUs.

AI Accelerator Chip

To turn things around, Meta began developing a more ambitious in-house chip, due to launch in 2025, that can be used both to train artificial intelligence models and to run them.

Meta calls the new chip the Meta Training and Inference Accelerator, or MTIA for short, and describes it as a "chip family" for accelerating AI training and inference workloads. ("Inference" refers to running an already-trained model.) MTIA is an application-specific integrated circuit (ASIC), a chip that combines different circuits on a single die and can be programmed to carry out one or many tasks in parallel.
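The training/inference split that MTIA's name refers to can be illustrated with a toy model. This is a hypothetical sketch to clarify the terminology, not anything related to Meta's actual software stack: training repeatedly updates a model's weights from data, while inference only runs the frozen model forward.

```python
# Toy illustration of training vs. inference (hypothetical example):
# training updates model weights from data; inference runs the frozen model.

def train(samples, lr=0.1, epochs=100):
    """Fit a 1-D linear model y = w*x by gradient descent (the 'training' workload)."""
    w = 0.0
    for _ in range(epochs):
        for x, y in samples:
            pred = w * x
            w -= lr * (pred - y) * x  # step along the gradient of squared error
    return w

def infer(w, x):
    """Run the trained model forward (the 'inference' workload): no weight updates."""
    return w * x

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # points on the line y = 2x
w = train(data)
print(round(infer(w, 10.0)))  # w converges near 2.0, so this prints 20
```

MTIA v1 targets the second half of this loop: serving already-trained recommendation models, which is a far more latency-sensitive workload than training.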

Figure 1: AI chips customized for AI workloads

Björlin continued: "To achieve better efficiency and performance on our important workloads, we needed a custom solution co-designed with the model, the software stack and the system hardware, one that delivers a better experience."

Custom artificial intelligence chips are increasingly a staple of the big tech companies. Google developed its own processor, the TPU (Tensor Processing Unit), to train large generative AI systems such as PaLM 2 and Imagen. Amazon offers AWS customers proprietary chips for training (Trainium) and inference (Inferentia). And Microsoft is reportedly working with AMD on an in-house AI chip called "Athena."

Meta said it developed the first-generation MTIA (MTIA v1) in 2020, produced on a 7nm process. Its memory can scale from 128 MB up to 128 GB, and in benchmarks designed by Meta, the company claims MTIA handles "low-complexity" and "medium-complexity" AI models more efficiently than GPUs.

Meta said there is still much work to be done on the chip's memory and networking, both of which remain bottlenecks as AI models grow in size and workloads must be spread across multiple chips. (Relatedly, Meta recently acquired the Oslo-based AI networking team of British chip unicorn Graphcore.) For now, MTIA's focus is squarely on inference, not training, for the "recommendation workloads" across Meta's family of apps.

But Meta emphasized that MTIA has "greatly" improved the company's efficiency in running recommendation workloads, allowing it to run "more enhanced" and "cutting-edge" AI workloads.

AI Supercomputer

Perhaps one day Meta will hand most of its AI workloads over to MTIA. For now, though, the social networking giant is relying on its research-focused supercomputer, the Research SuperCluster.

The Research SuperCluster debuted in January 2022, assembled in partnership with Penguin Computing, Nvidia and Pure Storage, and has now completed its second phase of construction. It contains a total of 2,000 Nvidia DGX A100 systems with 16,000 Nvidia A100 GPUs, Meta said.
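The two figures Meta cites are consistent with Nvidia's published DGX A100 configuration, which packs eight A100 GPUs per system. A quick sanity check:

```python
# Sanity-check the Research SuperCluster figures: each Nvidia DGX A100
# system holds 8 A100 GPUs (per Nvidia's spec), so 2,000 systems account
# for the 16,000 GPUs Meta cites.
DGX_SYSTEMS = 2000
GPUS_PER_DGX = 8

total_gpus = DGX_SYSTEMS * GPUS_PER_DGX
print(total_gpus)  # 16000
```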

So why is Meta building a supercomputer in-house? First, there is pressure from the other tech giants. A few years ago, Microsoft hyped the AI supercomputer it built with OpenAI, and more recently it said it would work with Nvidia to build a new AI supercomputer in the Azure cloud. Google, meanwhile, has been touting its own AI supercomputer, which has 26,000 Nvidia H100 GPUs, far more than Meta's machine.

Figure 2: Meta's supercomputer for artificial intelligence research

But Meta says that in addition to keeping up with other peers, the Research SuperCluster allows its researchers to use real-world examples from Meta's system to train models. This differs from the company's previous AI infrastructure, which could only leverage open-source and publicly available datasets.

A Meta spokesperson said: "The Research SuperCluster is used to advance AI research in several areas, including generative AI. This really comes down to the productivity of AI research: providing researchers with state-of-the-art infrastructure on which to develop models, and a training platform with which to advance the state of AI."

At its peak, the Research SuperCluster can reach 5 exaflops of computing power, which Meta claims makes it one of the fastest machines in the world. Meta says it used the Research SuperCluster to train LLaMA, the large language model it made available to researchers earlier this year in a limited "closed release." The largest LLaMA model was trained across 2,048 A100 GPUs and took 21 days, Meta said.
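Those two training figures give a feel for the scale of compute involved. A back-of-the-envelope calculation from the numbers Meta cites:

```python
# Back-of-the-envelope scale of the LLaMA training run Meta describes:
# 2,048 A100 GPUs running continuously for 21 days.
gpus = 2048
days = 21

gpu_hours = gpus * days * 24  # 24 hours per day
print(gpu_hours)  # 1,032,192 GPU-hours
```

Over a million A100 GPU-hours for a single training run helps explain why Meta wants dedicated in-house infrastructure rather than renting capacity for every experiment.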

"The Research SuperCluster will help Meta's AI researchers build new and better AI models that can learn from trillions of examples, working across hundreds of different languages, seamlessly," said a Meta spokesperson. Analyzing text, images, and video, developing new augmented reality tools, and more."

Video Transcoder

In addition to MTIA, Meta is developing another chip to handle a specific type of computing workload. Called the Meta Scalable Video Processor, or MSVP for short, it is Meta's first in-house application-specific integrated circuit (ASIC) solution, designed specifically for the processing demands of video on demand and live streaming.

As some may recall, Meta began conceiving custom server-side video chips years ago, and in 2019 announced an ASIC for video transcoding and inference. MSVP is one of the fruits of these efforts, and a result of renewed competition in the streaming space.

"On Facebook alone, people spend 50% of their time watching videos. To serve the wide variety of devices around the world (mobile devices, laptops, TVs, and so on), videos uploaded to Facebook or Instagram are transcoded into multiple bitstreams with different encoding formats, resolutions and quality levels. MSVP is programmable and scalable, and can be configured to efficiently support both the high-quality transcoding needed for VOD and the low latency and faster processing times required for live streaming."
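The "multiple bitstreams" idea in the quote is the standard adaptive-bitrate (ABR) ladder: one upload is transcoded into several renditions so each device can pick a suitable stream. A minimal sketch of planning such a ladder; the resolutions, bitrates and codecs below are illustrative assumptions, not Meta's actual encoding settings:

```python
# Sketch of an adaptive-bitrate transcode plan: one uploaded video becomes
# a ladder of renditions at different resolutions/bitrates/codecs.
# Ladder values are hypothetical, not Meta's real encoding configuration.

LADDER = [
    # (height, bitrate_kbps, codec)
    (2160, 12000, "av1"),
    (1080, 4500, "h264"),
    (720, 2500, "h264"),
    (480, 1200, "h264"),
    (360, 700, "h264"),
]

def plan_renditions(source_height):
    """Return the renditions to produce for an upload; never upscale past the source."""
    return [(h, kbps, codec) for (h, kbps, codec) in LADDER if h <= source_height]

jobs = plan_renditions(1080)  # a 1080p upload
print(len(jobs))  # 4 renditions: 1080p, 720p, 480p, 360p
```

In software, every rendition is a separate encode pass; a fixed-function ASIC like MSVP exists to run many such passes per watt far more cheaply than general-purpose CPUs.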

Figure 3: Meta's custom silicon is designed to accelerate video workloads such as streaming and transcoding

Meta said the plan is to eventually offload most of its "stable and mature" video processing workloads to MSVP, using software video encoding only for workloads that require specific customization or "dramatically" higher quality. Meta added that work continues on using MSVP to improve video quality through pre-processing methods such as intelligent denoising and image enhancement, and post-processing methods such as artifact removal and super-resolution.

"In the future, MSVP will enable us to support more of Meta's most important use cases and requirements, including short videos, enabling efficient delivery of generative artificial intelligence, AR/VR and other virtual reality content," said Reddy and Chen Yunqing.

AI Focus

If there's one common thread among the latest hardware announcements, it's that Meta is desperately trying to accelerate the pace of AI development, especially when it comes to generative AI.

In February of this year, Meta CEO Mark Zuckerberg was said to have made boosting Meta's AI computing capacity a top priority, announcing a new top-level generative AI team that would, in his words, "turbocharge" the company's development. Meta CTO Andrew Bosworth also said recently that generative AI is the area where he and Zuckerberg are spending the most time. And according to Meta's chief scientist Yann LeCun, the company plans to deploy generative AI tools to create items in virtual reality.

In April, Zuckerberg said on Meta's first-quarter earnings call: "We're exploring chat experiences in WhatsApp and Messenger, visual creation tools for posts in Facebook and Instagram and ads, and over time video and multimodal experiences as well. I expect these tools will be valuable for everyone, from regular people to creators to businesses. For example, I expect a lot of interest in AI agents for business messaging and customer support will come once we nail that experience. Over time, this will extend to our work on the metaverse too, where people will much more easily be able to create avatars, objects, worlds, and the code to tie them all together."

In some ways, Meta is feeling mounting pressure from investors worried that the company is not moving fast enough to capture a piece of the potentially huge market for generative AI. It currently has no product that competes with chatbots like Bard, Bing Chat or ChatGPT, nor has it made much headway in image generation, another key area of explosive growth.

By some predictions, the total addressable market for generative AI software could reach $150 billion; US investment bank Goldman Sachs forecasts that generative AI could raise GDP by 7%.

If even some of those predictions come true, capturing a slice of that market could make up for the billions Meta has lost on its metaverse bets, including augmented reality headsets, conferencing software and metaverse technologies like Horizon Worlds. Reality Labs, Meta's AR/VR division, posted a net loss of $4 billion last quarter, and the company expects its operating losses to keep mounting throughout 2023.
