2023-05-19

Meta is making significant strides in artificial intelligence (AI) with the development of a new chip specifically designed for running AI models. The company, known for its family of apps and its vision of the metaverse, is embarking on an ambitious plan to build the next generation of AI infrastructure.

Meta's VP & Head of Infrastructure, Santosh Janardhan, shared details of the company's progress in a recent blog post. Alongside the custom silicon chip for AI models, Meta is also working on an AI-optimized data center design and the second phase of a 16,000-GPU supercomputer dedicated to AI research. These initiatives aim to facilitate the development and deployment of larger and more sophisticated AI models at scale.

Meta says that AI already plays a pivotal role in its products, contributing to better personalization, safer and fairer products, and enhanced user experiences. The company is also changing how its engineers write code with the deployment of CodeCompose, a generative AI-based coding assistant developed to boost developer productivity throughout the software development lifecycle.

The centerpiece of Meta's infrastructure advancements is MTIA (Meta Training and Inference Accelerator), its in-house custom accelerator chip family. According to Meta, MTIA is designed specifically for inference workloads and offers greater compute power and efficiency than CPUs. The company says that deploying MTIA chips alongside GPUs will deliver better performance, lower latency, and greater efficiency for each workload.

In addition, Meta is developing a data center that will support existing products while accommodating future generations of AI hardware. This AI-optimized design will feature liquid-cooled AI hardware and a high-performance AI network, connecting thousands of AI chips to create data center-scale AI training clusters. Meta claims that the new data center will be faster and more cost-effective to build.

Meta's Research SuperCluster (RSC) AI Supercomputer boasts 16,000 GPUs. The company claims it is one of the world's fastest AI supercomputers. The RSC was constructed to train large AI models that power augmented reality tools, content understanding systems, real-time translation technology, and more.

Meta's ability to custom-design its infrastructure from the physical to the virtual layer, including data centers, server hardware, and mechanical systems, allows for an optimized end-to-end experience. This control over the entire stack enables customization to suit specific needs, such as co-locating GPUs, CPUs, and storage to support workloads efficiently. It also allows the company to rethink power and cooling solutions as part of a cohesive system.

