Microsoft Azure Maia 200 AI Accelerator Unveiled Using TSMC 3nm Process for Inference

Microsoft unveils Azure Maia 200, a custom 3nm AI accelerator designed for efficient inference, featuring 140 billion transistors and HBM3e memory.
Microsoft Unveils Azure Maia 200: A New Standard for AI Inference

Microsoft has officially introduced its latest custom-designed AI accelerator, the Azure Maia 200. Built on TSMC's advanced 3nm process technology, the new chip marks a major step forward in the company's in-house hardware program. Designed specifically for high-throughput AI inference, the Maia 200 is intended to lower the cost of AI token generation for businesses while consuming less power.

Technical Specifications and Architecture

The Maia 200 is built to handle the extensive memory demands of contemporary Large Language Models (LLMs), pairing substantial compute with high memory capacity and bandwidth on a single chip.

The chip packs over 140 billion transistors. It carries 216 GB of HBM3e memory delivering 7 TB/s of throughput, alongside 272 MB of high-speed on-chip SRAM. According to Microsoft, the chip achieves 10 petaflops of FP4 performance and 5 petaflops of FP8 performance, while the System-on-Chip (SoC) stays within a 750W TDP envelope.
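These headline figures invite a quick back-of-envelope check. The sketch below uses only the numbers quoted above to derive the compute-to-bandwidth balance point and the minimum time to stream the full HBM contents once (a rough bound for batch-1 LLM decoding); these are idealized ratios, not measured results.

```python
# Back-of-envelope roofline numbers from the published Maia 200 specs.
# The inputs come from the article; the derived quantities are simple
# ratios, not benchmarks.

FP4_FLOPS = 10e15        # 10 petaflops of FP4 compute
HBM_BANDWIDTH = 7e12     # 7 TB/s HBM3e throughput
HBM_CAPACITY = 216e9     # 216 GB of HBM3e

# Arithmetic intensity (FLOPs per byte) needed to keep the FP4 units busy.
balance_point = FP4_FLOPS / HBM_BANDWIDTH
print(f"Compute/bandwidth balance point: {balance_point:,.0f} FLOPs/byte")

# Lower bound on the time to stream the full memory contents once,
# e.g. reading an LLM's weights for a single batch-1 decode step.
sweep_time = HBM_CAPACITY / HBM_BANDWIDTH
print(f"Full-HBM sweep: {sweep_time * 1e3:.1f} ms "
      f"(~{1 / sweep_time:.0f} weight passes per second)")
```

The roughly 1,400 FLOPs/byte balance point underlines why low-bit formats and large batch sizes matter: memory bandwidth, not raw compute, is the binding constraint for most inference workloads.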

Microsoft positions the Maia 200 as the top-performing first-party silicon offered by any hyperscaler. According to the company's internal benchmarks:

  • The Maia 200 delivers three times the FP4 performance of Amazon's third-generation Trainium chip.
  • Its FP8 performance exceeds that of Google's seventh-generation TPU.
  • It delivers strong power efficiency while operating within its 750W TDP.
  • It offers 30% better performance per dollar than the previous-generation Maia 100, per Microsoft.

Strategic Deployment and Use Cases

Microsoft will deploy the Maia 200 internally within Azure and does not plan to sell it to external customers. The chip serves pre-trained models using low-bit floating-point formats (FP4 and FP8); a conceptual quantization sketch follows the list below. Key deployments include:

  • Running inference for OpenAI's latest GPT-5.2 models.
  • Powering Microsoft 365 Copilot and Microsoft Foundry workloads.
  • Generating synthetic data for the Microsoft Superintelligence team's reinforcement learning models.
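To illustrate what serving models in low-bit formats involves, here is a minimal post-training quantization sketch using PyTorch's generic float8_e4m3fn dtype. Microsoft's actual toolchain and its FP4 path are not public, so the dtype choice, per-tensor scaling scheme, and stand-in weight matrix are illustrative assumptions only.

```python
import torch

# Minimal sketch of post-training weight quantization to FP8, the kind of
# low-bit format the Maia 200 is built around. Uses PyTorch's generic
# float8_e4m3fn dtype for illustration; Maia's real toolchain is not public.

torch.manual_seed(0)
weights = torch.randn(4096, 4096)  # a stand-in weight matrix

# Per-tensor scale so values fit the FP8 E4M3 range (max magnitude ~448).
scale = weights.abs().max() / 448.0
w_fp8 = (weights / scale).to(torch.float8_e4m3fn)

# Dequantize and measure the error introduced by the 8-bit format.
w_restored = w_fp8.to(torch.float32) * scale
rel_error = (weights - w_restored).norm() / weights.norm()
print(f"Relative quantization error: {rel_error:.4f}")
```

The payoff of dropping to FP8 (and further to FP4) is halving or quartering both the memory footprint and the bandwidth needed per token, at the cost of a small, usually tolerable, loss of precision.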
Infrastructure and Networking

To support these accelerators, Microsoft has built a custom two-tier scale-up network on standard Ethernet technology. Each accelerator gets 2.8 TB/s of dedicated bidirectional bandwidth, and the fabric supports dense inference clusters of up to 6,144 accelerators while maintaining high reliability.
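As a rough sense of what those figures imply at cluster scale, the snippet below multiplies the published per-accelerator bandwidth by the maximum cluster size. This is idealized peak injection bandwidth derived from the article's numbers, not a figure Microsoft has quoted.

```python
# Aggregate-bandwidth arithmetic for the scale-up fabric described above.
# Idealized peak figures derived from published specs, not measured throughput.

PER_ACCELERATOR_BW = 2.8e12   # 2.8 TB/s bidirectional per accelerator
CLUSTER_SIZE = 6144           # accelerators per dense inference cluster

aggregate_bw = PER_ACCELERATOR_BW * CLUSTER_SIZE
print(f"Peak aggregate scale-up bandwidth: {aggregate_bw / 1e15:.1f} PB/s")
# -> ~17.2 PB/s of injection bandwidth across the full cluster
```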

Microsoft has deployed the Maia 200 in its US Central datacenter in Iowa, with expansion planned for the US West 3 region in Arizona. For developers, the Maia SDK provides PyTorch integration and a Triton compiler for optimizing models for the new hardware.
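Microsoft has not published the Maia SDK's API, but the mention of a Triton compiler suggests developers can bring standard Triton kernels to the platform. As a hedged illustration, here is the canonical Triton vector-add kernel written against today's public Triton and PyTorch APIs on a GPU backend; how such kernels map onto Maia hardware is an assumption on our part.

```python
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one BLOCK_SIZE-wide slice of the tensors.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements  # guard the tail of the tensor
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n = x.numel()
    grid = lambda meta: (triton.cdiv(n, meta["BLOCK_SIZE"]),)
    add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
    return out

# Usage (requires a GPU backend today; Maia support would come via the SDK):
x = torch.randn(10_000, device="cuda")
y = torch.randn(10_000, device="cuda")
print(torch.allclose(add(x, y), x + y))
```

The appeal of the Triton route is portability: kernels written at this level of abstraction can, in principle, be retargeted by a vendor compiler without rewriting them in hardware-specific assembly.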
