Microsoft has unveiled its newest in-house AI chip, the Maia 200, describing it as a silicon “workhorse” built to scale AI inference efficiently.

The Maia 200 succeeds the Maia 100, which debuted in 2023, and brings a significant leap in performance. According to Microsoft, the chip is engineered to run advanced AI models faster while consuming less power. It packs more than 100 billion transistors and delivers over 10 petaflops of performance at 4-bit precision, along with roughly 5 petaflops at 8-bit — a major upgrade over the previous generation.
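Those figures also illustrate why chip makers lean on lower precision for inference: halving the bit width roughly doubles peak throughput. As a rough sketch (not a benchmark), the snippet below estimates ideal per-token compute time on one chip at each precision, assuming a hypothetical 500-billion-parameter dense model, perfect hardware utilization, and the common rule of thumb of about 2 FLOPs per parameter per generated token; every one of those assumptions is illustrative, not from Microsoft.

```python
# Back-of-the-envelope sketch only. The peak-throughput numbers are the ones
# quoted in the announcement; the model size, the 2-FLOPs-per-parameter rule
# of thumb, and the perfect-utilization assumption are all illustrative.

PEAK_FLOPS_4BIT = 10e15  # >10 petaflops at 4-bit precision (quoted figure)
PEAK_FLOPS_8BIT = 5e15   # ~5 petaflops at 8-bit precision (quoted figure)

params = 500e9                 # hypothetical 500B-parameter dense model
flops_per_token = 2 * params   # ~2 FLOPs per parameter per generated token

t4 = flops_per_token / PEAK_FLOPS_4BIT  # ideal seconds per token at 4-bit
t8 = flops_per_token / PEAK_FLOPS_8BIT  # ideal seconds per token at 8-bit

print(f"4-bit: {t4 * 1e6:.0f} us/token, 8-bit: {t8 * 1e6:.0f} us/token")
# -> 4-bit: 100 us/token, 8-bit: 200 us/token
```

Real-world throughput is far below these ideal numbers (memory bandwidth, batching, and interconnect all intervene), but the 2x gap between the two precisions is the point: serving at 4-bit instead of 8-bit halves the compute bill per token, which is why inference efficiency has become a hardware design target.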

Inference, the process of running a trained AI model to produce outputs, is distinct from the one-time, compute-intensive task of training it. As AI businesses scale and mature, inference has become a growing share of total operating costs. That shift has intensified efforts across the industry to make inference more efficient and cost-effective.

Microsoft is positioning the Maia 200 as part of that solution. The company says the chip can help AI systems operate with less disruption and reduced energy consumption. “In practical terms, one Maia 200 node can effortlessly run today’s largest models, with plenty of headroom for even bigger models in the future,” Microsoft noted.

The launch also reflects a broader industry trend: major tech firms designing their own AI chips to reduce reliance on Nvidia, whose high-end GPUs have become central, and costly, to AI development. Google, for example, uses its custom Tensor Processing Units (TPUs), which are offered via its cloud rather than sold as standalone chips. Amazon has taken a similar path with its Trainium AI accelerators, rolling out its latest version, Trainium3, in December.

In each case, these custom chips allow companies to offload portions of AI workloads that would otherwise run on Nvidia hardware, helping to control costs and reduce dependence on external suppliers — a strategy Microsoft is now doubling down on with the Maia 200.
