Microsoft Launches MAI Family at Build with In-house Reasoning Model to Reduce Dependency on OpenAI

MAI-Thinking-1 has 35 billion active parameters and a 256,000-token context window, and was trained without distilling third-party models, a move Mustafa Suleyman described as true self-sufficiency.
Mustafa Suleyman took the stage at Build 2026 on June 2 with seven proprietary Microsoft models and a thesis he termed true self-sufficiency: a structured reduction of reliance on OpenAI for the generative AI stack that currently underpins Copilot, Azure AI Foundry, and part of GitHub. The flagship is MAI-Thinking-1, the company’s first in-house reasoning model. It uses a sparse Mixture of Experts architecture with 35 billion active parameters out of approximately one trillion total, and has a context window of 256,000 tokens. Suleyman reiterated one detail twice: the model was trained from scratch, without distilling third-party models.
The choice of clean training is strategic before it is technical. Suleyman stated that independent evaluators preferred MAI-Thinking-1 over Anthropic’s Claude Sonnet 4.6 in blind side-by-side tests, and that the model performs comparably to Opus 4.6 on SWE-Bench Pro, a software engineering benchmark. Microsoft also reported a score exceeding 94% on AIME 2026 for mathematical reasoning. However, no independent verification of these results has been published with complete raw data, and analysts have urged caution before taking the benchmark figures at face value.
Seven Models, Three Fronts
In addition to MAI-Thinking-1, Microsoft released MAI-Code-1-Flash, a coding model with 5 billion parameters that is rolling out across all plans of GitHub Copilot and Visual Studio Code, and MAI-Image-2.5, which combines text-to-image and image-to-image capabilities and ranks second on the Arena AI editing leaderboard, ahead of Google's Nano Banana Pro. The remaining four models in the lineup cover voice, light multimodality, and flash variants for low-latency inference. Along with the new models, the company introduced Project Solara, a platform for agent-based systems, and provided details on the Maia 200 accelerator running on a supercomputing fabric named Fairwater.
Microsoft's thesis has become horizontal: proprietary models for customers who want clear data lineage and training free of cross-licensing dependencies, alongside OpenAI for those who prefer a cost-free frontier. In an interview with Semafor, Suleyman described the initiative as the greatest game of catchup ever played, acknowledging that Microsoft is working to close a capacity deficit in proprietary models that has built up since 2019.
For CIOs, the Issue Is Contractual Risk
For enterprise customers that have signed Enterprise Agreements with embedded Copilot, the decision about which model serves the tokens shifts from a technical choice to a procurement negotiation. Microsoft now offers a path for companies that need to demonstrate the provenance of training data, a demand sharpened by the requirements of the EU AI Act and by internal audits at major banks. Regulated European clients have been requesting models with auditable lineage since the beginning of the year, and MAI-Thinking-1 fits precisely into this category.
Independence, however, comes at a portfolio cost. Microsoft remains an Azure client of its own OpenAI for GPT inference and has financial stakes in the lab. Replacing GPT with MAI in Copilot reduces unit cost but compresses the margin on the cross contract. Anthropic, which last week filed a confidential S-1 registration with the SEC at a valuation of $965 billion, sees Microsoft simultaneously as an Azure client and a direct competitor for Copilot subscription budgets.
Where the Impact Lands
For offshore software factories in Bengaluru, Pune, and Hyderabad, MAI-Code-1-Flash integrated into GitHub Copilot accelerates the transition to automated code review that TCS, Infosys, and Wipro have been designing since the second half of 2025. In delivery hubs in Poland and the Philippines, the effect is the same: low-complexity coding tasks shift to the model, and the associate-level headcount that relies on this type of work loses its contractual justification faster. For clients in the United States and Germany, the picture is different: productivity gains flow directly into the Copilot Enterprise contract, without a redistributive conflict over the delivery centre.
The price of access to MAI-Thinking-1 via Azure was not disclosed on stage at Build, but analysts' near-term estimates point to a cost per million tokens below that of Claude Sonnet 4.6 and GPT-5.1, which would make the transition economically rational for companies with high inference volumes. Microsoft took five years to move away from being an exclusive reseller of OpenAI in the enterprise segment, and Build 2026 marks the first event at which the company speaks unambiguously as a publisher of proprietary models. The lingering question for CIOs is whether Azure's infrastructure can absorb, in parallel, the inference from GPT, Claude via Bedrock-Azure cross-stack, and the new MAI family without compromising SLAs.