Six months after renegotiating a contract that once restricted its independent AI development, Microsoft has launched three proprietary models that compete with those of its $13 billion partner. MAI-Transcribe-1, MAI-Voice-1, and MAI-Image-2 are available in Microsoft Foundry, and none of them carries OpenAI’s label.
These models are the first public releases from the MAI Superintelligence team, which Microsoft AI CEO Mustafa Suleyman formed in November 2025 with the aim of building “humanist superintelligence.” A March internal memo, reported by Business Insider, detailed Suleyman’s focus on superintelligence and Microsoft’s five-year ambition to build world-class models. The three releases are the first concrete outcomes of that plan.
MAI-Transcribe-1 appears to be the most disruptive of the three, posting the lowest word error rate across 25 languages on the FLEURS benchmark, at 3.8%. Microsoft claims it outperforms OpenAI’s Whisper-large-v3 on all 25 languages, Google’s Gemini 3.1 Flash on 22 of them, and ElevenLabs’ Scribe v2 on 15. It processes audio 2.5 times faster than Azure Fast and is priced at $0.36 per hour. Notably, it was developed by a team of just 10 people.
MAI-Voice-1 is a text-to-speech model that generates 60 seconds of audio in less than a second on a single GPU and supports custom voice creation from minimal audio input. Paired with MAI-Transcribe-1 and a language model of the customer’s choice, it forms a complete voice solution that runs on Microsoft’s infrastructure, independent of OpenAI.
MAI-Image-2, the oldest of the three models, reached number three on the Arena.ai text-to-image leaderboard in March, trailing Google’s Gemini 3.1 Flash and OpenAI’s GPT Image 1.5. Developed with input from photographers, designers, and visual storytellers, it is already in use at WPP, a leading marketing group.
Strategically, the context matters more than the benchmarks. Until September 2025, the terms of the Microsoft-OpenAI partnership precluded independent AI development. A revised memorandum changed that, granting Microsoft license rights to OpenAI’s technology through 2032, $250 billion in Azure commitments, and the freedom to develop competing models. Suleyman has said this shift is what made it possible to pursue superintelligence independently.
The timing also aligns with a change in Suleyman’s role. Jacob Andreou’s appointment as Copilot EVP on March 17 freed Suleyman from product duties, and the MAI models debuted weeks later. In addition, Ali Farhadi, former CEO of the Allen Institute for AI, joined Suleyman’s team in March, a sign that its ambitions extend beyond transcription and image generation.
For OpenAI, the dynamic is complicated. Microsoft remains its largest investor and primary cloud provider, yet the two companies’ models now sit side by side in Foundry, which hosts both. As OpenAI’s push to monetize commercially accelerates, the relationship looks less like a defined partnership and more like two players in overlapping markets. OpenAI’s $110 billion raise in February, backed by SoftBank, Nvidia, and Amazon, values the company on its own terms, whatever older views of the partnership might suggest.
The AI model market is fragmenting along these lines. Anthropic’s $30 billion raise at a $380 billion valuation positions it as a significant force in enterprise AI, with $14 billion in revenue. Google continues to iterate on Gemini. OpenAI’s once-dominant position in frontier AI is being challenged, and with it Microsoft’s role as sole distributor is coming to an end.
Microsoft Foundry, previously known as Azure AI Foundry and Azure AI Studio, now serves over 80,000 enterprises, including 80% of the Fortune 500. This reach is what makes the MAI model family significant. Microsoft does not need to surpass OpenAI on every metric to sway enterprise spending; it only needs to be competitive enough that customers prefer integrated options over third-party ones, an outcome that looks increasingly plausible given recent consolidation in AI.
Suleyman has indicated that frontier-class language models may be another year or two away. What has been established is a foundation: a toolkit that gives Microsoft a voice, ears, and eyes of its own, independent of OpenAI. The $13 billion partnership isn’t ending. However, the founding premise that Microsoft depends on OpenAI to compete in AI is being quietly dismantled with each model release.
