
Microsoft has introduced a new image-generation model, MAI-Image-2, designed to create highly realistic visuals from text prompts and aimed at creative workflows.
Built for photorealism
MAI-Image-2 is optimized for producing images with:
- Natural lighting
- Accurate skin tones
- Realistic, lived-in environments
Microsoft says the model was developed with input from photographers, designers, and visual creators to better match real-world creative needs.
A key advantage: readable text
One of the model's standout features is its ability to render accurate, legible text within images, a long-standing challenge for AI image tools. This makes it especially useful for:
- Infographics
- Presentations
- Diagrams
Beyond realism
While photorealism is a core strength, MAI-Image-2 can also handle:
- Surreal concepts
- Complex compositions
- Detailed fantasy environments
Rolling out now
The model is currently being integrated into Microsoft Copilot and Bing Image Creator, with broader experimentation available via Microsoft’s MAI Playground.
Climbing the ranks
According to Microsoft, MAI-Image-2 now ranks among the top three text-to-image models on the Arena.ai leaderboard, a sign of how competitive the fast-moving AI image space has become.

