Microsoft unveils AI system that diagnoses complex medical cases with 85% accuracy
Microsoft’s AI research team has developed a new system that outperforms experienced physicians in solving the most challenging medical cases from the New England Journal of Medicine.

- Jul 25, 2025
- Updated Jul 25, 2025 9:54 AM IST
Microsoft’s AI division has introduced a groundbreaking diagnostic tool that could reshape the future of clinical medicine. The new system, called the Microsoft AI Diagnostic Orchestrator (MAI-DxO), has demonstrated a remarkable ability to solve complex diagnostic cases, matching or exceeding the performance of experienced physicians in both accuracy and cost-efficiency.
Tested against 304 real-world patient cases published by the New England Journal of Medicine (NEJM), the AI system successfully diagnosed up to 85.5% of the cases. By contrast, a group of 21 physicians from the US and UK, each with 5 to 20 years of experience, achieved a 20% success rate. MAI-DxO also operated at a lower cost, simulating more efficient resource use in diagnostic testing.
“These cases are among the most diagnostically complex and intellectually demanding in clinical medicine,” Microsoft noted, highlighting the scale of the achievement.
The AI system was evaluated on what Microsoft terms the Sequential Diagnosis Benchmark (SD Bench), which simulates how a clinician would investigate a case in real life. Within the benchmark, the system asks questions, orders tests, and updates its reasoning at each step until it arrives at a final diagnosis.
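The step-by-step process described above can be pictured as a simple loop. The sketch below is purely illustrative and is not Microsoft's implementation: the `Case` and `Session` classes, the `toy_agent` rule set, and the finding names are all hypothetical, chosen only to show the pattern of requesting information, updating what is known, and committing to a diagnosis.

```python
from dataclasses import dataclass, field


@dataclass
class Case:
    # Hypothetical toy case: findings stay hidden until the agent asks for them.
    hidden_findings: dict
    true_diagnosis: str


@dataclass
class Session:
    case: Case
    revealed: dict = field(default_factory=dict)
    steps: int = 0

    def request(self, item):
        """Reveal one finding (a question or a test) and count the step."""
        self.steps += 1
        value = self.case.hidden_findings.get(item, "not available")
        self.revealed[item] = value
        return value


def toy_agent(revealed):
    """Hypothetical rule-based policy standing in for an AI model.

    Returns ("ask", item) to gather more information,
    or ("diagnose", label) to commit to a final answer.
    """
    if "fever" not in revealed:
        return ("ask", "fever")
    if "chest_xray" not in revealed:
        return ("ask", "chest_xray")
    if revealed["fever"] == "yes" and revealed["chest_xray"] == "consolidation":
        return ("diagnose", "pneumonia")
    return ("diagnose", "unknown")


def run_sequential_diagnosis(case, agent, max_steps=10):
    """Loop: ask, observe, re-evaluate, until the agent diagnoses or runs out of steps."""
    session = Session(case)
    for _ in range(max_steps):
        action, arg = agent(session.revealed)
        if action == "diagnose":
            return arg, session.steps
        session.request(arg)
    return "unknown", session.steps


example = Case({"fever": "yes", "chest_xray": "consolidation"}, "pneumonia")
diagnosis, steps_used = run_sequential_diagnosis(example, toy_agent)
print(diagnosis, steps_used)
```

In a real evaluation the rule-based `toy_agent` would be replaced by a model, and the hidden findings would come from the published case record.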
MAI-DxO does not rely on a single model. Instead, it functions as an orchestrator that coordinates different AI models, acting as a virtual team of clinicians. The orchestration allows for layered reasoning and greater adaptability, especially in high-stakes situations. The best results came from pairing MAI-DxO with OpenAI’s o3 model.
To ensure realistic evaluation, each diagnostic step incurs a virtual cost, mirroring actual healthcare expenses. This framework allowed researchers to assess both diagnostic success and economic efficiency.
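As a rough illustration of how accuracy and cost can be scored together, the sketch below tallies a virtual price for every action in an episode and reports both metrics. The price list and action names are invented for the example; the article does not give the figures the researchers used.

```python
# Hypothetical price list in arbitrary currency units (not from the article).
PRICES = {"question": 0.0, "cbc": 30.0, "chest_xray": 120.0, "ct_scan": 900.0}


def episode_cost(actions, prices=PRICES):
    """Sum the virtual cost of every question or test in one diagnostic episode."""
    return sum(prices[a] for a in actions)


def summarize(episodes):
    """Score a batch of episodes.

    Each episode is a tuple: (list of actions, predicted diagnosis, true diagnosis).
    Returns the fraction diagnosed correctly and the mean cost per case.
    """
    n = len(episodes)
    correct = sum(1 for _, predicted, truth in episodes if predicted == truth)
    total_cost = sum(episode_cost(actions) for actions, _, _ in episodes)
    return {"accuracy": correct / n, "mean_cost": total_cost / n}


episodes = [
    (["question", "cbc"], "anemia", "anemia"),
    (["chest_xray", "ct_scan"], "pneumonia", "lung cancer"),
]
print(summarize(episodes))
```

Reporting the two numbers side by side makes it possible to compare a system that orders every test with one that reaches the same answer using fewer resources.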
Crucially, the system did not simply order every available test. It was configured to make deliberate, cost-conscious decisions that mirror the constraints of real-world medical practice. Microsoft says the AI tool not only made better decisions but also did so while using fewer resources than human doctors or standalone AI models.
Beyond diagnostics, Microsoft is working across the healthcare spectrum. Other tools include Dragon Copilot, a voice-first AI assistant for clinicians, and RAD-DINO, which streamlines radiology workflows.
The research team acknowledges the current limitations of the system. While MAI-DxO excels in high-complexity diagnostics, its performance on more common, day-to-day patient presentations still needs further study. Additionally, real-world testing in clinical environments is essential before any large-scale deployment.
The company is now partnering with leading health institutions to rigorously evaluate the system under real clinical conditions. Microsoft emphasised that responsible governance and regulatory frameworks will be vital to ensuring that the technology is deployed safely and effectively.
“With further development, AI could empower patients to self-manage routine care and provide doctors with powerful tools to tackle difficult cases,” the research team stated. “We strongly believe the future of healthcare lies in augmenting human expertise with machine intelligence.”
