Member of Technical Staff
I work on health AI within Microsoft AI, focused on using large language models for clinical applications. My most notable project is MAI-DxO (MAI Diagnostic Orchestrator), a model-agnostic orchestrator that simulates a panel of physicians to propose differential diagnoses and strategically select high-value, cost-effective diagnostic tests. When paired with OpenAI’s o3 model, MAI-DxO achieves 80% diagnostic accuracy on complex NEJM clinicopathological conference cases, four times the 20% average of human clinicians on the same cases.
This work was published as "Sequential Diagnosis with Language Models" (arXiv, 2025) and introduced SDBench, an interactive benchmark for evaluating AI diagnostic agents through realistic sequential clinical encounters drawn from 304 consecutive NEJM cases.