MIT Clinical ML
MIT Clinical Machine Learning Group
About the Lab
Led by David Sontag, the Clinical Machine Learning Group is interested in advancing machine learning and artificial intelligence, and using these techniques to advance health care.
Broadly, we have two goals:
- Clinical: To make a difference in health care, we need to create algorithms that are useful for solving real clinical problems.
- Machine learning: We need rigorous solutions, which can pave the way for safe deployment of machine learning in high-stakes settings like healthcare.
News
- August 2024 “Towards Verifiable Text Generation with Symbolic References” was accepted at COLM 2024
- August 2024 Shannon presented “Learning to Decode Collaboratively with Multiple Language Models” at ACL 2024
- July 2024 Ilker presented “Prediction-powered Generalization of Causal Inferences” at ICML 2024
- July 2024 “Joint AI-driven event prediction and longitudinal modeling in newly diagnosed and relapsed multiple myeloma” was published in npj Digital Medicine
- June 2024 Stefan presented “A Data-Centric Approach To Generate Faithful and High Quality Patient Summaries with Large Language Models” at CHIL 2024
- May 2024 Ilker presented “Benchmarking Observational Studies with Experimental Data under Right-Censoring” at AISTATS 2024
- December 2023 Rebecca presented “A Deep Dive into Single-Cell RNA Sequencing Foundation Models” at MLCB 2023
- August 2023 Sharon presented “Conceptualizing Machine Learning for Dynamic Information Retrieval of Electronic Health Record Notes” at MLHC 2023
- April 2023: We have three papers at AISTATS 2023, with
  - Hussein presenting “Who Should Predict? Exact Algorithms For Learning to Defer to Humans”
  - Zeshan presenting “Falsification of Internal and External Validity in Observational Studies via Conditional Moment Restrictions”
  - Zeshan and Ahmed presenting “Conformalized Unconditional Quantile Regression”
- December 2022 Monica presented “Large Language Models are Few-Shot Clinical Information Extractors” at EMNLP 2022
- November 2022: Three papers at NeurIPS 2022, including
  - Michael presenting “Evaluating Robustness to Dataset Shift via Parametric Robustness Sets”
  - Zeshan presenting “Falsification before Extrapolation in Causal Effect Estimation”
  - Hunter presenting “Training Subset Selection for Weak Supervision”
- July 2022: We have two papers at ICML 2022, with
  - Hunter and Monica presenting “Co-training improves prompt-based learning for large language models”
  - Hussein presenting “Sample Efficient Learning of Predictors that Complement Humans”
Team
David Sontag
Professor of EECS
Rebecca (Peyser) Boiarsky
PhD Student
Hunter Lang
PhD Student
Shannon Shen
PhD Student
Ilker Demirel
PhD Student
Lawrence Shi
MEng Student
Featured Publications
Learning to Decode Collaboratively with Multiple Language Models
We propose a method to teach multiple large language models (LLMs) to collaborate by interleaving their generations at the token level. …
Shannon Shen, Hunter Lang, Bailin Wang, Yoon Kim, David Sontag
2024
ACL 2024
Prediction-powered Generalization of Causal Inferences
Causal inferences from a randomized controlled trial (RCT) may not pertain to a target population where some effect modifiers have a …
Ilker Demirel, Ahmed Alaa, Anthony Philippakis, David Sontag
2024
ICML 2024
Effective Human-AI Teams via Learned Natural Language Rules and Onboarding
People are starting to rely on AI agents to assist them with various tasks and thus forming human-AI teams. The human must know when to …
Hussein Mozannar, Jimin J Lee, Dennis Wei, Prasanna Sattigeri, Subhro Das, David Sontag
2023
Advances in Neural Information Processing Systems (NeurIPS)
Who Should Predict? Exact Algorithms For Learning to Defer to Humans
Algorithmic predictors should be able to defer the prediction to a human decision maker to ensure accurate predictions. In this work, …
Hussein Mozannar, Hunter Lang, Dennis Wei, Prasanna Sattigeri, Subhro Das, David Sontag
2023
Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS)
Evaluating Robustness to Dataset Shift via Parametric Robustness Sets
We give a method for proactively identifying small, plausible shifts in distribution which lead to large differences in model …
Nikolaj Thams, Michael Oberst, David Sontag
2022
Advances in Neural Information Processing Systems (NeurIPS)
Falsification before Extrapolation in Causal Effect Estimation
Randomized Controlled Trials (RCTs) represent a gold standard when developing policy guidelines. However, RCTs are often narrow, and …
Zeshan Hussain, Michael Oberst, Ming-Chieh Shih, David Sontag
2022
Advances in Neural Information Processing Systems (NeurIPS)
Co-training Improves Prompt-based Learning for Large Language Models
We demonstrate that co-training (Blum & Mitchell, 1998) can improve the performance of prompt-based learning by using unlabeled …
Hunter Lang, Monica Agrawal, Yoon Kim, David Sontag
2022
Proceedings of the Thirty-Ninth International Conference on Machine Learning (ICML)
ETAB: A Benchmark Suite for Visual Representation Learning in Echocardiography
Echocardiography is one of the most commonly used diagnostic imaging modalities in cardiology. Application of deep learning models to …
Ahmed Alaa, Anthony Philippakis, David Sontag
2022
Advances in Neural Information Processing Systems Datasets and Benchmarks Track
Neural Pharmacodynamic State Space Modeling
Modeling the time-series of high-dimensional, longitudinal data is important for predicting patient disease progression. However, …
Zeshan Hussain, Rahul G Krishnan, David Sontag
2021
Proceedings of the Thirty-Eighth International Conference on Machine Learning (ICML)
Regularizing towards Causal Invariance: Linear Models with Proxies
We propose a method for learning linear models whose predictive performance is robust to causal interventions on unobserved variables, …
Michael Oberst, Nikolaj Thams, Jonas Peters, David Sontag
2021
Proceedings of the Thirty-Eighth International Conference on Machine Learning (ICML)
Deep Contextual Clinical Prediction with Reverse Distillation
Healthcare providers are increasingly using machine learning to predict patient outcomes to make meaningful interventions. However, …
Rohan Kodialam, Rebecca Boiarsky, Justin Lim, Neil Dixit, Aditya Sai, David Sontag
2021
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence
A decision algorithm to promote outpatient antimicrobial stewardship for uncomplicated urinary tract infection
Antibiotic resistance is a major cause of treatment failure and leads to increased use of broad-spectrum agents, which begets further …
Sanjat Kanjilal, Michael Oberst, Sooraj Boominathan, Helen Zhou, David C. Hooper, David Sontag
2020
Science Translational Medicine
Characterization of Overlap in Observational Studies
Overlap between treatment groups is required for non-parametric estimation of causal effects. If a subgroup of subjects always receives …
Michael Oberst, Fredrik D. Johansson, Dennis Wei, Tian Gao, Gabriel Brat, David Sontag, Kush R. Varshney
2020
Proceedings of the Twenty-Third International Conference on Artificial Intelligence and Statistics (AISTATS)
Fast, Structured Clinical Documentation via Contextual Autocomplete
We present a system that uses a learned autocompletion mechanism to facilitate rapid creation of semi-structured clinical …
Divya Gopinath, Monica Agrawal, Luke Murray, Steven Horng, David Karger, David Sontag
2020
Proceedings of the Machine Learning for Healthcare Conference
Recent Publications
Towards Verifiable Text Generation with Symbolic References
Large language models (LLMs) have demonstrated an impressive ability to synthesize plausible and fluent text. However, they remain …
Lucas Torroba Hennigen, Shannon Shen, Aniruddha Nrusimha, Bernhard Gapp, David Sontag, Yoon Kim
2024
COLM 2024
A Data-Centric Approach to Generate Faithful and High Quality Patient Summaries with Large Language Models
Patients often face difficulties in understanding their hospitalizations, while healthcare workers have limited resources to provide …
Stefan Hegselmann, Shannon Shen, Florian Gierse, Monica Agrawal, David Sontag, Xiaoyi Jiang
2024
Conference on Health, Inference, and Learning (CHIL) 2024
Benchmarking observational studies with experimental data under right-censoring
Drawing causal inferences from observational studies (OS) requires unverifiable validity assumptions; however, one can falsify those …
Ilker Demirel, Edward De Brouwer, Zeshan Hussain, Michael Oberst, Anthony A Philippakis, David Sontag
2024
AISTATS 2024
Machine learning to predict notes for chart review in the oncology setting: a proof of concept strategy for improving clinician note-writing
Objective: Leverage electronic health record (EHR) audit logs to develop a machine learning (ML) model that predicts which notes a …
Sharon Jiang, Barbara D Lam, Monica Agrawal, Shannon Shen, Nicholas Kurtzman, Steven Horng, David R Karger, David Sontag
2024
Journal of the American Medical Informatics Association (JAMIA)
Joint AI-driven event prediction and longitudinal modeling in newly diagnosed and relapsed multiple myeloma
Zeshan Hussain, Edward De Brouwer, Rebecca Boiarsky, Sama Setty, Neeraj Gupta, Guohui Liu, Cong Li, Jaydeep Srimani, Jacob Zhang, Rich Labotka, et al.
2024
npj Digital Medicine