Biomedical Informatics and Health Equity Monthly Seminar Series Registration
Machine Learning for Real-World EHR Data: Learning from Sparse and Constrained Clinical Annotations
Wednesday, April 29, 2026
12:00 pm – 1:00 pm
Real-world Electronic Health Records (EHR) in primary care settings pose significant challenges for machine learning, including sparse and noisy temporal annotations, limited labels, and structured clinical outcomes. In this talk, Dr. Romeo will present three research directions that address these issues across different prediction scenarios: methods for early risk prediction from temporally sparse EHR data, based on extensions of Multiple Instance Learning; spatiotemporal multi-task learning approaches designed to capture dependencies across multiple clinical outcomes and treatments; and models that incorporate hierarchical and ordinal constraints, allowing a more accurate representation of structured outcomes such as disease severity. Dr. Romeo will also briefly touch on ongoing work, building on these directions, aimed at developing unified ordinal models that jointly integrate multiple EHR views and prediction tasks.
While these use cases differ from a clinical perspective, they are connected by a common idea: designing machine learning models that are aware of the structure and limitations of real-world EHR data. The goal is to develop task-aware approaches tailored to specific prediction challenges and pattern recognition problems, rather than to rely on generic models.
Presenter:
Luca Romeo, PhD
Associate Professor, Department of Economics and Law
University of Macerata, Italy
Sponsored by the Office of the Vice Chancellor for Research, College of Medicine, College of Applied Health Sciences, and Institute for Health Data Science Research.