Overview
This project develops risk-sensitive methods for identifying dangerous states and treatments in healthcare settings. It focuses on dead-end identification, using distributional reinforcement learning (RL) and conservative value estimation to improve patient safety.
Motivation
In safety-critical domains like healthcare, optimal policies may not be attainable from limited offline data. However, negative outcomes can be leveraged to identify behaviors to avoid, guarding against overoptimistic decisions.
Technical Approach
Risk-Sensitive Dead-End Identification
- Distributional RL: Model value distributions rather than expected values
- Conditional Value-at-Risk (CVaR): Evaluate the average of the worst-case tail of the value distribution rather than its overall mean
- Tunable Risk Tolerance: Adjust conservatism based on application needs
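To illustrate how the three ideas above fit together, here is a minimal sketch, not the published implementation. It assumes the value distribution for one state-treatment pair is represented by equally weighted quantile estimates, and that values lie in [-1, 0] with -1 indicating a certain negative outcome. The function name `cvar_from_quantiles`, the parameter `alpha`, and the sample numbers are hypothetical.

```python
import math
import numpy as np

def cvar_from_quantiles(quantiles, alpha):
    """Approximate CVaR at level alpha as the mean of the lowest
    alpha-fraction of equally weighted quantile estimates.

    alpha = 1.0 recovers the ordinary expected value; smaller alpha
    is more conservative (hypothetical helper, for illustration only).
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    q = np.sort(np.asarray(quantiles, dtype=float), axis=-1)
    n = q.shape[-1]
    # Number of tail quantiles to average; the small tolerance guards
    # against floating-point error in alpha * n.
    k = max(1, math.ceil(alpha * n - 1e-9))
    return q[..., :k].mean(axis=-1)

# Made-up quantile estimates for a single state-treatment pair:
# most mass near 0 (safe), with a small tail near -1 (negative outcome).
quantiles = np.array([-1.0, -0.9, -0.2, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])

expected_value = cvar_from_quantiles(quantiles, alpha=1.0)  # risk-neutral
tail_value = cvar_from_quantiles(quantiles, alpha=0.2)      # risk-averse
print(expected_value, tail_value)
```

In this made-up example the risk-neutral estimate looks mild, while the tail estimate sits close to -1. Lowering `alpha` is one way to express the tunable conservatism described above.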
Dead-End Discovery Framework
- State Construction: Learn meaningful patient state representations
- Dead-End Discovery: Identify high-risk states using distributional value estimates
- Dead-End Confirmation: Validate discovered dead-ends through multiple independent models
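The discovery and confirmation steps above can be sketched as a thresholding rule followed by an agreement check. This is an illustrative assumption about how such a check might look, not the procedure from the publications: the threshold, the agreement rule, the function names, and the numbers are all hypothetical. Each row holds one model's risk-sensitive value estimates for every available treatment in a single state.

```python
import numpy as np

def flag_high_risk(values, threshold):
    """Flag treatments whose estimated value is at or below the threshold."""
    return np.asarray(values, dtype=float) <= threshold

def confirm_dead_end(per_model_values, threshold, min_agree=None):
    """Combine flags from several independently trained models.

    per_model_values: array of shape (n_models, n_treatments).
    A treatment is confirmed as high risk when at least `min_agree`
    models flag it (default: all models). The state is treated as a
    dead-end candidate only if every treatment is confirmed.
    """
    flags = flag_high_risk(per_model_values, threshold)
    n_models = flags.shape[0]
    if min_agree is None:
        min_agree = n_models
    treatment_confirmed = flags.sum(axis=0) >= min_agree
    return treatment_confirmed, bool(treatment_confirmed.all())

# Made-up estimates from two models for three treatments in one state.
per_model_values = np.array([
    [-0.90, -0.85, -0.95],
    [-0.88, -0.60, -0.92],
])

treatment_confirmed, state_confirmed = confirm_dead_end(
    per_model_values, threshold=-0.8
)
print(treatment_confirmed, state_confirmed)
```

Requiring agreement across models trades sensitivity for fewer false alarms. Here the second model does not flag the middle treatment, so the state is not confirmed as a dead-end.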
Key Results
- Earlier identification of dangerous states compared to expectation-based methods
- Tunable risk sensitivity enables domain-expert control
- Framework applicable across different healthcare settings (sepsis, diabetes, etc.)
Publications
Risk Sensitive Dead-end Identification in Safety-Critical Offline Reinforcement Learning (TMLR 2023)
Medical Dead-ends and Learning to Identify High-Risk States and Treatments (NeurIPS 2021)
Impact
This work provides practical tools for clinicians to identify potentially dangerous treatment paths, complementing existing clinical guidelines with data-driven safety analysis.