Overview

This project develops risk-sensitive methods for identifying dangerous states and treatments in healthcare settings. It focuses on identifying dead-ends (states from which a negative outcome becomes unavoidable regardless of subsequent treatment), using distributional RL and conservative value estimation to improve patient safety.

Motivation

In safety-critical domains like healthcare, optimal policies may not be attainable from limited offline data. However, negative outcomes can be leveraged to identify behaviors to avoid, guarding against overoptimistic decisions.


Technical Approach

Risk-Sensitive Dead-End Identification

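  • Distributional RL: Model the full distribution of returns rather than only its expected value
  • Conditional Value-at-Risk (CVaR): Score each option by its expected return over the worst α-fraction of outcomes, so that rare but severe outcomes are not averaged away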
  • Tunable Risk Tolerance: Adjust conservatism based on application needs
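
To make the three ideas above concrete, here is a minimal sketch of how CVaR could be read off a distributional value estimate. It assumes the return distribution for one state-treatment pair is summarized by equally weighted quantile estimates; the function name `cvar_from_quantiles`, the example numbers, and the quantile representation itself are illustrative assumptions, not details taken from the papers.

```python
import math

def cvar_from_quantiles(quantiles, alpha):
    """Approximate CVaR at level alpha from equally weighted quantile estimates.

    Assumption: `quantiles` is a list of return quantiles for a single
    state-treatment pair. alpha=1.0 recovers the plain mean (risk-neutral);
    smaller alpha averages only the worst outcomes (more conservative).
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    ordered = sorted(quantiles)  # worst (lowest) returns first
    # Number of quantiles in the lower tail; the small epsilon guards
    # against floating-point overshoot in alpha * len(ordered).
    k = max(1, math.ceil(alpha * len(ordered) - 1e-9))
    tail = ordered[:k]
    return sum(tail) / len(tail)

# Hypothetical quantiles: mostly neutral outcomes, a few severe ones.
example_quantiles = [-1.0, -0.8, -0.2, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

risk_neutral = cvar_from_quantiles(example_quantiles, alpha=1.0)  # mean of all ten
risk_averse = cvar_from_quantiles(example_quantiles, alpha=0.2)   # mean of worst two
```

In this sketch `alpha` plays the role of the tunable risk tolerance: lowering it makes the estimate focus on the lower tail, so the same treatment looks more dangerous than its mean return suggests.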

Dead-End Discovery Framework

  1. State Construction: Learn meaningful patient state representations
  2. Dead-End Discovery: Identify high-risk states using distributional value estimates
  3. Dead-End Confirmation: Validate discovered dead-ends through multiple independent models
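
Steps 2 and 3 can be sketched as a simple agreement rule. The version below is a hedged illustration only: it assumes each independent model supplies one risk-adjusted value per treatment for a given state, flags a treatment only when every model places it at or below a hypothetical `threshold`, and treats a state as a candidate dead-end when every available treatment is flagged. The function names, the threshold, and the unanimity rule are assumptions made for this example, not the exact criteria used in the papers.

```python
def confirmed_risky_treatments(values_by_model, threshold):
    """Return one boolean per treatment: True if all models agree it is risky.

    Assumption: values_by_model[m][t] is model m's risk-adjusted value for
    treatment t in the state under consideration (lower means more dangerous).
    """
    n_treatments = len(values_by_model[0])
    return [
        all(model_values[t] <= threshold for model_values in values_by_model)
        for t in range(n_treatments)
    ]

def is_candidate_dead_end(values_by_model, threshold):
    """A state is a candidate dead-end if every treatment is confirmed risky."""
    return all(confirmed_risky_treatments(values_by_model, threshold))

# Hypothetical values from two independently trained models, three treatments.
state_a = [
    [-0.95, -0.90, -0.40],  # model 1
    [-0.92, -0.60, -0.35],  # model 2
]
state_b = [
    [-0.95, -0.90, -0.85],  # model 1
    [-0.92, -0.88, -0.81],  # model 2
]

flags_a = confirmed_risky_treatments(state_a, threshold=-0.8)
dead_end_a = is_candidate_dead_end(state_a, threshold=-0.8)
dead_end_b = is_candidate_dead_end(state_b, threshold=-0.8)
```

Requiring agreement across models trades sensitivity for fewer false alarms; a looser rule (for example, a majority vote) would flag more states at the cost of more spurious warnings.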


Key Results

  • Earlier identification of dangerous states compared to expectation-based methods
  • Tunable risk sensitivity lets domain experts control how conservatively states and treatments are flagged
  • Framework applicable across different healthcare settings (sepsis, diabetes, etc.)


Publications

Risk Sensitive Dead-end Identification in Safety-Critical Offline Reinforcement Learning (TMLR 2023)

Paper Forum Code

Medical Dead-ends and Learning to Identify High-Risk States and Treatments (NeurIPS 2021)

Paper Poster Code MSR Blog


Impact

This work provides practical tools for clinicians to identify potentially dangerous treatment paths, complementing existing clinical guidelines with data-driven safety analysis.