Publications



2025

Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective
Zhoujun Cheng, Shibo Hao, Tianyang Liu, Fan Zhou, Yutao Xie, Feng Yao, Yuexin Bian, Yonghao Zhuang, Nilabjo Dey, Yuheng Zha, Yi Gu, Kun Zhou, Yuqi Wang, Yuan Li, Richard Fan, Jianshu She, Chengqian Gao, Abulhair Saparov, Haonan Li, Taylor W. Killian, Mikhail Yurochkin, Zhengzhong Liu, Eric P. Xing, Zhiting Hu
arXiv Preprint
Reinforcement learning has emerged as a promising approach to improve large language model reasoning, yet most open efforts focus narrowly on math and code, limiting our understanding of its broader applicability to general reasoning. We introduce Guru, a curated RL reasoning corpus spanning six reasoning domains.
Robust Autonomy Emerges from Self-Play
Marco Cusumano-Towner, David Hafner, Alex Hertzberg, Brody Huval, Aleksei Petrenko, Eugene Vinitsky, Erik Wijmans, Taylor W. Killian, Stuart Bowers, Ozan Sener, Philipp Krahenbuhl, Vladlen Koltun
ICML 2025
We developed a robust autonomous driving agent in simulation via self-play at massive scale. The simulator was designed to run in massively parallel settings, allowing us to aggressively randomize each agent's physical and behavioral characteristics and generate substantial amounts of experience.

2024

Clinically Motivated Sequential Decision Making Under Uncertainty in Offline Settings
Taylor W. Killian
PhD Thesis, University of Toronto, Department of Computer Science
In order to develop practical machine learning-aided technology for the benefit of human users, it is critical to anchor scientific research and development in the intended real-world use cases. In this thesis, I propose specific modeling decisions that can be made to develop actionable insights from sequentially observed healthcare data.

2023

Continuous Time Evidential Distributions for Irregular Time Series
Taylor W. Killian, Haoran Zhang, Thomas Hartvigsen, Ava Amini
Interpretable Machine Learning in Healthcare Workshop, ICML 2023
We extend recent evidential deep learning approaches to sequential settings in continuous time to handle irregularly sampled time series, such as those encountered in healthcare. This method provides stable, temporally correlated predictions and corresponding well-calibrated uncertainty estimates based on the evidence gained with each collected observation.
Risk Sensitive Dead-end Identification in Safety-Critical Offline Reinforcement Learning
Taylor W. Killian, Sonali Parbhoo, Marzyeh Ghassemi
Transactions on Machine Learning Research (TMLR)
We improve upon our prior dead-ends work by taking a risk-sensitive approach to dead-end discovery, leveraging distributional RL for value estimation. This allows for earlier indication of dead-ends, in a manner that is tunable to the risk tolerance of the task at hand.

2022

Continuous Time Evidential Distributions for Processing Irregular Time Series
Taylor W. Killian, Ava Amini
Learning from Time Series for Health Workshop at NeurIPS
Identifying Disparities in Sepsis Treatment using Inverse Reinforcement Learning
Hyewon Jeong, Taylor W. Killian, Sanjat Kanjilal, Siddharth Nayak, Marzyeh Ghassemi
WiML: Women in Machine Learning and RL4RealLife workshops at NeurIPS
Counterfactually Guided Policy Transfer in Clinical Settings
Taylor W. Killian, Marzyeh Ghassemi, Shalmali Joshi
Conference on Health, Inference and Learning (CHIL), 2022

2021

Medical Dead-ends and Learning to Identify High-Risk States and Treatments
Mehdi Fatemi, Taylor W. Killian, Jayakumar Subramanian, Marzyeh Ghassemi
Neural Information Processing Systems, 2021
In data-constrained offline settings, optimal sequential decision policies may not be attainable. However, negative outcomes in the data can be used to identify behaviors to avoid, guarding against overoptimistic decisions in safety-critical domains where reduced data availability may introduce significant bias.

2020

An Empirical Study of Representation Learning for Reinforcement Learning in Healthcare
Taylor W. Killian, Haoran Zhang, Jayakumar Subramanian, Mehdi Fatemi, Marzyeh Ghassemi
ML4H: Machine Learning for Health Workshop at NeurIPS
Multiple Sclerosis Severity Classification From Clinical Text
Alister D'Costa, Stefan Denkovski, Michal Malyska, Sae Young Moon, Brandon Rufino, Zhen Yang, Taylor W. Killian, Marzyeh Ghassemi
The 3rd Clinical Natural Language Processing Workshop
Counterfactual Transfer via Inductive Bias in Clinical Settings
Taylor W. Killian, Marzyeh Ghassemi, Shalmali Joshi
Inductive Biases, Invariances and Generalization in RL (BIG) ICML Workshop
Optimization Methods for Interpretable Differentiable Decision Trees Applied to Reinforcement Learning
Andrew Silva, Taylor W. Killian, Ivan Rodriguez Jimenez, Sung-Hyun Son, Matthew Gombolay
The 23rd International Conference on Artificial Intelligence and Statistics (AISTATS)

2019

Kernelized Capsule Networks
Taylor W. Killian, Justin Goodwin, Olivia Brown, Sung-Hyun Son
1st Workshop on Understanding and Improving Generalization in Deep Learning, ICML

2018

Direct Policy Transfer with Hidden Parameter Markov Decision Processes
Jiayu Yao, Taylor W. Killian, George Konidaris, Finale Doshi-Velez
Lifelong Learning: A Reinforcement Learning Approach Workshop at FAIM 2018

2017

Robust and Efficient Transfer Learning with Hidden Parameter Markov Decision Processes
Taylor W. Killian, Samuel Daulton, George Konidaris, Finale Doshi-Velez
Neural Information Processing Systems, pp. 6245-6250, 2017
Robust and Efficient Transfer Learning with Hidden Parameter Markov Decision Processes
Taylor W. Killian, George Konidaris, Finale Doshi-Velez
AAAI, pp. 4949-4950, 2017

2012

Rebound and jet formation of a fluid-filled sphere
Taylor W. Killian, Robert A. Klaus, Tadd T. Truscott
Physics of Fluids, 24, 122106, 2012