Publications



2025

Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective
Zhoujun Cheng, Shibo Hao, Tianyang Liu, Fan Zhou, Yutao Xie, Feng Yao, Yuexin Bian, Yonghao Zhuang, Nilabjo Dey, Yuheng Zha, Yi Gu, Kun Zhou, Yuqi Wang, Yuan Li, Richard Fan, Jianshu She, Chengqian Gao, Abulhair Saparov, Haonan Li, Taylor W. Killian, Mikhail Yurochkin, Zhengzhong Liu, Eric P. Xing, Zhiting Hu
arXiv Preprint
Reinforcement learning has emerged as a promising approach to improve large language model reasoning, yet most open efforts focus narrowly on math and code, limiting our understanding of its broader applicability to general reasoning. We introduce Guru, a curated RL reasoning corpus spanning six reasoning domains.
Robust Autonomy Emerges from Self-Play
Marco Cusumano-Towner, David Hafner, Alex Hertzberg, Brody Huval, Aleksei Petrenko, Eugene Vinitsky, Erik Wijmans, Taylor W. Killian, Stuart Bowers, Ozan Sener, Philipp Krahenbuhl, Vladlen Koltun
ICML 2025
We developed a robust autonomous driving agent in simulation via self-play at massive scale. The simulator was designed to run in massively parallel settings, allowing us to aggressively randomize each agent's physical and behavioral characteristics and generate substantial amounts of experience.

2024

Clinically Motivated Sequential Decision Making Under Uncertainty in Offline Settings
Taylor W. Killian
PhD Thesis, University of Toronto, Department of Computer Science
In order to develop practical machine learning-aided technology for the benefit of human users, it is critical to anchor scientific research and development in the intended real-world use cases. In this thesis, I propose specific modeling decisions that can be made to develop actionable insights from sequentially observed healthcare data.

2023

Continuous Time Evidential Distributions for Irregular Time Series
Taylor W. Killian, Haoran Zhang, Thomas Hartvigsen, Ava Amini
Interpretable Machine Learning in Healthcare Workshop, ICML 2023
We extend recent evidential deep learning approaches to sequential settings in continuous time to handle irregularly sampled time series, such as those encountered in healthcare. This method provides stable, temporally correlated predictions and corresponding well-calibrated uncertainty estimates based on the evidence gained with each collected observation.
Risk Sensitive Dead-end Identification in Safety-Critical Offline Reinforcement Learning
Taylor W. Killian, Sonali Parbhoo, Marzyeh Ghassemi
Transactions on Machine Learning Research (TMLR)
We improve upon our prior dead-ends work by taking a risk-sensitive approach to dead-end discovery, leveraging distributional RL for value estimation. This allows for earlier indication of dead-ends, in a manner that is tunable to the risk tolerance of the task at hand.

2022

Continuous Time Evidential Distributions for Processing Irregular Time Series
Taylor W. Killian, Ava Amini
Learning from Time Series for Health Workshop at NeurIPS
Identifying Disparities in Sepsis Treatment using Inverse Reinforcement Learning
Hyewon Jeong, Taylor W. Killian, Sanjat Kanjilal, Siddharth Nayak, Marzyeh Ghassemi
WiML: Women in Machine Learning and RL4RealLife workshops at NeurIPS
Counterfactually Guided Policy Transfer in Clinical Settings
Taylor W. Killian, Marzyeh Ghassemi, Shalmali Joshi
Conference on Health, Inference and Learning (CHIL), 2022

2021

Medical Dead-ends and Learning to Identify High-Risk States and Treatments
Mehdi Fatemi, Taylor W. Killian, Jayakumar Subramanian, Marzyeh Ghassemi
Neural Information Processing Systems, 2021
In data-constrained offline settings, optimal sequential decision policies may not be attainable. However, negative outcomes in the data can be used to identify behaviors to avoid, guarding against overoptimistic decisions in safety-critical domains where reduced data availability may introduce significant bias.

2020

An Empirical Study of Representation Learning for Reinforcement Learning in Healthcare
Taylor W. Killian, Haoran Zhang, Jayakumar Subramanian, Mehdi Fatemi, Marzyeh Ghassemi
ML4H: Machine Learning for Health Workshop at NeurIPS
Multiple Sclerosis Severity Classification From Clinical Text
Alister D'Costa, Stefan Denkovski, Michal Malyska, Sae Young Moon, Brandon Rufino, Zhen Yang, Taylor W. Killian, Marzyeh Ghassemi
The 3rd Clinical Natural Language Processing Workshop
Counterfactual Transfer via Inductive Bias in Clinical Settings
Taylor W. Killian, Marzyeh Ghassemi, Shalmali Joshi
Inductive Biases, Invariances and Generalization in RL (BIG) ICML Workshop
Optimization Methods for Interpretable Differentiable Decision Trees Applied to Reinforcement Learning
Andrew Silva, Taylor W. Killian, Ivan Rodriguez Jimenez, Sung-Hyun Son, Matthew Gombolay
The 23rd International Conference on Artificial Intelligence and Statistics (AISTATS)

2019

Kernelized Capsule Networks
Taylor W. Killian, Justin Goodwin, Olivia Brown, Sung-Hyun Son
1st Workshop on Understanding and Improving Generalization in Deep Learning, ICML

2018

Direct Policy Transfer with Hidden Parameter Markov Decision Processes
Jiayu Yao, Taylor W. Killian, George Konidaris, Finale Doshi-Velez
Lifelong Learning: A Reinforcement Learning Approach Workshop at FAIM 2018

2017

Robust and Efficient Transfer Learning with Hidden Parameter Markov Decision Processes
Taylor W. Killian, Samuel Daulton, George Konidaris, Finale Doshi-Velez
Neural Information Processing Systems, pp. 6245-6250, 2017
Robust and Efficient Transfer Learning with Hidden Parameter Markov Decision Processes
Taylor W. Killian, George Konidaris, Finale Doshi-Velez
AAAI, pp. 4949-4950, 2017

2012

Rebound and jet formation of a fluid-filled sphere
Taylor W. Killian, Robert A. Klaus, Tadd T. Truscott
Physics of Fluids, 24, 122106, 2012