I am a fourth-year PhD student at the University of Toronto and the Vector Institute, advised by Marzyeh Ghassemi, and a member of the Healthy ML lab. Among many exciting things, I broadly investigate novel applications of Reinforcement Learning to assist clinical decision making. Recently, our lab moved to the Massachusetts Institute of Technology through the Institute for Medical Engineering and Science, where I will be visiting for the remainder of my PhD.

I work in the fields of reinforcement learning, machine learning, and causal inference. I have long been interested in decision making and the mechanisms by which humans summarize and reason about the world. In my work, I aim to develop models and algorithms that enable actors (whether human or not) to efficiently make decisions in the face of various forms of uncertainty.

I have been fortunate to collaborate with multiple top research institutions, including Microsoft Research (long-standing relationship with Mehdi Fatemi, 2022 research intern with Ava Amini), Google Brain (2020 research intern with Marlos Machado and Marc Bellemare), Apple Health AI (2021 research intern with Leon Gatys), and St. Michael's Hospital. I'm always keen to hear about interesting ideas and love collaborating with others on a variety of problems, applied and foundational, as long as they align with my areas of focus. Don't hesitate to reach out!

Previously, I was employed at MIT Lincoln Laboratory and completed degrees at Harvard University (working with Finale Doshi-Velez) and Brigham Young University (working with Tadd Truscott).




Some of my work is available as preprints on arXiv.

Having lived in Sweden, I put together a brief guide to acquaint co-workers and colleagues with Stockholm. You can find the guide here


Risk Sensitive Dead-end Identification in Safety-Critical Offline Reinforcement Learning
We improve upon our prior dead-ends work by taking a risk-sensitive approach to dead-end discovery, leveraging distributional RL for value estimation. With constructed value distributions, we can account for possible worst-case outcomes through conditional value-at-risk (CVaR), which enables a more conservative evaluation of the risk of possible treatments and patient health states. This allows for earlier indication of dead-ends in a manner that is tunable based on the risk tolerance of the designed task.
Taylor W. Killian, Sonali Parbhoo, Marzyeh Ghassemi
Transactions on Machine Learning Research


Continuous Time Evidential Distributions for Processing Irregular Time Series
We extend recent evidential deep learning approaches to sequential settings in continuous time to deal with irregularly sampled time series such as those one encounters in healthcare. This method provides stable, temporally correlated predictions and corresponding uncertainty estimates based on the evidence gained with each collected observation. The continuous time evidential distribution enables flexible inference of the evolution of the partially observed features at any time of interest, while expanding uncertainty temporally for sparse, irregular observations.
Taylor W. Killian, Ava Amini
Learning from Time Series for Health Workshop at NeurIPS

Identifying Disparities in Sepsis Treatment using Inverse Reinforcement Learning
We use inverse RL on recorded behavioral data to estimate counterfactual optimal policies for subsets of unseen medical populations, and we identify differences in care by comparing these policies to the learned factual policy. Our goal is to surface deviations across sub-populations of interest; we hope this approach helps to identify disparities in care and the possible sources of bias underlying them.
Hyewon Jeong, Taylor W. Killian, Sanjat Kanjilal, Siddharth Nayak, Marzyeh Ghassemi
WiML: Women in Machine Learning and RL4RealLife workshops at NeurIPS

Counterfactually Guided Policy Transfer in Clinical Settings
Domain shift creates significant challenges for sequential decision making in healthcare since the target domain may be both data-scarce and confounded. In this paper, we propose a method for off-policy transfer by modeling the underlying generative process with a causal mechanism. We use informative priors from the source domain to augment counterfactual trajectories in the target in a principled manner. Policy learning in the target domain is further regularized via the source policy through KL-divergence.
Taylor W. Killian, Marzyeh Ghassemi, Shalmali Joshi
Conference on Health, Inference and Learning (CHIL), 2022


Medical Dead-ends and Learning to Identify High-Risk States and Treatments
In data-constrained offline settings, optimal sequential decision policies may not be attainable. However, negative outcomes in data can be used to identify behaviors to avoid, thereby guarding against overoptimistic decisions in safety-critical domains that may be significantly biased due to reduced data availability. Along these lines, we introduce an approach that identifies possible "dead-ends" of a state space as well as high-risk treatments that likely lead to them. We frame the discovery of these dead-ends as an RL problem, training three independent deep neural models for automated state construction, dead-end discovery, and confirmation.
Mehdi Fatemi, Taylor W. Killian, Jayakumar Subramanian, Marzyeh Ghassemi
Neural Information Processing Systems, 2021


An Empirical Study of Representation Learning for Reinforcement Learning in Healthcare
We investigate several information encoding approaches to develop state representations of patient health from sequential data. We evaluate the utility of these representations for predicting the next physiological patient observation as well as for developing treatment policies.
Taylor W. Killian, Haoran Zhang, Jayakumar Subramanian, Mehdi Fatemi, Marzyeh Ghassemi
ML4H: Machine Learning for Health Workshop at NeurIPS

Multiple Sclerosis Severity Classification From Clinical Text
We present the first publicly available transformer model trained on real clinical data other than MIMIC, specifically fine-tuned to support Multiple Sclerosis prediction and treatment based on clinical consult notes. The model can be found here
Alister D'Costa, Stefan Denkovski, Michal Malyska, Sae Young Moon, Brandon Rufino, Zhen Yang, Taylor W. Killian, Marzyeh Ghassemi
The 3rd Clinical Natural Language Processing Workshop

Counterfactual Transfer via Inductive Bias in Clinical Settings
By using counterfactual inference, we establish an approach to transfer learning within offline, off-policy Reinforcement Learning that provides improved policy performance in data-scarce target environments.
Taylor W. Killian, Marzyeh Ghassemi, Shalmali Joshi
Inductive Biases, Invariances and Generalization in RL (BIG) ICML Workshop

Optimization Methods for Interpretable Differentiable Decision Trees Applied to Reinforcement Learning
We leverage a differentiable form of a decision tree for Reinforcement Learning, which allows for online updates via SGD. From this decision tree, an interpretable policy is extracted. We analyze the optimization behavior of such classes of policies and demonstrate equivalent or better performance compared with batch-trained decision trees and similarly sized neural networks.
Andrew Silva, Taylor W. Killian, Ivan Rodriguez Jimenez, Sung-Hyun Son, Matthew Gombolay
The 23rd International Conference on Artificial Intelligence and Statistics (AISTATS)


Kernelized Capsule Networks
A hybrid Gaussian Process-Deep Neural Network approach, Kernelized Capsule Networks construct a GP kernel function from the feature representations of a Capsule Network. This combination provides a model robust to adversarial perturbations while also providing a mechanism to detect perturbed inputs.
Taylor W. Killian, Justin Goodwin, Olivia Brown, Sung-Hyun Son
1st Workshop on Understanding and Improving Generalization in Deep Learning


Direct Policy Transfer with Hidden Parameter Markov Decision Processes
An extension of the HiP-MDP framework presented in Killian and Daulton et al. (2017), wherein the latent parameters used to describe dynamical variations are included as input to a general policy trained from the optimal policies learned from past instances.
Jiayu Yao, Taylor W. Killian, George Konidaris, Finale Doshi-Velez
Lifelong Learning: A Reinforcement Learning Approach Workshop at FAIM 2018


Robust and Efficient Transfer Learning with Hidden Parameter Markov Decision Processes
A reformulation of the HiP-MDP to admit more robust and efficient transfer learning when deployed in complex environments with highly nonlinear dynamics.
Taylor W. Killian, Samuel Daulton, George Konidaris, Finale Doshi-Velez
Neural Information Processing Systems, pp. 6245-6250, 2017


Robust and Efficient Transfer Learning with Hidden Parameter Markov Decision Processes
An extended abstract of some preliminary transfer learning work. Submitted to the Student Abstract track of AAAI 2017.
Taylor W. Killian, George Konidaris, Finale Doshi-Velez
AAAI, pp. 4949-4950, 2017


Rebound and jet formation of a fluid-filled sphere
An investigation of how fluid-filled spheres exhibit little to no rebound when dropped.
Taylor W. Killian, Robert A. Klaus, and Tadd T. Truscott
Physics of Fluids, 24, 122106, 2012