In this work, we study a generative framework that seeks to combine the strengths of both: Motivated by a moment-matching objective to mitigate compounding error, we optimize a local (but forward-looking) transition policy, where the reinforcement signal is provided by a global (but stepwise-decomposable) energy model trained by contrastive estimation.
Significant effort has been placed on developing decision support tools to improve patient care.
By leveraging data from multiple environments, we propose Invariant Causal Imitation Learning (ICIL), a novel technique in which we learn a feature representation that is invariant across domains, on the basis of which we learn an imitation policy that matches expert behavior.
Understanding a decision-maker's priorities by observing their behavior is critical for transparency and accountability in decision processes, such as in healthcare.
Understanding decision-making in clinical environments is of paramount importance if we are to bring the strengths of machine learning to ultimately improve patient outcomes.
Understanding human behavior from observed data is critical for transparency and accountability in decision-making.
Despite exponential growth in electronic patient data, there is a remarkable gap between the potential and realized utilization of ML for clinical research and decision support.
The clinical time-series setting poses a unique combination of challenges to data modeling and sharing.
Building interpretable parameterizations of real-world decision-making on the basis of demonstrated behavior -- i.e. trajectories of observations and actions made by an expert maximizing some unknown reward function -- is essential for introspecting and auditing policies in different institutions.
Through experiments with application to control and healthcare settings, we illustrate consistent performance gains over existing algorithms for strictly batch imitation learning.
Finally, we illustrate how this formulation enables understanding decision-making behavior by quantifying preferences implicit in observed decision strategies (the inverse problem).
Autoencoder-based learning has emerged as a staple for disciplining representations in unsupervised and semi-supervised settings.
In this paper, we propose a novel Bayesian optimization (BO) algorithm to tackle the challenge of model selection in this setting.
A good generative model for time-series data should preserve temporal dynamics, in the sense that new sequences respect the original relationships between variables across time.
Accurate prediction of disease trajectories is critical for early identification and timely treatment of patients at risk.