They also form a natural bridge between model-based and model-free RL methods: like the former they make predictions about future experiences, and like the latter they allow efficient prediction of total discounted rewards.
With the widespread deployment of large-scale prediction systems in high-stakes domains (e.g., face recognition and criminal justice), disparities in prediction accuracy between demographic subgroups have prompted calls for a fundamental understanding of the sources of such disparities and for algorithmic interventions to mitigate them.
A wide range of machine learning applications, such as privacy-preserving learning, algorithmic fairness, and domain adaptation/generalization, involve learning invariant representations of the data that aim to achieve two competing goals: (a) maximize information or accuracy with respect to a target response, and (b) maximize invariance or independence with respect to a set of protected features (e.g., for fairness or privacy).
This approach leads to mismatches as, during training, the model is not exposed to its mistakes and does not use beam search.
We propose a novel algorithm for learning fair representations that can simultaneously mitigate two notions of disparity among different demographic subgroups in the classification setting.
With the prevalence of machine learning services, crowdsourced data containing sensitive information poses substantial privacy challenges.
Meanwhile, it is clear that in general there is a tension between minimizing information leakage and maximizing task accuracy.
On the upside, we prove that if the group-wise Bayes optimal classifiers are close, then learning fair representations leads to an alternative notion of fairness, known as the accuracy parity, which states that the error rates are close between groups.
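Accuracy parity, as stated above, asks that error rates be close across groups. A minimal sketch of how one might measure the gap, with made-up data and hypothetical helper names:

```python
# Hypothetical illustration of accuracy parity: the error rates of a
# classifier should be close across demographic groups. All names and
# data below are made up for illustration.

def error_rate(y_true, y_pred):
    """Fraction of misclassified examples."""
    return sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)

def accuracy_parity_gap(y_true, y_pred, groups):
    """Absolute difference in error rate between groups 0 and 1."""
    rates = {}
    for g in (0, 1):
        idx = [i for i, gi in enumerate(groups) if gi == g]
        rates[g] = error_rate([y_true[i] for i in idx],
                              [y_pred[i] for i in idx])
    return abs(rates[0] - rates[1])

# Toy data: each group has exactly 1 error out of 4 examples.
y_true = [0, 1, 0, 1, 0, 1, 0, 1]
y_pred = [0, 1, 0, 0, 0, 1, 1, 1]
groups = [0, 0, 0, 0, 1, 1, 1, 1]
gap = accuracy_parity_gap(y_true, y_pred, groups)
```

Here both groups have an error rate of 0.25, so the gap is zero; a classifier satisfying accuracy parity keeps this gap small.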
Our result characterizes a fundamental tradeoff between learning invariant representations and achieving small joint error on both domains when the marginal label distributions differ from source to target.
Inspired by the phenomenon of catastrophic forgetting, we investigate the learning dynamics of neural networks as they train on single classification tasks.
In this paper we propose new generalization bounds and algorithms under both classification and regression settings for unsupervised multiple source domain adaptation.
Recently, a novel class of Approximate Policy Iteration (API) algorithms has demonstrated impressive practical performance (e.g., ExIt and AlphaGo-Zero).
As a step toward bridging the gap, we propose a new generalization bound for domain adaptation when there are multiple source domains with labeled instances and one target domain with unlabeled instances.
We demonstrate that AggreVaTeD, a policy gradient extension of the Imitation Learning (IL) approach of Ross & Bagnell (2014), can leverage such an oracle to achieve faster and better solutions with less training data than a less-informed Reinforcement Learning (RL) technique.
We propose a framework for modeling and estimating the state of controlled dynamical systems, where an agent can affect the system through actions and receives partial observations.
A set of molecular descriptors whose length is independent of molecular size is developed for machine learning models that target thermodynamic and electronic properties of molecules.
One promising idea is Galerkin approximation, in which we search for the best answer within the span of a given set of basis functions.
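As an illustrative sketch (the target function, basis choice, and setup here are assumptions, not taken from the paper), a Galerkin-style approximation can be computed by least-squares projection onto the span of a fixed set of basis functions:

```python
import numpy as np

# Illustrative Galerkin-style approximation: find the best approximation
# to a target function within the span of a small set of basis functions
# via least-squares projection. Basis and target are assumptions.

xs = np.linspace(0.0, 1.0, 50)
target = np.sin(2 * np.pi * xs)          # function we want to approximate

# Basis: the first few monomials 1, x, x^2, ..., x^7 (an arbitrary choice).
Phi = np.vstack([xs**k for k in range(8)]).T   # shape (50, 8)

# The projection step: least-squares coefficients in the basis span.
coeffs, *_ = np.linalg.lstsq(Phi, target, rcond=None)
approx = Phi @ coeffs

# Relative residual measures how well the span captures the target.
residual = np.linalg.norm(target - approx) / np.linalg.norm(target)
```

A degree-7 polynomial basis captures one sine period quite well; enlarging or changing the basis trades accuracy against the cost of the projection.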
We propose a new approach to value function approximation which combines linear temporal difference reinforcement learning with subspace identification.
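To make the linear temporal difference ingredient concrete, here is a toy linear TD(0) sketch on a two-state chain (the chain, features, and step size are illustrative assumptions, not the proposed method):

```python
import numpy as np

# Toy linear TD(0): approximate the value function of a 2-state Markov
# reward process as V(s) ~ w . phi(s). Everything here is illustrative.

phi = {0: np.array([1.0, 0.0]),          # one-hot state features
       1: np.array([0.0, 1.0])}
P = {0: 1, 1: 0}                         # deterministic chain: 0 -> 1 -> 0
R = {0: 1.0, 1: 0.0}                     # reward received on leaving s
gamma, alpha = 0.9, 0.1

w = np.zeros(2)
s = 0
for _ in range(5000):
    s_next = P[s]
    # TD error: reward plus discounted next value minus current value.
    td_error = R[s] + gamma * w @ phi[s_next] - w @ phi[s]
    w = w + alpha * td_error * phi[s]    # gradient-style weight update
    s = s_next
```

For this chain the Bellman equations give V(0) = 1/(1 - 0.81) ≈ 5.26 and V(1) = 0.9·V(0) ≈ 4.74, and with one-hot features TD(0) converges to exactly these values.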
Sequential prediction problems such as imitation learning, where future observations depend on previous predictions (actions), violate the common i.i.d. assumptions made in statistical learning.
We introduce the Reduced-Rank Hidden Markov Model (RR-HMM), a generalization of HMMs that can model smooth state evolution as in Linear Dynamical Systems (LDSs) as well as non-log-concave predictive distributions as in continuous-observation HMMs.
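A small sketch of the structural idea (illustrative only, not the RR-HMM learning algorithm): a rank-k transition matrix over m states confines the predictive dynamics to a k-dimensional subspace, even though the model has m discrete states:

```python
import numpy as np

# Illustrative construction of a reduced-rank transition matrix: an
# m x m row-stochastic matrix T of rank k < m, built from nonnegative
# factors. Dimensions and the random seed are arbitrary choices.

m, k = 4, 2
rng = np.random.default_rng(0)

R = rng.random((m, k))                   # nonnegative left factor
S = rng.random((k, m))                   # nonnegative right factor
T = R @ S                                # rank at most k
T = T / T.sum(axis=1, keepdims=True)     # normalize rows to sum to 1

rank = np.linalg.matrix_rank(T)          # stays k after row scaling
```

Row scaling multiplies T by a diagonal matrix on the left, so the rank-k structure survives normalization; belief updates through T therefore live in a k-dimensional subspace.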