no code implementations • ICML 2018 • Ehsan Imani, Martha White
We provide theoretical support for this alternative hypothesis by characterizing the norm of the gradients of this loss.
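The loss itself is not reproduced in this excerpt; as a minimal sketch of a histogram-style distributional loss for regression (a cross-entropy to a Gaussian-smoothed target histogram; the bin layout and smoothing width below are hypothetical):

    import torch

    def histogram_loss(logits, y, bin_centers, sigma=0.1):
        # logits: (batch, n_bins) network outputs; y: (batch,) scalar targets;
        # bin_centers: (n_bins,) fixed support over the target range.
        # Target distribution: Gaussian weights around y, normalized over the bins.
        target = torch.exp(-0.5 * ((bin_centers[None, :] - y[:, None]) / sigma) ** 2)
        target = target / target.sum(dim=1, keepdim=True)
        log_probs = torch.log_softmax(logits, dim=1)
        return -(target * log_probs).sum(dim=1).mean()  # cross-entropy to soft target

One reason gradient norms are tractable here: the softmax cross-entropy gradient with respect to the logits is just the predicted minus the target bin probabilities, so its norm is straightforward to bound.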
no code implementations • NeurIPS 2018 • Ehsan Imani, Eric Graves, Martha White
A host of theoretically sound algorithms have been proposed for the on-policy setting, owing to the policy gradient theorem, which provides a simplified form for the gradient.
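For reference, the simplified form mentioned here is the on-policy policy gradient theorem; in standard notation, with d^pi the state distribution under pi and q^pi the action-value function:

    \nabla_\theta J(\theta) \;\propto\; \sum_{s} d^{\pi}(s) \sum_{a} q^{\pi}(s, a)\, \nabla_\theta \pi_\theta(a \mid s)
    \;=\; \mathbb{E}_{\pi}\!\left[ q^{\pi}(S_t, A_t)\, \nabla_\theta \log \pi_\theta(A_t \mid S_t) \right]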
no code implementations • NeurIPS 2020 • Yangchen Pan, Ehsan Imani, Martha White, Amir-Massoud Farahmand
We empirically demonstrate on several synthetic problems that our method (i) can learn multi-valued functions and produce the conditional modes, (ii) scales well to high-dimensional inputs, and (iii) can even be more effective for certain uni-modal problems, particularly for high-frequency functions.
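The excerpt does not describe the method itself; as a generic illustration of fitting multi-valued targets and reading off conditional modes, here is a mixture density network, a stand-in and not necessarily this paper's approach:

    import torch
    import torch.nn as nn

    class MDN(nn.Module):
        # Predicts a K-component Gaussian mixture over y given x; the component
        # means serve as candidate conditional modes (generic illustration only).
        def __init__(self, in_dim, k=5, hidden=64):
            super().__init__()
            self.body = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
            self.logits = nn.Linear(hidden, k)   # mixture weights
            self.means = nn.Linear(hidden, k)    # candidate conditional modes
            self.log_std = nn.Linear(hidden, k)

        def forward(self, x):
            h = self.body(x)
            return self.logits(h), self.means(h), self.log_std(h)

    def mdn_nll(logits, means, log_std, y):
        # Negative log-likelihood of y under the predicted mixture.
        comp = torch.distributions.Normal(means, log_std.exp())
        log_prob = comp.log_prob(y.unsqueeze(1)) + torch.log_softmax(logits, dim=1)
        return -torch.logsumexp(log_prob, dim=1).mean()

Each of the K predicted component means acts as one candidate mode of the conditional distribution, which is what lets a single network represent a multi-valued function.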
no code implementations • 8 Jun 2020 • Taher Jafferjee, Ehsan Imani, Erin Talvitie, Martha White, Michael Bowling
Dyna-style reinforcement learning (RL) agents improve sample efficiency over model-free RL agents by updating the value function with simulated experience generated by an environment model.
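As a minimal sketch of the Dyna idea summarized above (tabular Dyna-Q with a deterministic learned model; step sizes and planning budget are placeholders):

    import random
    from collections import defaultdict

    def dyna_q_update(Q, model, s, a, r, s2, alpha=0.1, gamma=0.99, n_planning=10):
        # Direct RL: one Q-learning update from the real transition (s, a, r, s2).
        Q[s][a] += alpha * (r + gamma * max(Q[s2].values(), default=0.0) - Q[s][a])
        # Model learning: remember the transition in a deterministic tabular model.
        model[(s, a)] = (r, s2)
        # Planning: extra value updates from simulated, model-generated experience.
        for _ in range(n_planning):
            (ps, pa), (pr, ps2) = random.choice(list(model.items()))
            Q[ps][pa] += alpha * (pr + gamma * max(Q[ps2].values(), default=0.0) - Q[ps][pa])

    # Usage: Q = defaultdict(lambda: defaultdict(float)); model = {}

The planning loop is where the sample-efficiency gain (and, with an inaccurate model, the potential harm) comes from: each real step funds several simulated value updates.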
1 code implementation • 16 Nov 2021 • Eric Graves, Ehsan Imani, Raksha Kumaraswamy, Martha White
A variety of theoretically sound policy gradient algorithms exist for the on-policy setting due to the policy gradient theorem, which provides a simplified form for the gradient.
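A minimal on-policy sketch of the likelihood-ratio estimator this theorem licenses (REINFORCE-style; this is not the off-policy algorithm contributed by the paper):

    import torch

    def reinforce_loss(log_probs, returns):
        # log_probs: (T,) log pi_theta(a_t | s_t) along one on-policy trajectory.
        # returns:   (T,) Monte Carlo returns G_t, treated as constants.
        return -(log_probs * returns.detach()).mean()

Minimizing this surrogate with a stochastic optimizer follows the policy gradient in expectation; the off-policy setting loses exactly this simple form, which is the gap these papers address.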
1 code implementation • 15 Dec 2021 • Ehsan Imani, Wei Hu, Martha White
We then highlight why alignment between the top singular vectors and the targets can speed up learning, and show, in a classic synthetic transfer problem, that representation alignment correlates with positive transfer to similar tasks and negative transfer to dissimilar ones.
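A hedged sketch of one way to quantify such alignment, by projecting the targets onto the singular directions of the representation matrix (the paper's exact measure may differ):

    import numpy as np

    def alignment_spectrum(phi, y):
        # phi: (n, d) matrix of learned representations; y: (n,) targets.
        U, S, Vt = np.linalg.svd(phi, full_matrices=False)
        y_unit = y / np.linalg.norm(y)
        # Squared projection of the (unit-norm) targets onto each left singular
        # vector: the share of target energy along each singular direction.
        return (U.T @ y_unit) ** 2

Alignment is high when most of the mass in this spectrum concentrates in the first few entries, i.e. along the top singular vectors.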
no code implementations • 27 Nov 2022 • Ehsan Imani, Guojun Zhang, Runjia Li, Jun Luo, Pascal Poupart, Philip H. S. Torr, Yangchen Pan
Recent work has highlighted the label alignment property (LAP) in supervised learning, where the vector of all labels in the dataset is mostly in the span of the top few singular vectors of the data matrix.
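One way to formalize the property as stated (the notation here is ours, not quoted from the paper): writing the SVD of the data matrix as X = sum_i sigma_i u_i v_i^T and stacking the labels into a vector y, the LAP says that the ratio

    \mathrm{LAP}_k(X, y) \;=\; \frac{\sum_{i=1}^{k} (u_i^{\top} y)^2}{\lVert y \rVert^2}

is already close to 1 for small k, i.e. the labels lie mostly in the span of the top-k left singular vectors.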
1 code implementation • 20 Feb 2024 • Ehsan Imani, Kai Luedemann, Sam Scholnick-Hughes, Esraa Elelimy, Martha White
It is becoming increasingly common in regression to train neural networks that model the entire distribution even if only the mean is required for prediction.
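As a minimal sketch of the setup described, a heteroscedastic Gaussian head trained by negative log-likelihood, with only the mean used at prediction time (a generic illustration, not this paper's construction):

    import torch
    import torch.nn as nn

    class GaussianHead(nn.Module):
        # Outputs mean and log-variance of p(y | x); the full distribution is
        # modeled during training, but only the mean is used for prediction.
        def __init__(self, in_dim, hidden=64):
            super().__init__()
            self.body = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                                      nn.Linear(hidden, 2))

        def forward(self, x):
            mu, log_var = self.body(x).chunk(2, dim=-1)
            return mu, log_var

    def nll_loss(mu, log_var, y):
        # Heteroscedastic Gaussian negative log-likelihood (up to a constant);
        # y: (batch, 1) targets.
        return 0.5 * (log_var + (y - mu) ** 2 / log_var.exp()).mean()

    # Prediction: use mu alone, even though training modeled the entire distribution.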