no code implementations • 9 Mar 2024 • Marcel Hussing, Claas Voelcker, Igor Gilitschenski, Amir-Massoud Farahmand, Eric Eaton
We show that deep reinforcement learning can maintain its ability to learn without resetting network parameters in settings where the number of gradient updates greatly exceeds the number of environment samples.
no code implementations • 30 Nov 2023 • Avery Ma, Amir-Massoud Farahmand, Yangchen Pan, Philip Torr, Jindong Gu
During the alignment process, the parameters of the source model are fine-tuned to minimize an alignment loss.
no code implementations • 29 Nov 2023 • Amin Rakhsha, Mete Kemertas, Mohammad Ghavamzadeh, Amir-Massoud Farahmand
We propose and theoretically analyze an approach for planning with an approximate model in reinforcement learning that can reduce the adverse impact of model error.
1 code implementation • 13 Aug 2023 • Avery Ma, Yangchen Pan, Amir-Massoud Farahmand
In the context of deep learning, our experiments show that SGD-trained neural networks have smaller Lipschitz constants, which explains their better robustness to input perturbations compared to networks trained with adaptive gradient methods.
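A rough sketch of the quantity being compared: for a feedforward ReLU network, the product of the layers' spectral norms upper-bounds the Lipschitz constant. The tiny random network below is purely illustrative, not one of the paper's trained models.

```python
import numpy as np

rng = np.random.default_rng(0)

# A tiny 2-layer ReLU network with random weights (hypothetical example).
W1 = rng.normal(size=(16, 8))
W2 = rng.normal(size=(1, 16))

def forward(x):
    return W2 @ np.maximum(W1 @ x, 0.0)

def spectral_norm(W):
    return np.linalg.svd(W, compute_uv=False)[0]

# Since ReLU is 1-Lipschitz, the product of layer spectral norms
# upper-bounds the network's Lipschitz constant.
lip_bound = spectral_norm(W1) * spectral_norm(W2)

# Empirical check: the slope between random input pairs never exceeds it.
slopes = []
for _ in range(100):
    x, y = rng.normal(size=8), rng.normal(size=8)
    slopes.append(abs(forward(x) - forward(y))[0] / np.linalg.norm(x - y))
max_slope = max(slopes)
```

A smaller `lip_bound` means a tighter guarantee on how much the output can change under a bounded input perturbation.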
1 code implementation • 17 Jul 2023 • Mete Kemertas, Allan D. Jepson, Amir-Massoud Farahmand
We design a novel algorithm for optimal transport by drawing on the entropic optimal transport, mirror descent, and conjugate gradients literatures.
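For context, the classical baseline in entropic optimal transport is Sinkhorn's algorithm, which the paper's mirror-descent/conjugate-gradient scheme builds on and improves. A minimal Sinkhorn sketch on a small discrete problem (sizes and regularization strength are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Entropic OT between two small discrete distributions via Sinkhorn iterations.
n, m = 5, 7
a = np.full(n, 1.0 / n)           # source marginal
b = np.full(m, 1.0 / m)           # target marginal
C = rng.random((n, m))            # cost matrix
eps = 0.1                         # entropic regularization strength

K = np.exp(-C / eps)              # Gibbs kernel
u = np.ones(n)
for _ in range(500):
    v = b / (K.T @ u)             # match the target marginal
    u = a / (K @ v)               # match the source marginal

P = u[:, None] * K * v[None, :]   # entropic transport plan
```

At convergence the plan's row and column sums recover the prescribed marginals.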
1 code implementation • NeurIPS 2023 • Tyler Kastner, Murat A. Erdogdu, Amir-Massoud Farahmand
We consider the problem of learning models for risk-sensitive reinforcement learning.
Tasks: Distributional Reinforcement Learning, Reinforcement Learning
no code implementations • 30 Jun 2023 • Claas A Voelcker, Arash Ahmadian, Romina Abachi, Igor Gilitschenski, Amir-Massoud Farahmand
The idea of decision-aware model learning, that models should be accurate where it matters for decision-making, has gained prominence in model-based reinforcement learning.
no code implementations • 25 Nov 2022 • Amin Rakhsha, Andrew Wang, Mohammad Ghavamzadeh, Amir-Massoud Farahmand
We introduce new planning and reinforcement learning algorithms for discounted MDPs that utilize an approximate model of the environment to accelerate the convergence of the value function.
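As a toy illustration of the setting (not the paper's algorithms): value iteration run with an approximate transition model still converges, and the gap to the true values is controlled by the model error. All sizes and the noise level below are made up.

```python
import numpy as np

rng = np.random.default_rng(0)

n_states, n_actions, gamma = 6, 2, 0.9

P = rng.random((n_actions, n_states, n_states))
P /= P.sum(axis=2, keepdims=True)          # true transition kernel
R = rng.random((n_actions, n_states))      # rewards in [0, 1]

noise = 0.05 * rng.random(P.shape)
P_hat = P + noise
P_hat /= P_hat.sum(axis=2, keepdims=True)  # approximate model

def value_iteration(P, R, iters=200):
    V = np.zeros(n_states)
    for _ in range(iters):
        V = np.max(R + gamma * P @ V, axis=0)  # Bellman optimality backup
    return V

V_true = value_iteration(P, R)
V_hat = value_iteration(P_hat, R)
model_gap = np.max(np.abs(V_true - V_hat))
```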
1 code implementation • ICLR 2022 • Claas Voelcker, Victor Liao, Animesh Garg, Amir-Massoud Farahmand
However, they tend to be inferior in practice to commonly used maximum likelihood (MLE) based approaches.
Tasks: Model-based Reinforcement Learning, Reinforcement Learning (+1)
no code implementations • NeurIPS Workshop DLDE 2021 • Erfan Pirmorad, Faraz Khoshbakhtian, Farnam Mansouri, Amir-Massoud Farahmand
In many areas, such as the physical sciences, life sciences, and finance, control approaches are used to achieve a desired goal in complex dynamical systems governed by differential equations.
no code implementations • ICLR 2022 • Guiliang Liu, Ashutosh Adhikari, Amir-Massoud Farahmand, Pascal Poupart
The advancement of dynamics models enables model-based planning in complex environments.
no code implementations • 5 Oct 2020 • Rodrigo Toro Icarte, Richard Valenzano, Toryn Q. Klassen, Phillip Christoffersen, Amir-Massoud Farahmand, Sheila A. McIlraith
Learning memoryless policies is efficient and optimal in fully observable environments.
Tasks: Partially Observable Reinforcement Learning, Reinforcement Learning (+1)
1 code implementation • 28 Sep 2020 • Jincheng Mei, Yangchen Pan, Martha White, Amir-Massoud Farahmand, Hengshuai Yao
The prioritized Experience Replay (ER) method has attracted great attention; however, there is little theoretical understanding of such prioritization strategies and why they help.
2 code implementations • 19 Jul 2020 • Yangchen Pan, Jincheng Mei, Amir-Massoud Farahmand, Martha White, Hengshuai Yao, Mohsen Rohani, Jun Luo
Prioritized Experience Replay (ER) has been empirically shown to improve sample efficiency across many domains and has attracted great attention; however, there is little theoretical understanding of why such prioritized sampling helps and what its limitations are.
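The scheme under study can be sketched as follows: transitions are sampled with probability proportional to a power of their TD error, and importance-sampling weights correct the induced bias. The `alpha`/`beta` values below are illustrative, not from either paper.

```python
import numpy as np

rng = np.random.default_rng(0)

td_errors = np.abs(rng.normal(size=32))    # stand-in TD errors for a buffer
alpha, beta = 0.6, 0.4                     # illustrative hyperparameters

# Proportional prioritization.
priorities = td_errors ** alpha
probs = priorities / priorities.sum()

# Sample a minibatch and compute importance-sampling weights.
idx = rng.choice(len(probs), size=8, p=probs)
weights = (len(probs) * probs[idx]) ** (-beta)
weights /= weights.max()                   # normalize for update stability
```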
no code implementations • 4 Apr 2020 • Avery Ma, Fartash Faghri, Nicolas Papernot, Amir-Massoud Farahmand
Adversarial training is a common approach to improving the robustness of deep neural networks against adversarial examples.
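A generic illustration of the setting (not this paper's specific contribution): one adversarial-training step on logistic regression, where the training example is replaced by its FGSM perturbation. All names and constants are ours.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w = rng.normal(size=4)
x = rng.normal(size=4)
y = 1.0
eps, lr = 0.1, 0.5

def loss(w, x):
    p = sigmoid(w @ x)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

# FGSM: perturb the input along the sign of the input-gradient of the loss.
grad_x = (sigmoid(w @ x) - y) * w
x_adv = x + eps * np.sign(grad_x)

clean_loss = loss(w, x)
adv_loss = loss(w, x_adv)           # perturbation increases the loss

# Adversarial training: take the gradient step on the adversarial example.
grad_w = (sigmoid(w @ x_adv) - y) * x_adv
w_new = w - lr * grad_w
```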
1 code implementation • 28 Feb 2020 • Romina Abachi, Mohammad Ghavamzadeh, Amir-Massoud Farahmand
This is in contrast to conventional model learning approaches, such as those based on maximum likelihood estimation, which learn a predictive model of the environment without explicitly considering the interaction between the model and the planner.
no code implementations • ICLR 2020 • Yangchen Pan, Jincheng Mei, Amir-Massoud Farahmand
This suggests a search-control strategy: we should use states from high frequency regions of the value function to query the model to acquire more samples.
no code implementations • NeurIPS 2020 • Yangchen Pan, Ehsan Imani, Martha White, Amir-Massoud Farahmand
We empirically demonstrate on several synthetic problems that our method (i) can learn multi-valued functions and produce the conditional modes, (ii) scales well to high-dimensional inputs, and (iii) can even be more effective for certain uni-modal problems, particularly for high-frequency functions.
no code implementations • NeurIPS 2019 • Amir-Massoud Farahmand
We call the new representation the Characteristic Value Function (CVF), which can be interpreted as the frequency-domain representation of the probability distribution of returns.
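To make the frequency-domain view concrete: the characteristic function of the return distribution is phi(omega) = E[exp(i * omega * G)], estimable from sampled returns. The toy return model below is ours, not the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(0)

gamma = 0.9

# Sample discounted returns G = sum_t gamma^t * r_t from a toy reward stream.
returns = np.array([
    sum(gamma**t * r for t, r in enumerate(rng.normal(loc=1.0, size=50)))
    for _ in range(1000)
])

def char_fn(omega, samples):
    # Empirical characteristic function: mean of exp(i * omega * G).
    return np.mean(np.exp(1j * omega * samples))

omegas = np.linspace(-2.0, 2.0, 9)
phi = np.array([char_fn(w, returns) for w in omegas])
```

By construction phi(0) = 1 and |phi(omega)| <= 1, the defining properties of a characteristic function.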
no code implementations • 18 Jun 2019 • Yangchen Pan, Hengshuai Yao, Amir-Massoud Farahmand, Martha White
In this work, we propose to generate such states by using the trajectory obtained from hill climbing (HC) on the current estimate of the value function.
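The search-control idea can be sketched as gradient ascent on a value estimate: follow the gradient of V to collect states from high-value regions. The quadratic V below is a stand-in, not a learned estimate.

```python
import numpy as np

# Toy value estimate peaking at s = (1, 1).
def V(s):
    return -np.sum((s - 1.0) ** 2)

def grad_V(s):
    return -2.0 * (s - 1.0)

# Hill climbing: ascend V and keep the visited states for search control.
s = np.zeros(2)
trajectory = [s.copy()]
step = 0.1
for _ in range(50):
    s = s + step * grad_V(s)
    trajectory.append(s.copy())

values = [V(p) for p in trajectory]
```

The collected trajectory concentrates near the maximizer of the value estimate, which is exactly the region the model is then queried in.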
Tasks: Model-based Reinforcement Learning, Reinforcement Learning (RL)
no code implementations • ICLR 2019 • Marc T. Law, Jake Snell, Amir-Massoud Farahmand, Raquel Urtasun, Richard S. Zemel
Most deep learning models rely on expressive high-dimensional representations to achieve good performance on tasks such as classification.
no code implementations • 8 Mar 2019 • Mohamed Akrout, Amir-Massoud Farahmand, Tory Jarmain, Latif Abid
Moreover, accuracy increases by up to 10% compared to an approach that combines the visual information provided by the CNN with a conventional decision-tree-based QA system.
no code implementations • NeurIPS 2018 • Amir-Massoud Farahmand
This paper introduces a model-based reinforcement learning (MBRL) framework that incorporates the underlying decision problem in learning the transition model of the environment.
no code implementations • 15 Nov 2018 • Mohamed Akrout, Amir-Massoud Farahmand, Tory Jarmain
We present a skin condition classification methodology based on a sequential pipeline of a pre-trained Convolutional Neural Network (CNN) and a Question Answering (QA) model.
no code implementations • ICML 2018 • Yangchen Pan, Amir-Massoud Farahmand, Martha White, Saleh Nabi, Piyush Grover, Daniel Nikovski
Recent work has shown that reinforcement learning (RL) is a promising approach to controlling dynamical systems described by partial differential equations (PDEs).
no code implementations • NeurIPS 2017 • Amir-Massoud Farahmand, Sepideh Pourazarm, Daniel Nikovski
Different filters in RPFB extract different aspects of the time series, and together they provide a reasonably good summary of the time series.
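A rough sketch of the filter-bank idea: convolve the time series with several random filters and summarize each output as a feature. RPFB proper uses randomized stable autoregressive filters; the random FIR filters below are only an illustrative simplification.

```python
import numpy as np

rng = np.random.default_rng(0)

T, n_filters, width = 200, 8, 5
# Toy time series: a sinusoid plus noise.
series = np.sin(np.linspace(0, 10, T)) + 0.1 * rng.normal(size=T)

# Random FIR filter bank; each filter extracts a different aspect
# of the series, and the summaries together form the feature vector.
filters = rng.normal(size=(n_filters, width))
features = np.array([
    np.convolve(series, f, mode="valid").mean() for f in filters
])
```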
no code implementations • 6 Feb 2017 • Kota Hara, Ming-Yu Liu, Oncel Tuzel, Amir-Massoud Farahmand
We propose augmenting deep neural networks with an attention mechanism for the visual object detection task.
no code implementations • 2 Jul 2014 • Amir-Massoud Farahmand, Doina Precup, André M. S. Barreto, Mohammad Ghavamzadeh
We introduce a general classification-based approximate policy iteration (CAPI) framework, which encompasses a large class of algorithms that can exploit regularities of both the value function and the policy space, depending on what is advantageous.
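The classification view of policy improvement can be sketched as: estimate action values at sampled states, label each state with its greedy action, and fit a classifier as the policy. Every component below (the value estimate, the 1-nearest-neighbor classifier) is an illustrative stand-in for CAPI's actual choices.

```python
import numpy as np

rng = np.random.default_rng(0)

states = rng.uniform(-1, 1, size=(50, 1))

def q_estimate(s, a):
    # Hypothetical rollout-based estimate: action 1 is better when s > 0.
    return a * s[0] + 0.01 * rng.normal()

actions = np.array([0, 1])
# Label each sampled state with its greedy action.
labels = np.array([
    max(actions, key=lambda a: q_estimate(s, a)) for s in states
])

def policy(s):
    # The policy is a classifier; here, 1-NN over the labeled states.
    nearest = np.argmin(np.abs(states[:, 0] - s))
    return labels[nearest]
```

The classifier generalizes the greedy labels across the state space, which is where regularities of the policy space can be exploited.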
no code implementations • NeurIPS 2013 • Beomjoon Kim, Amir-Massoud Farahmand, Joelle Pineau, Doina Precup
We achieve this by integrating LfD in an approximate policy iteration algorithm.
no code implementations • NeurIPS 2013 • Mahdi Milani Fard, Yuri Grinberg, Amir-Massoud Farahmand, Joelle Pineau, Doina Precup
This paper addresses the problem of automatic generation of features for value function approximation in reinforcement learning.
no code implementations • NeurIPS 2011 • Amir-Massoud Farahmand
Many practitioners of reinforcement learning have observed that the agent's performance often comes very close to optimal even though the estimated (action-)value function is still far from the optimal one.
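A small illustration of this observation: as long as the value-estimation error is smaller than half the action gap, the greedy policy extracted from a quite inaccurate Q is still identical to the one extracted from Q*. The numbers below are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Rows: states, columns: actions. Action gap is 0.8 in both states.
Q_star = np.array([[1.0, 0.2],
                   [0.1, 0.9]])

# A noisy estimate with substantial error per entry.
Q_hat = Q_star + rng.uniform(-0.3, 0.3, size=Q_star.shape)

greedy_star = Q_star.argmax(axis=1)
greedy_hat = Q_hat.argmax(axis=1)     # unchanged despite the error
value_error = np.max(np.abs(Q_hat - Q_star))
```

Here the per-entry error can differ across actions by up to 0.6, still below the 0.8 gap, so the greedy policy is provably unaffected.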
no code implementations • NeurIPS 2010 • Amir-Massoud Farahmand, Csaba Szepesvári, Rémi Munos
We address the question of how the approximation error/Bellman residual at each iteration of the Approximate Policy/Value Iteration algorithms influences the quality of the resulting policy.