4 code implementations • NeurIPS 2018 • Mohammadreza Nazari, Afshin Oroojlooy, Lawrence V. Snyder, Martin Takáč
Our model represents a parameterized stochastic policy, and by applying a policy gradient algorithm to optimize its parameters, the trained model produces the solution as a sequence of consecutive actions in real time, without the need to re-train for every new problem instance.
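As a rough illustration of what "applying a policy gradient algorithm" to a parameterized stochastic policy means, the sketch below runs a REINFORCE-style update on a toy three-action problem. It is not the paper's model: the softmax-over-logits policy, the fixed reward table, the moving-average baseline, and every function name here are assumptions chosen only to keep the example small.

```python
import math
import random


def softmax(logits):
    """Turn raw logits into a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(probs, rng):
    """Draw one action index from the stochastic policy."""
    r = rng.random()
    cumulative = 0.0
    for i, p in enumerate(probs):
        cumulative += p
        if r < cumulative:
            return i
    return len(probs) - 1


def reinforce_step(logits, action, reward, baseline, lr):
    """One policy-gradient update.

    For a softmax policy, d/d(logit_k) log pi(action) = 1[k == action] - pi_k.
    """
    probs = softmax(logits)
    advantage = reward - baseline
    return [
        theta + lr * advantage * ((1.0 if k == action else 0.0) - probs[k])
        for k, theta in enumerate(logits)
    ]


def train(rewards, steps=2000, lr=0.1, seed=0):
    """Train on a toy problem where action k always yields rewards[k] (hypothetical setup)."""
    rng = random.Random(seed)
    logits = [0.0] * len(rewards)
    baseline = 0.0
    for _ in range(steps):
        action = sample_action(softmax(logits), rng)
        reward = rewards[action]
        logits = reinforce_step(logits, action, reward, baseline, lr)
        baseline += 0.05 * (reward - baseline)  # moving-average baseline
    return softmax(logits)


if __name__ == "__main__":
    print(train([0.0, 1.0, 0.2]))
```

Once trained, the policy is used by sampling (or taking the most probable) action step by step, which is the sense in which a solution is produced as a sequence of actions without re-training per instance.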
1 code implementation • 13 May 2019 • Hossein K. Mousavi, MohammadReza Nazari, Martin Takáč, Nader Motee
We investigate a classification problem using multiple mobile agents capable of collecting (partial) pose-dependent observations of an unknown environment.
no code implementations • 20 Aug 2017 • Afshin Oroojlooyjadid, MohammadReza Nazari, Lawrence Snyder, Martin Takáč
The game is a decentralized, multi-agent, cooperative problem that can be modeled as a serial supply chain network in which agents cooperatively attempt to minimize the total cost of the network even though each agent can only observe its own local information.
no code implementations • 30 May 2019 • Majid Jahani, MohammadReza Nazari, Sergey Rusakov, Albert S. Berahas, Martin Takáč
In this paper, we present a scalable distributed implementation of the Sampled Limited-memory Symmetric Rank-1 (S-LSR1) algorithm.
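For orientation, the classical symmetric rank-1 (SR1) update that the method's name refers to can be written as follows. This is the standard textbook form, with $s_k$ the step and $y_k$ the corresponding gradient difference; the sampled, limited-memory, and distributed aspects described in the paper are not reproduced here.

```latex
B_{k+1} = B_k + \frac{(y_k - B_k s_k)(y_k - B_k s_k)^{\top}}{(y_k - B_k s_k)^{\top} s_k},
\qquad s_k = x_{k+1} - x_k, \quad y_k = \nabla f(x_{k+1}) - \nabla f(x_k).
```

The update is usually skipped when the denominator $(y_k - B_k s_k)^{\top} s_k$ is close to zero, since the correction is then ill-defined.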
no code implementations • 30 May 2019 • Mohammadreza Nazari, Majid Jahani, Lawrence V. Snyder, Martin Takáč
Therefore, we propose a student-teacher RL mechanism in which the RL agent (the "student") learns to maximize its reward, subject to a constraint that bounds the difference between the RL policy and the "teacher" policy.
no code implementations • 6 Jun 2020 • Majid Jahani, MohammadReza Nazari, Rachael Tappenden, Albert S. Berahas, Martin Takáč
This work presents a new algorithm for empirical risk minimization.
no code implementations • NeurIPS 2020 • Afshin Oroojlooy, MohammadReza Nazari, Davood Hajinezhad, Jorge Silva
The first attention model is introduced to handle different numbers of road lanes, and the second attention model is intended to enable decision-making with any number of phases in an intersection.
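The reason attention suits a varying number of lanes or phases is that it pools any number of input vectors into one fixed-size vector. The sketch below shows generic scaled dot-product attention pooling in plain Python. It is an assumption-laden illustration, not the paper's architecture: `attention_pool`, the query vector, and the example lane features are all hypothetical.

```python
import math


def attention_pool(query, keys, values):
    """Pool a variable number of (key, value) pairs into one fixed-size vector.

    query:  list of floats, length d
    keys:   list of n vectors, each of length d (n may differ between calls)
    values: list of n vectors, each of length m
    Returns (pooled vector of length m, attention weights of length n).
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]

    # Softmax over however many inputs were supplied.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    dim = len(values[0])
    pooled = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return pooled, weights


if __name__ == "__main__":
    query = [1.0, 0.0]
    # Hypothetical per-lane features for an intersection with two lanes ...
    two_lanes = [[0.2, 0.1], [0.9, 0.4]]
    # ... and for another intersection with four lanes.
    four_lanes = [[0.2, 0.1], [0.9, 0.4], [0.5, 0.5], [0.0, 1.0]]
    print(attention_pool(query, two_lanes, two_lanes))
    print(attention_pool(query, four_lanes, four_lanes))
```

Both calls return a pooled vector of the same length, so downstream layers do not need to know how many lanes the intersection has.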