2 code implementations • ICLR 2022 • Maximilian Seitzer, Arash Tavakoli, Dimitrije Antic, Georg Martius
In this work, we examine this approach and identify potential hazards associated with the use of log-likelihood in conjunction with gradient-based optimizers.
1 code implementation • ICLR 2022 • Mehdi Fatemi, Arash Tavakoli
We present a general convergent class of reinforcement learning algorithms that is founded on two distinct principles: (1) mapping value estimates to a different space using arbitrary functions from a broad class, and (2) linearly decomposing the reward signal into multiple channels.
1 code implementation • ICLR 2021 • Arash Tavakoli, Mehdi Fatemi, Petar Kormushev
To test this, we set forth the action hypergraph networks framework -- a class of functions for learning action representations in multi-dimensional discrete action spaces with a structural inductive bias.
1 code implementation • 24 Jul 2019 • Tamás Kriváchy, Yu Cai, Daniel Cavalcanti, Arash Tavakoli, Nicolas Gisin, Nicolas Brunner
As such, the neural network acts as an oracle, demonstrating that a behavior is classical if it can be learned.
2 code implementations • NeurIPS 2019 • Harm van Seijen, Mehdi Fatemi, Arash Tavakoli
In an effort to better understand the different ways in which the discount factor affects the optimization process in reinforcement learning, we designed a set of experiments to study each effect in isolation.
no code implementations • 27 Nov 2018 • Arash Tavakoli, Vitaly Levdik, Riashat Islam, Christopher M. Smith, Petar Kormushev
We consider the generic approach of using an experience memory to help exploration by adapting a restart distribution.
1 code implementation • ICML 2018 • Fabio Pardo, Arash Tavakoli, Vitaly Levdik, Petar Kormushev
In case (ii), the time limits are not part of the environment and are only used to facilitate learning.
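One way to read case (ii) is sketched below: if a time limit only exists to facilitate learning, a cut-off at the limit is not a true terminal state, so the value target can still bootstrap from the next state. This is a minimal illustration under that assumption, not the authors' implementation; the function name `td_target` and its parameters are hypothetical.

```python
def td_target(reward: float, next_value: float, gamma: float, terminated: bool) -> float:
    """One-step TD target (hypothetical helper, for illustration only).

    `terminated` should be True only for a genuine environment termination.
    A transition cut off by a training-only time limit is passed with
    terminated=False, so the target still bootstraps from `next_value`.
    """
    if terminated:
        # True terminal state: no future return to bootstrap from.
        return reward
    # Non-terminal or time-limit cut-off: bootstrap from the next state's value.
    return reward + gamma * next_value


# A time-limit cut-off is treated like any non-terminal transition.
timeout_target = td_target(reward=1.0, next_value=10.0, gamma=0.5, terminated=False)
# A genuine termination drops the bootstrap term.
terminal_target = td_target(reward=1.0, next_value=10.0, gamma=0.5, terminated=True)
```

The design choice here is to keep the timeout signal out of the target computation entirely, so the learner never mistakes an artificial cut-off for the end of the task.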
5 code implementations • 24 Nov 2017 • Arash Tavakoli, Fabio Pardo, Petar Kormushev
This approach achieves a linear increase in the number of network outputs with the number of degrees of freedom by allowing a level of independence for each individual action dimension.
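The linear growth can be checked with simple arithmetic. The sketch below assumes each action dimension is discretized into the same number of sub-actions; the function names are hypothetical and only count outputs, they do not build a network.

```python
def joint_output_count(sub_actions_per_dim: int, num_dims: int) -> int:
    """Outputs needed when every combination of sub-actions gets its own output."""
    return sub_actions_per_dim ** num_dims


def per_dimension_output_count(sub_actions_per_dim: int, num_dims: int) -> int:
    """Outputs needed when each action dimension gets its own independent set of outputs."""
    return sub_actions_per_dim * num_dims


# Example: 6 degrees of freedom, 3 sub-actions each (illustrative numbers).
joint = joint_output_count(3, 6)                   # grows exponentially with num_dims
per_dimension = per_dimension_output_count(3, 6)   # grows linearly with num_dims
```

With these illustrative numbers, the joint enumeration needs 3^6 = 729 outputs while the per-dimension layout needs 3 × 6 = 18.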
no code implementations • 20 Apr 2016 • Arash Tavakoli, Haig Nalbandian, Nora Ayanian
This ability, if learned as a set of distributed multirobot coordination strategies, can enable programming large groups of robots to collaborate towards complex coordination objectives in a way similar to humans.