no code implementations • 11 Nov 2024 • Megh Thakkar, Yash More, Quentin Fournier, Matthew Riemer, Pin-Yu Chen, Amal Zouaq, Payel Das, Sarath Chandar
There is growing interest in training domain-expert LLMs that outperform their general-purpose instruction-tuned counterparts in specific technical fields.
no code implementations • 28 Oct 2022 • Dong-Ki Kim, Matthew Riemer, Miao Liu, Jakob N. Foerster, Gerald Tesauro, Jonathan P. How
By directly comparing active equilibria to Nash equilibria in these examples, we find that active equilibria yield more effective solutions, and conclude that the active equilibrium is the desired solution concept for multiagent learning settings.
1 code implementation • 7 Mar 2022 • Dong-Ki Kim, Matthew Riemer, Miao Liu, Jakob N. Foerster, Michael Everett, Chuangchuang Sun, Gerald Tesauro, Jonathan P. How
An effective approach that has recently emerged for addressing this non-stationarity is for each agent to anticipate the learning of other agents and influence the evolution of future policies towards desirable behavior for its own benefit.
1 code implementation • 13 Dec 2021 • Matthew Riemer, Sharath Chandra Raparthy, Ignacio Cases, Gopeshh Subbaraj, Maximilian Puelma Touzel, Irina Rish
The mixing time of the Markov chain induced by a policy limits performance in real-world continual learning scenarios.
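The mixing time in question is the number of steps until the state distribution of the policy-induced Markov chain is close to its stationary distribution. A minimal illustrative sketch, using a hypothetical two-state chain rather than anything from the paper:

```python
# Estimate the mixing time of a small Markov chain: the number of steps
# until the state distribution is within epsilon (total variation) of the
# stationary distribution, from the worst-case deterministic start state.

def total_variation(p, q):
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

def step(dist, P):
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

def mixing_time(P, epsilon=0.01, max_steps=10_000):
    n = len(P)
    # Approximate the stationary distribution by iterating from uniform
    # (adequate for the small ergodic chains used in this sketch).
    pi = [1.0 / n] * n
    for _ in range(max_steps):
        nxt = step(pi, P)
        if total_variation(pi, nxt) < 1e-12:
            break
        pi = nxt
    # Mixing time: worst case over deterministic starting states.
    worst = 0
    for s in range(n):
        dist = [0.0] * n
        dist[s] = 1.0
        t = 0
        while total_variation(dist, pi) > epsilon and t < max_steps:
            dist = step(dist, P)
            t += 1
        worst = max(worst, t)
    return worst

# A "sticky" chain mixes slowly; a fast-switching one mixes in one step.
slow = [[0.99, 0.01], [0.01, 0.99]]
fast = [[0.5, 0.5], [0.5, 0.5]]
print(mixing_time(slow), mixing_time(fast))
```

A policy whose induced chain mixes slowly takes many steps to expose the learner to all relevant states, which is the bottleneck the paper identifies for continual learning.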
1 code implementation • 20 Sep 2021 • Marwa Abdulhai, Dong-Ki Kim, Matthew Riemer, Miao Liu, Gerald Tesauro, Jonathan P. How
Hierarchical reinforcement learning has focused on discovering temporally extended actions, such as options, that can provide benefits in problems requiring extensive exploration.
no code implementations • 25 Dec 2020 • Khimya Khetarpal, Matthew Riemer, Irina Rish, Doina Precup
In this article, we aim to provide a literature review of different formulations and approaches to continual reinforcement learning (RL), also known as lifelong or non-stationary RL.
no code implementations • 23 Nov 2020 • Tyler Malloy, Tim Klinger, Miao Liu, Matthew Riemer, Gerald Tesauro, Chris R. Sims
This paper introduces an information-theoretic constraint on learned policy complexity in the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) reinforcement learning algorithm.
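One standard way to impose such an information-theoretic constraint, sketched here in a generic form rather than the MADDPG-specific one from the paper, is to penalize the KL divergence between the learned action distribution and a simple default policy (all names and the `beta` weight below are illustrative assumptions):

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for discrete distributions given as probability lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def constrained_objective(reward, policy, default_policy, beta=0.1):
    """Trade expected reward against policy complexity.

    Complexity is measured as the KL divergence between the learned
    policy and a state-independent default policy; beta sets how
    strongly the capacity limit binds.
    """
    return reward - beta * kl_divergence(policy, default_policy)

uniform = [0.25] * 4                   # maximally simple default policy
deterministic = [1.0, 0.0, 0.0, 0.0]   # fully committed, complex policy

# A deterministic policy pays a complexity cost that a uniform one avoids.
print(constrained_objective(1.0, deterministic, uniform))
print(constrained_objective(1.0, uniform, uniform))
```

Under this objective, an agent only commits to a sharp (high-complexity) policy when the reward gain outweighs the information cost.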
1 code implementation • 31 Oct 2020 • Dong-Ki Kim, Miao Liu, Matthew Riemer, Chuangchuang Sun, Marwa Abdulhai, Golnaz Habibi, Sebastian Lopez-Cot, Gerald Tesauro, Jonathan P. How
A fundamental challenge in multiagent reinforcement learning is to learn beneficial behaviors in a shared environment with other simultaneously learning agents.
no code implementations • 9 Oct 2020 • Tyler Malloy, Chris R. Sims, Tim Klinger, Miao Liu, Matthew Riemer, Gerald Tesauro
We focus on the model-free reinforcement learning (RL) setting and formalize our approach in terms of an information-theoretic constraint on the complexity of learned policies.
no code implementations • 16 Jun 2020 • Tim Klinger, Dhaval Adjodah, Vincent Marois, Josh Joseph, Matthew Riemer, Alex 'Sandy' Pentland, Murray Campbell
One difficulty in the development of such models is the lack of benchmarks with clear compositional and relational task structure on which to systematically evaluate them.
2 code implementations • 28 Apr 2020 • Cameron Allen, Michael Katz, Tim Klinger, George Konidaris, Matthew Riemer, Gerald Tesauro
Focused macros dramatically improve black-box planning efficiency across a wide range of planning domains, sometimes beating even state-of-the-art planners with access to a full domain model.
1 code implementation • 28 Jan 2020 • Modjtaba Shokrian Zini, Mohammad Pedramfar, Matthew Riemer, Ahmadreza Moradipari, Miao Liu
Coagent networks formalize the concept of arbitrary networks of stochastic agents that collaborate to take actions in a reinforcement learning environment.
no code implementations • 31 Dec 2019 • Matthew Riemer, Ignacio Cases, Clemens Rosenbaum, Miao Liu, Gerald Tesauro
In this work we note that while this key assumption of the policy gradient theorems of option-critic holds in the tabular case, it is always violated in practice for the deep function approximation setting.
no code implementations • 20 Nov 2019 • Akshay Dharmavaram, Matthew Riemer, Shalabh Bhatnagar
Option-critic learning is a general-purpose reinforcement learning (RL) framework that aims to address the issue of long-term credit assignment by leveraging temporal abstractions.
no code implementations • 25 Sep 2019 • Tyler James Malloy, Matthew Riemer, Miao Liu, Tim Klinger, Gerald Tesauro, Chris R. Sims
We formalize this type of bounded rationality in terms of an information-theoretic constraint on the complexity of policies that agents seek to learn.
no code implementations • NAACL 2019 • Ignacio Cases, Clemens Rosenbaum, Matthew Riemer, Atticus Geiger, Tim Klinger, Alex Tamkin, Olivia Li, Sandhini Agarwal, Joshua D. Greene, Dan Jurafsky, Christopher Potts, Lauri Karttunen
The model jointly optimizes the parameters of the functions and the meta-learner's policy for routing inputs through those functions.
1 code implementation • 29 Apr 2019 • Clemens Rosenbaum, Ignacio Cases, Matthew Riemer, Tim Klinger
Compositionality is a key strategy for addressing combinatorial complexity and the curse of dimensionality.
no code implementations • 19 Apr 2019 • Pouya Bashivan, Martin Schrimpf, Robert Ajemian, Irina Rish, Matthew Riemer, Yuhai Tu
Most previous approaches to this problem rely on memory replay buffers which store samples from previously learned tasks, and use them to regularize the learning on new ones.
no code implementations • 7 Mar 2019 • Dong-Ki Kim, Miao Liu, Shayegan Omidshafiei, Sebastian Lopez-Cot, Matthew Riemer, Golnaz Habibi, Gerald Tesauro, Sami Mourad, Murray Campbell, Jonathan P. How
Collective learning can be greatly enhanced when agents effectively exchange knowledge with their peers.
3 code implementations • ICLR 2019 • Matthew Riemer, Ignacio Cases, Robert Ajemian, Miao Liu, Irina Rish, Yuhai Tu, Gerald Tesauro
In this work we propose a new conceptualization of the continual learning problem in terms of a temporally symmetric trade-off between transfer and interference that can be optimized by enforcing gradient alignment across examples.
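The transfer-interference trade-off can be read off the inner product of per-example gradients: a positive dot product means learning one example helps the other (transfer), a negative one means it hurts (interference). A minimal sketch on a hypothetical one-parameter linear model, not the paper's actual meta-learning procedure:

```python
def gradient(w, example):
    """Gradient of squared error for a 1-parameter linear model y = w * x."""
    x, y = example
    return 2 * (w * x - y) * x

w = 0.0
a = (1.0, 2.0)   # pulls w upward
b = (2.0, 4.0)   # consistent with a: gradients align
c = (1.0, -2.0)  # contradicts a: gradients conflict

ga, gb, gc = gradient(w, a), gradient(w, b), gradient(w, c)
# For a scalar parameter, the gradient inner product is just a product.
print(ga * gb > 0)  # aligned gradients: learning a also helps b (transfer)
print(ga * gc < 0)  # conflicting gradients: learning a hurts c (interference)
```

Meta-Experience Replay (the paper's method) optimizes an objective that encourages these inner products to be positive across examples, so that updates transfer rather than interfere.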
no code implementations • NeurIPS 2018 • Matthew Riemer, Miao Liu, Gerald Tesauro
Building systems that autonomously create temporal abstractions from data is a key challenge in scaling learning and planning in reinforcement learning.
no code implementations • 17 Oct 2018 • Payel Das, Kahini Wadhawan, Oscar Chang, Tom Sercu, Cicero dos Santos, Matthew Riemer, Vijil Chenthamarakshan, Inkit Padhi, Aleksandra Mojsilovic
Our model learns a rich latent space of the biological peptide context by taking advantage of abundant, unlabeled peptide sequences.
no code implementations • 20 May 2018 • Shayegan Omidshafiei, Dong-Ki Kim, Miao Liu, Gerald Tesauro, Matthew Riemer, Christopher Amato, Murray Campbell, Jonathan P. How
The problem of teaching to improve agent learning has been investigated in prior work, but existing approaches either make assumptions that prevent applying teaching to general multiagent problems or require domain expertise for the problems they do cover.
no code implementations • 17 Nov 2017 • Matthew Riemer, Tim Klinger, Djallel Bouneffouf, Michele Franceschini
Given the recent success of Deep Learning applied to a variety of single tasks, it is natural to consider more human-realistic settings.
no code implementations • ICLR 2018 • Clemens Rosenbaum, Tim Klinger, Matthew Riemer
A routing network is a kind of self-organizing neural network consisting of two components: a router and a set of one or more function blocks.
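The router-plus-function-blocks idea can be sketched in a few lines. Here the router is a hand-written rule for illustration; in the paper it is a learned policy trained with reinforcement learning (the blocks and routing rule below are hypothetical):

```python
# A toy routing network: a router selects one function block per step,
# so different inputs trigger different compositions of computation.

function_blocks = [
    lambda x: x + 1,
    lambda x: x * 2,
    lambda x: -x,
]

def router(x, depth):
    """Choose a block index from the current input and depth.
    A learned router would replace this fixed rule with an RL policy."""
    return (int(x) + depth) % len(function_blocks)

def route(x, num_steps=3):
    for depth in range(num_steps):
        block = function_blocks[router(x, depth)]
        x = block(x)
    return x

print(route(1.0))
```

Because the router's choices depend on the input, the network effectively instantiates a different composed function for each example.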
no code implementations • 12 Apr 2017 • Matthew Riemer, Elham Khabiri, Richard Goodwin
We demonstrate our approach on a Twitter domain sentiment analysis task with sequential knowledge transfer from four related tasks.