Search Results for author: Gerald Tesauro

Found 30 papers, 13 papers with code

Game-Theoretical Perspectives on Active Equilibria: A Preferred Solution Concept over Nash Equilibria

no code implementations 28 Oct 2022 Dong-Ki Kim, Matthew Riemer, Miao Liu, Jakob N. Foerster, Gerald Tesauro, Jonathan P. How

By directly comparing active equilibria to Nash equilibria in these examples, we find that active equilibria find more effective solutions than Nash equilibria, concluding that an active equilibrium is the desired solution for multiagent learning settings.

Influencing Long-Term Behavior in Multiagent Reinforcement Learning

1 code implementation 7 Mar 2022 Dong-Ki Kim, Matthew Riemer, Miao Liu, Jakob N. Foerster, Michael Everett, Chuangchuang Sun, Gerald Tesauro, Jonathan P. How

An effective approach that has recently emerged for addressing this non-stationarity is for each agent to anticipate the learning of other agents and influence the evolution of future policies towards desirable behavior for its own benefit.

reinforcement-learning Reinforcement Learning (RL)

Context-Specific Representation Abstraction for Deep Option Learning

1 code implementation 20 Sep 2021 Marwa Abdulhai, Dong-Ki Kim, Matthew Riemer, Miao Liu, Gerald Tesauro, Jonathan P. How

Hierarchical reinforcement learning has focused on discovering temporally extended actions, such as options, that can provide benefits in problems requiring extensive exploration.

Hierarchical Reinforcement Learning

Decentralized TD Tracking with Linear Function Approximation and its Finite-Time Analysis

no code implementations NeurIPS 2020 Gang Wang, Songtao Lu, Georgios Giannakis, Gerald Tesauro, Jian Sun

The present contribution deals with decentralized policy evaluation in multi-agent Markov decision processes using temporal-difference (TD) methods with linear function approximation for scalability.

Consolidation via Policy Information Regularization in Deep RL for Multi-Agent Games

no code implementations 23 Nov 2020 Tyler Malloy, Tim Klinger, Miao Liu, Matthew Riemer, Gerald Tesauro, Chris R. Sims

This paper introduces an information-theoretic constraint on learned policy complexity in the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) reinforcement learning algorithm.

Continual Learning Continuous Control +2

A Policy Gradient Algorithm for Learning to Learn in Multiagent Reinforcement Learning

1 code implementation 31 Oct 2020 Dong-Ki Kim, Miao Liu, Matthew Riemer, Chuangchuang Sun, Marwa Abdulhai, Golnaz Habibi, Sebastian Lopez-Cot, Gerald Tesauro, Jonathan P. How

A fundamental challenge in multiagent reinforcement learning is to learn beneficial behaviors in a shared environment with other simultaneously learning agents.

reinforcement-learning Reinforcement Learning (RL)

Deep RL With Information Constrained Policies: Generalization in Continuous Control

no code implementations 9 Oct 2020 Tyler Malloy, Chris R. Sims, Tim Klinger, Miao Liu, Matthew Riemer, Gerald Tesauro

We focus on the model-free reinforcement learning (RL) setting and formalize our approach in terms of an information-theoretic constraint on the complexity of learned policies.

Continuous Control reinforcement-learning +1

Text-based RL Agents with Commonsense Knowledge: New Challenges, Environments and Approaches

no code implementations 12 Jul 2020 Keerthiram Murugesan, Mattia Atzeni, Pavan Kapanipathi, Pushkar Shukla, Sadhana Kumaravel, Gerald Tesauro, Kartik Talamadupula, Mrinmaya Sachan, Murray Campbell

We introduce a number of RL agents that combine the sequential context with a dynamic graph representation of their beliefs of the world and commonsense knowledge from ConceptNet in different ways.

Decision Making Reinforcement Learning (RL) +1

Efficient Black-Box Planning Using Macro-Actions with Focused Effects

2 code implementations 28 Apr 2020 Cameron Allen, Michael Katz, Tim Klinger, George Konidaris, Matthew Riemer, Gerald Tesauro

Focused macros dramatically improve black-box planning efficiency across a wide range of planning domains, sometimes beating even state-of-the-art planners with access to a full domain model.

On the Role of Weight Sharing During Deep Option Learning

no code implementations 31 Dec 2019 Matthew Riemer, Ignacio Cases, Clemens Rosenbaum, Miao Liu, Gerald Tesauro

In this work we note that while this key assumption of the policy gradient theorems of option-critic holds in the tabular case, it is always violated in practice for the deep function approximation setting.

Atari Games

CAPACITY-LIMITED REINFORCEMENT LEARNING: APPLICATIONS IN DEEP ACTOR-CRITIC METHODS FOR CONTINUOUS CONTROL

no code implementations 25 Sep 2019 Tyler James Malloy, Matthew Riemer, Miao Liu, Tim Klinger, Gerald Tesauro, Chris R. Sims

We formalize this type of bounded rationality in terms of an information-theoretic constraint on the complexity of policies that agents seek to learn.

Continuous Control reinforcement-learning +1

Hybrid Reinforcement Learning with Expert State Sequences

1 code implementation 11 Mar 2019 Xiaoxiao Guo, Shiyu Chang, Mo Yu, Gerald Tesauro, Murray Campbell

The empirical results show that (1) the agents are able to leverage state expert sequences to learn faster than pure reinforcement learning baselines, (2) our tensor-based action inference model is advantageous compared to standard deep neural networks in inferring expert actions, and (3) the hybrid policy optimization objective is robust against noise in expert state sequences.

Atari Games Imitation Learning +2

Learning to Learn without Forgetting by Maximizing Transfer and Minimizing Interference

3 code implementations ICLR 2019 Matthew Riemer, Ignacio Cases, Robert Ajemian, Miao Liu, Irina Rish, Yuhai Tu, Gerald Tesauro

In this work we propose a new conceptualization of the continual learning problem in terms of a temporally symmetric trade-off between transfer and interference that can be optimized by enforcing gradient alignment across examples.

Continual Learning Meta-Learning

Learning Abstract Options

no code implementations NeurIPS 2018 Matthew Riemer, Miao Liu, Gerald Tesauro

Building systems that autonomously create temporal abstractions from data is a key challenge in scaling learning and planning in reinforcement learning.

Learning to Teach in Cooperative Multiagent Reinforcement Learning

no code implementations 20 May 2018 Shayegan Omidshafiei, Dong-Ki Kim, Miao Liu, Gerald Tesauro, Matthew Riemer, Christopher Amato, Murray Campbell, Jonathan P. How

The problem of teaching to improve agent learning has been investigated by prior works, but these approaches make assumptions that prevent application of teaching to general multiagent problems, or require domain expertise for problems they can apply to.

reinforcement-learning Reinforcement Learning (RL)

Dialog-based Interactive Image Retrieval

1 code implementation NeurIPS 2018 Xiaoxiao Guo, Hui Wu, Yu Cheng, Steven Rennie, Gerald Tesauro, Rogerio Schmidt Feris

Experiments on both simulated and real-world data show that 1) our proposed learning framework achieves better accuracy than other supervised and reinforcement learning baselines and 2) user feedback based on natural language rather than pre-specified attributes leads to more effective retrieval results, and a more natural and expressive communication interface.

Image Retrieval reinforcement-learning +3

Faster Reinforcement Learning with Expert State Sequences

no code implementations ICLR 2018 Xiaoxiao Guo, Shiyu Chang, Mo Yu, Miao Liu, Gerald Tesauro

In this paper, we consider a realistic and more difficult scenario where a reinforcement learning agent only has access to the state sequences of an expert, while the expert actions are not available.

Imitation Learning reinforcement-learning +1

The Eigenoption-Critic Framework

no code implementations 11 Dec 2017 Miao Liu, Marlos C. Machado, Gerald Tesauro, Murray Campbell

Eigenoptions (EOs) have been recently introduced as a promising idea for generating a diverse set of options through the graph Laplacian, having been shown to allow efficient exploration.

Efficient Exploration Hierarchical Reinforcement Learning +1

Eigenoption Discovery through the Deep Successor Representation

1 code implementation ICLR 2018 Marlos C. Machado, Clemens Rosenbaum, Xiaoxiao Guo, Miao Liu, Gerald Tesauro, Murray Campbell

Options in reinforcement learning allow agents to hierarchically decompose a task into subtasks, having the potential to speed up learning and planning.

Atari Games reinforcement-learning +2

Robust Task Clustering for Deep Many-Task Learning

no code implementations 26 Aug 2017 Mo Yu, Xiaoxiao Guo, Jin-Feng Yi, Shiyu Chang, Saloni Potdar, Gerald Tesauro, Haoyu Wang, Bo-Wen Zhou

We propose a new method to measure task similarities with cross-task transfer performance matrix for the deep learning scenario.

Clustering Few-Shot Learning +7

Multiresolution Recurrent Neural Networks: An Application to Dialogue Response Generation

4 code implementations 2 Jun 2016 Iulian Vlad Serban, Tim Klinger, Gerald Tesauro, Kartik Talamadupula, Bo-Wen Zhou, Yoshua Bengio, Aaron Courville

We introduce the multiresolution recurrent neural network, which extends the sequence-to-sequence framework to model natural language generation as two parallel discrete stochastic processes: a sequence of high-level coarse tokens, and a sequence of natural language tokens.

Dialogue Generation Response Generation

Hierarchical Memory Networks

no code implementations 24 May 2016 Sarath Chandar, Sungjin Ahn, Hugo Larochelle, Pascal Vincent, Gerald Tesauro, Yoshua Bengio

In this paper, we explore a form of hierarchical memory network, which can be considered as a hybrid between hard and soft attention memory networks.

Hard Attention Question Answering

Selecting Near-Optimal Learners via Incremental Data Allocation

no code implementations 31 Dec 2015 Ashish Sabharwal, Horst Samulowitz, Gerald Tesauro

We study a novel machine learning (ML) problem setting of sequentially allocating small subsets of training data amongst a large set of classifiers.

Analysis of Watson's Strategies for Playing Jeopardy!

no code implementations 4 Feb 2014 Gerald Tesauro, David C. Gondek, Jonathan Lenchner, James Fan, John M. Prager

After giving a detailed description of each of our game-strategy algorithms, we then focus in particular on validating the accuracy of the simulator's predictions, and documenting performance improvements using our methods.

Decision Making General Knowledge +1
