Search Results for author: Roy Fox

Found 31 papers, 10 papers with code

A multi-agent control framework for co-adaptation in brain-computer interfaces

no code implementations NeurIPS 2013 Josh S. Merel, Roy Fox, Tony Jebara, Liam Paninski

In a closed-loop brain-computer interface (BCI), adaptive decoders are used to learn parameters suited to decoding the user's neural response.

Brain Computer Interface

Taming the Noise in Reinforcement Learning via Soft Updates

3 code implementations 28 Dec 2015 Roy Fox, Ari Pakman, Naftali Tishby

We propose G-learning, a new off-policy learning algorithm that regularizes the value estimates by penalizing deterministic policies early in the learning process.

Q-Learning reinforcement-learning +1
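The soft-update idea behind G-learning can be illustrated with a log-sum-exp value backup, which interpolates between averaging over actions (strong regularization, discouraging deterministic policies early on) and the hard max of standard Q-learning. This is a hedged sketch of the general mechanism, not the paper's G-learning implementation:

```python
import numpy as np

def soft_backup(q_values, beta):
    """Soft (log-sum-exp) state value over action values.

    As beta -> 0 this approaches the mean of q_values (a fully stochastic
    policy); as beta -> inf it approaches the max (a deterministic policy).
    """
    # Subtract the max before exponentiating for numerical stability.
    m = np.max(beta * q_values)
    return (m + np.log(np.mean(np.exp(beta * q_values - m)))) / beta
```

Annealing `beta` upward over training recovers the qualitative behavior described above: noisy early value estimates are averaged rather than maximized, taming the overestimation bias of the hard max.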

Optimal Selective Attention in Reactive Agents

no code implementations 29 Dec 2015 Roy Fox, Naftali Tishby

One attempt to deal with this is to focus on reactive policies, which base their actions only on the most recent observation.

Principled Option Learning in Markov Decision Processes

no code implementations 18 Sep 2016 Roy Fox, Michal Moshkovitz, Naftali Tishby

Among their many benefits, options are well known to make planning more efficient.

Information-Theoretic Methods for Planning and Learning in Partially Observable Markov Decision Processes

no code implementations 24 Sep 2016 Roy Fox

Bounded agents are limited by intrinsic constraints on their ability to process the information available in their sensors and memory, and to choose actions and memory updates.

Multi-Level Discovery of Deep Options

no code implementations 24 Mar 2017 Roy Fox, Sanjay Krishnan, Ion Stoica, Ken Goldberg

Augmenting an agent's control with useful higher-level behaviors called options can greatly reduce the sample complexity of reinforcement learning, but manually designing options is infeasible in high-dimensional and abstract state spaces.

DART: Noise Injection for Robust Imitation Learning

2 code implementations 27 Mar 2017 Michael Laskey, Jonathan Lee, Roy Fox, Anca Dragan, Ken Goldberg

One approach to Imitation Learning is Behavior Cloning, in which a robot observes a supervisor and infers a control policy.

Imitation Learning
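The Behavior Cloning setup described above amounts to plain supervised regression from observations to supervisor actions. The sketch below illustrates that baseline with a linear policy fit by gradient descent; it is a minimal illustration of Behavior Cloning itself, not DART, whose contribution is injecting noise into the supervisor's demonstrations:

```python
import numpy as np

def behavior_cloning(observations, expert_actions, lr=0.1, epochs=500):
    """Fit a linear policy a = W @ obs to supervisor demonstrations
    by minimizing mean squared error over the dataset."""
    n_obs = observations.shape[1]
    n_act = expert_actions.shape[1]
    W = np.zeros((n_act, n_obs))
    for _ in range(epochs):
        pred = observations @ W.T                        # (N, n_act)
        # Gradient of the mean squared error with respect to W.
        grad = 2.0 * (pred - expert_actions).T @ observations / len(observations)
        W -= lr * grad
    return W
```

In practice the linear map would be replaced by a neural network, but the failure mode DART targets is the same: small prediction errors compound at test time when the robot drifts away from the supervisor's state distribution.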

Fast and Reliable Autonomous Surgical Debridement with Cable-Driven Robots Using a Two-Phase Calibration Procedure

1 code implementation 19 Sep 2017 Daniel Seita, Sanjay Krishnan, Roy Fox, Stephen McKinley, John Canny, Ken Goldberg

In Phase II (fine), the bias from Phase I is applied to move the end-effector toward a small set of specific target points on a printed sheet.

Robotics

RLlib: Abstractions for Distributed Reinforcement Learning

3 code implementations ICML 2018 Eric Liang, Richard Liaw, Philipp Moritz, Robert Nishihara, Roy Fox, Ken Goldberg, Joseph E. Gonzalez, Michael I. Jordan, Ion Stoica

Reinforcement learning (RL) algorithms involve the deep nesting of highly irregular computation patterns, each of which typically exhibits opportunities for distributed computation.

reinforcement-learning Reinforcement Learning (RL)

Parametrized Hierarchical Procedures for Neural Programming

no code implementations ICLR 2018 Roy Fox, Richard Shin, Sanjay Krishnan, Ken Goldberg, Dawn Song, Ion Stoica

Neural programs are highly accurate and structured policies that perform algorithmic tasks by controlling the behavior of a computation mechanism.

Imitation Learning

Pipeline PSRO: A Scalable Approach for Finding Approximate Nash Equilibria in Large Games

2 code implementations NeurIPS 2020 Stephen McAleer, John Lanier, Roy Fox, Pierre Baldi

We also introduce an open-source environment for Barrage Stratego, a variant of Stratego with an approximate game tree complexity of $10^{50}$.

reinforcement-learning Reinforcement Learning (RL)

A* Search Without Expansions: Learning Heuristic Functions with Deep Q-Networks

no code implementations 8 Feb 2021 Forest Agostinelli, Alexander Shmakov, Stephen Mcaleer, Roy Fox, Pierre Baldi

We use Q* search to solve the Rubik's cube when formulated with a large action space that includes 1872 meta-actions. We find that this 157-fold increase in the size of the action space incurs less than a 4-fold increase in computation time and less than a 3-fold increase in the number of nodes generated.

Rubik's Cube

XDO: A Double Oracle Algorithm for Extensive-Form Games

1 code implementation NeurIPS 2021 Stephen Mcaleer, John Lanier, Kevin Wang, Pierre Baldi, Roy Fox

NXDO is the first deep RL method that can find an approximate Nash equilibrium in high-dimensional continuous-action sequential games.

Reinforcement Learning (RL)

Improving Social Welfare While Preserving Autonomy via a Pareto Mediator

no code implementations 7 Jun 2021 Stephen Mcaleer, John Lanier, Michael Dennis, Pierre Baldi, Roy Fox

Machine learning algorithms often make decisions on behalf of agents with varied and sometimes conflicting interests.

Open-Ended Question Answering

Modular Framework for Visuomotor Language Grounding

no code implementations 5 Sep 2021 Kolby Nottingham, Litian Liang, Daeyun Shin, Charless C. Fowlkes, Roy Fox, Sameer Singh

Natural language instruction following tasks serve as a valuable test-bed for grounded language and robotics research.

Instruction Following

Independent Natural Policy Gradient Always Converges in Markov Potential Games

no code implementations 20 Oct 2021 Roy Fox, Stephen Mcaleer, Will Overman, Ioannis Panageas

Recent results have shown that independent policy gradient converges in Markov potential games (MPGs), but it was not known whether Independent Natural Policy Gradient converges in MPGs as well.

Multi-agent Reinforcement Learning

Temporal-Difference Value Estimation via Uncertainty-Guided Soft Updates

no code implementations 28 Oct 2021 Litian Liang, Yaosheng Xu, Stephen Mcaleer, Dailin Hu, Alexander Ihler, Pieter Abbeel, Roy Fox

Under the belief that $\beta$ is closely related to the (state dependent) model uncertainty, Entropy Regularized Q-Learning (EQL) further introduces a principled scheduling of $\beta$ by maintaining a collection of the model parameters that characterizes model uncertainty.

Q-Learning Scheduling

Count-Based Temperature Scheduling for Maximum Entropy Reinforcement Learning

no code implementations 28 Nov 2021 Dailin Hu, Pieter Abbeel, Roy Fox

Maximum Entropy Reinforcement Learning (MaxEnt RL) algorithms such as Soft Q-Learning (SQL) and Soft Actor-Critic trade off reward and policy entropy, which has the potential to improve training stability and robustness.

Q-Learning reinforcement-learning +2
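The count-based scheduling idea can be sketched as a per-state temperature that decays with the visit count, so rarely visited states keep a large entropy bonus while familiar states become near-greedy. The `1/sqrt(count)` decay below is an illustrative choice, not necessarily the exact schedule from the paper:

```python
import math
from collections import defaultdict

class CountBasedTemperature:
    """Per-state MaxEnt RL temperature that anneals with visit counts.

    Each call to step(state) records one visit and returns the temperature
    to use for that state on this step.
    """
    def __init__(self, tau0=1.0):
        self.tau0 = tau0
        self.counts = defaultdict(int)

    def step(self, state):
        self.counts[state] += 1
        return self.tau0 / math.sqrt(self.counts[state])
```

A schedule like this plugs directly into soft Q-learning or SAC-style updates wherever a fixed temperature would otherwise appear.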

Target Entropy Annealing for Discrete Soft Actor-Critic

no code implementations 6 Dec 2021 Yaosheng Xu, Dailin Hu, Litian Liang, Stephen Mcaleer, Pieter Abbeel, Roy Fox

Soft Actor-Critic (SAC) is considered the state-of-the-art algorithm in continuous action space settings.

Atari Games Scheduling

Anytime PSRO for Two-Player Zero-Sum Games

no code implementations 19 Jan 2022 Stephen Mcaleer, Kevin Wang, John Lanier, Marc Lanctot, Pierre Baldi, Tuomas Sandholm, Roy Fox

PSRO is based on the tabular double oracle (DO) method, an algorithm that is guaranteed to converge to a Nash equilibrium, but may increase exploitability from one iteration to the next.

Multi-agent Reinforcement Learning reinforcement-learning +2

Learning to Query Internet Text for Informing Reinforcement Learning Agents

1 code implementation 25 May 2022 Kolby Nottingham, Alekhya Pyla, Sameer Singh, Roy Fox

We show that our method correctly learns to execute queries to maximize reward in a reinforcement learning setting.

reinforcement-learning Reinforcement Learning (RL)

Self-Play PSRO: Toward Optimal Populations in Two-Player Zero-Sum Games

no code implementations 13 Jul 2022 Stephen Mcaleer, JB Lanier, Kevin Wang, Pierre Baldi, Roy Fox, Tuomas Sandholm

Instead of adding only deterministic best responses to the opponent's least exploitable population mixture, SP-PSRO also learns an approximately optimal stochastic policy and adds it to the population as well.

Reinforcement Learning (RL)

Feasible Adversarial Robust Reinforcement Learning for Underspecified Environments

no code implementations 19 Jul 2022 JB Lanier, Stephen Mcaleer, Pierre Baldi, Roy Fox

In this paper, we propose Feasible Adversarial Robust RL (FARR), a novel problem formulation and objective for automatically determining the set of environment parameter values over which to be robust.

reinforcement-learning Reinforcement Learning (RL)

Reducing Variance in Temporal-Difference Value Estimation via Ensemble of Deep Networks

1 code implementation 16 Sep 2022 Litian Liang, Yaosheng Xu, Stephen Mcaleer, Dailin Hu, Alexander Ihler, Pieter Abbeel, Roy Fox

On a set of 26 benchmark Atari environments, MeanQ outperforms all tested baselines, including the best available baseline, SUNRISE, at 100K interaction steps in 16/26 environments, and by 68% on average.
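The variance-reduction mechanism behind MeanQ can be sketched as a TD target built from the mean of an ensemble of next-state value estimates: averaging K approximately independent estimates shrinks the target's variance. This is a hedged illustration of the idea, not the paper's implementation:

```python
import numpy as np

def mean_q_target(ensemble_q, rewards, gamma=0.99):
    """TD target using the ensemble mean of next-state action values.

    ensemble_q: array of shape (K, batch, n_actions) with each ensemble
    member's Q estimates for the next state.
    """
    mean_q = ensemble_q.mean(axis=0)             # (batch, n_actions)
    return rewards + gamma * mean_q.max(axis=1)  # greedy w.r.t. the mean
```

Each ensemble member is then regressed toward this shared, lower-variance target instead of its own bootstrapped estimate.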

Selective Perception: Optimizing State Descriptions with Reinforcement Learning for Language Model Actors

no code implementations 21 Jul 2023 Kolby Nottingham, Yasaman Razeghi, KyungMin Kim, JB Lanier, Pierre Baldi, Roy Fox, Sameer Singh

Large language models (LLMs) are being applied as actors for sequential decision making tasks in domains such as robotics and games, utilizing their general world knowledge and planning abilities.

Decision Making Language Modelling +2

Learning to Design Analog Circuits to Meet Threshold Specifications

1 code implementation 25 Jul 2023 Dmitrii Krylov, Pooya Khajeh, Junhan Ouyang, Thomas Reeves, Tongkai Liu, Hiba Ajmal, Hamidreza Aghasi, Roy Fox

In this work, we propose a method for generating, from simulation data, a dataset on which a system can be trained via supervised learning to design circuits that meet threshold specifications.

Skill Set Optimization: Reinforcing Language Model Behavior via Transferable Skills

no code implementations 5 Feb 2024 Kolby Nottingham, Bodhisattwa Prasad Majumder, Bhavana Dalvi Mishra, Sameer Singh, Peter Clark, Roy Fox

We evaluate our method in the classic videogame NetHack and the text environment ScienceWorld to demonstrate SSO's ability to optimize a set of skills and perform in-context policy improvement.

Decision Making Language Modelling +1

Moonwalk: Inverse-Forward Differentiation

no code implementations 22 Feb 2024 Dmitrii Krylov, Armin Karamzade, Roy Fox

Our method, Moonwalk, has a time complexity linear in the depth of the network, unlike the quadratic time complexity of naïve forward, and empirically reduces computation time by several orders of magnitude without allocating more memory.

Reinforcement Learning from Delayed Observations via World Models

no code implementations 18 Mar 2024 Armin Karamzade, KyungMin Kim, Montek Kalsi, Roy Fox

In standard Reinforcement Learning settings, agents typically assume immediate feedback about the effects of their actions after taking them.

reinforcement-learning
