no code implementations • 17 Apr 2024 • Rishabh Agarwal, Avi Singh, Lei M. Zhang, Bernd Bohnet, Stephanie Chan, Ankesh Anand, Zaheer Abbas, Azade Nova, John D. Co-Reyes, Eric Chu, Feryal Behbahani, Aleksandra Faust, Hugo Larochelle
Finally, we demonstrate that, unlike few-shot learning, many-shot learning is effective at overriding pretraining biases and can learn high-dimensional functions with numerical inputs.
no code implementations • 6 Mar 2024 • Jesse Farebrother, Jordi Orbay, Quan Vuong, Adrien Ali Taïga, Yevgen Chebotar, Ted Xiao, Alex Irpan, Sergey Levine, Pablo Samuel Castro, Aleksandra Faust, Aviral Kumar, Rishabh Agarwal
Observing this discrepancy, in this paper, we investigate whether the scalability of deep RL can also be improved simply by using classification in place of regression for training value functions.
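As a rough illustration of what "classification in place of regression" can look like for a value function, the sketch below turns a scalar value target into a categorical target over fixed bins and scores it with cross-entropy. The two-hot encoding, the bin range, and all function names here are assumptions chosen for illustration; the abstract above does not say which target encoding the paper uses.

```python
import math

def two_hot(target, v_min, v_max, num_bins):
    """Spread a scalar target over the two nearest bin centers (assumed encoding)."""
    # Bin centers are evenly spaced on [v_min, v_max]; num_bins must be >= 2.
    step = (v_max - v_min) / (num_bins - 1)
    clipped = min(max(target, v_min), v_max)
    pos = (clipped - v_min) / step
    lower = min(int(math.floor(pos)), num_bins - 2)
    upper_weight = pos - lower
    probs = [0.0] * num_bins
    probs[lower] = 1.0 - upper_weight
    probs[lower + 1] = upper_weight
    return probs

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target_probs):
    """Classification loss that replaces the squared error of regression."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(target_probs, probs) if t > 0.0)

def expected_value(logits, v_min, v_max):
    """Recover a scalar value estimate as the mean of the predicted distribution."""
    step = (v_max - v_min) / (len(logits) - 1)
    return sum(p * (v_min + i * step) for i, p in enumerate(softmax(logits)))
```

In this sketch the network would output one logit per bin, be trained with `cross_entropy` against `two_hot(target, ...)`, and be read out with `expected_value` whenever a scalar value is needed.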
no code implementations • 8 Feb 2024 • John D. Co-Reyes, Yingjie Miao, George Tucker, Aleksandra Faust, Esteban Real
How to automatically design better machine learning programs is an open problem within AutoML.
1 code implementation • 30 Nov 2023 • Hiroki Furuta, Yutaka Matsuo, Aleksandra Faust, Izzeddin Gur
We show that while existing prompted LMAs (gpt-3.5-turbo or gpt-4) achieve 94.0% average success rate on base tasks, their performance degrades to 24.9% success rate on compositional tasks.
no code implementations • 4 Nov 2023 • Meredith Ringel Morris, Jascha Sohl-Dickstein, Noah Fiedel, Tris Warkentin, Allan Dafoe, Aleksandra Faust, Clement Farabet, Shane Legg
With these principles in mind, we propose 'Levels of AGI' based on depth (performance) and breadth (generality) of capabilities, and reflect on how current systems fit into this ontology.
no code implementations • 24 Jul 2023 • Izzeddin Gur, Hiroki Furuta, Austin Huang, Mustafa Safdari, Yutaka Matsuo, Douglas Eck, Aleksandra Faust
Pre-trained large language models (LLMs) have recently achieved better generalization and sample efficiency in autonomous web automation.
Ranked #1 on Mind2Web
1 code implementation • 1 Jul 2023 • Greg Serapio-García, Mustafa Safdari, Clément Crepy, Luning Sun, Stephen Fitz, Peter Romero, Marwa Abdulhai, Aleksandra Faust, Maja Matarić
The advent of large language models (LLMs) has revolutionized natural language processing, enabling the generation of coherent and contextually relevant human-like text.
no code implementations • 15 Jun 2023 • Srivatsan Krishnan, Amir Yazdanbaksh, Shvetank Prakash, Jason Jabbour, Ikechukwu Uchendu, Susobhan Ghosh, Behzad Boroujerdian, Daniel Richins, Devashree Tripathy, Aleksandra Faust, Vijay Janapa Reddi
The ease of data collection and aggregation in ArchGym facilitates research in ML-aided architecture design space exploration.
no code implementations • 19 May 2023 • Hiroki Furuta, Kuang-Huei Lee, Ofir Nachum, Yutaka Matsuo, Aleksandra Faust, Shixiang Shane Gu, Izzeddin Gur
The progress of autonomous web navigation has been hindered by the dependence on billions of exploratory interactions via online reinforcement learning, and domain-specific model designs that make it difficult to leverage generalization from rich out-of-domain data.
no code implementations • 2 Apr 2023 • Anish Muthali, Haotian Shen, Sampada Deglurkar, Michael H. Lim, Rebecca Roelofs, Aleksandra Faust, Claire Tomlin
We investigate methods to provide safety assurances for autonomous agents that incorporate predictions of other, uncontrolled agents' behavior into their own trajectory planning.
no code implementations • 21 Dec 2022 • Yiren Lu, Justin Fu, George Tucker, Xinlei Pan, Eli Bronstein, Rebecca Roelofs, Benjamin Sapp, Brandyn White, Aleksandra Faust, Shimon Whiteson, Dragomir Anguelov, Sergey Levine
To our knowledge, this is the first application of a combined imitation and reinforcement learning approach in autonomous driving that utilizes large amounts of real-world human driving data.
no code implementations • 29 Nov 2022 • Srivatsan Krishnan, Natasha Jaques, Shayegan Omidshafiei, Dan Zhang, Izzeddin Gur, Vijay Janapa Reddi, Aleksandra Faust
It is unclear how scalable single-agent formulations are as we increase the complexity of the design space (e.g., full-stack System-on-Chip design).
1 code implementation • 19 Oct 2022 • Abdus Salam Azad, Izzeddin Gur, Jasper Emhoff, Nathaniel Alexis, Aleksandra Faust, Pieter Abbeel, Ion Stoica
Recently, Unsupervised Environment Design (UED) emerged as a new paradigm for zero-shot generalization by simultaneously learning a task distribution and agent policies on the generated tasks.
no code implementations • 8 Oct 2022 • Izzeddin Gur, Ofir Nachum, Yingjie Miao, Mustafa Safdari, Austin Huang, Aakanksha Chowdhery, Sharan Narang, Noah Fiedel, Aleksandra Faust
We contribute HTML understanding models (fine-tuned LLMs) and an in-depth analysis of their capabilities under three tasks: (i) Semantic Classification of HTML elements, (ii) Description Generation for HTML inputs, and (iii) Autonomous Web Navigation of HTML pages.
no code implementations • 25 May 2022 • Sungryull Sohn, Hyunjae Woo, Jongwook Choi, Lyubing Qiang, Izzeddin Gur, Aleksandra Faust, Honglak Lee
Unlike previous meta-RL methods that try to directly infer an unstructured task embedding, our multi-task subtask graph inferencer (MTSGI) first infers the common high-level task structure, in the form of a subtask graph, from the training tasks, and then uses it as a prior to improve task inference at test time.
Hierarchical Reinforcement Learning • Meta Reinforcement Learning
no code implementations • 11 May 2022 • Sabrina M. Neuman, Brian Plancher, Bardienus P. Duisterhof, Srivatsan Krishnan, Colby Banbury, Mark Mazumder, Shvetank Prakash, Jason Jabbour, Aleksandra Faust, Guido C. H. E. de Croon, Vijay Janapa Reddi
Machine learning (ML) has become a pervasive tool across computing systems.
no code implementations • 8 Apr 2022 • Juan Jose Garau-Luis, Yingjie Miao, John D. Co-Reyes, Aaron Parisi, Jie Tan, Esteban Real, Aleksandra Faust
Generalizability and stability are two key objectives for operating reinforcement learning (RL) agents in the real world.
1 code implementation • NeurIPS 2021 • Izzeddin Gur, Natasha Jaques, Yingjie Miao, Jongwook Choi, Manoj Tiwari, Honglak Lee, Aleksandra Faust
We learn to generate environments composed of multiple pages or rooms, and train RL agents capable of completing a wide range of complex tasks in those environments.
no code implementations • 11 Jan 2022 • Jack Parker-Holder, Raghu Rajan, Xingyou Song, André Biedenkapp, Yingjie Miao, Theresa Eimer, Baohe Zhang, Vu Nguyen, Roberto Calandra, Aleksandra Faust, Frank Hutter, Marius Lindauer
The combination of Reinforcement Learning (RL) with deep learning has led to a series of impressive feats, with many believing (deep) RL provides a path towards generally capable agents.
1 code implementation • 17 Dec 2021 • Sampada Deglurkar, Michael H. Lim, Johnathan Tucker, Zachary N. Sunberg, Aleksandra Faust, Claire J. Tomlin
The Partially Observable Markov Decision Process (POMDP) is a powerful framework for capturing decision-making problems that involve state and transition uncertainty.
no code implementations • CVPR 2022 • Su Wang, Ceslee Montgomery, Jordi Orbay, Vighnesh Birodkar, Aleksandra Faust, Izzeddin Gur, Natasha Jaques, Austin Waters, Jason Baldridge, Peter Anderson
We study the automatic generation of navigation instructions from 360-degree images captured on indoor routes.
no code implementations • 29 Sep 2021 • Izzeddin Gur, Ofir Nachum, Aleksandra Faust
We formalize our approach as offline targeted environment design (OTED), which automatically learns a distribution over simulator parameters to match a provided offline dataset, and then uses the learned simulator to train an RL agent in a standard online fashion.
no code implementations • 15 Sep 2021 • Michael H. Lim, Andy Zeng, Brian Ichter, Maryam Bandari, Erwin Coumans, Claire Tomlin, Stefan Schaal, Aleksandra Faust
Enabling robots to solve multiple manipulation tasks has a wide range of industrial applications.
no code implementations • ICML Workshop URL 2021 • Alberto Camacho, Izzeddin Gur, Marcin Lukasz Moczulski, Ofir Nachum, Aleksandra Faust
We are concerned with a setting where the demonstrations comprise only a subset of state-action pairs (as opposed to the whole trajectories).
1 code implementation • 4 Jun 2021 • Yingjie Miao, Xingyou Song, John D. Co-Reyes, Daiyi Peng, Summer Yue, Eugene Brevdo, Aleksandra Faust
In this paper, we investigate the fundamental question: To what extent are gradient-based neural architecture search (NAS) techniques applicable to RL?
no code implementations • 15 Apr 2021 • Dennis Lee, Natasha Jaques, Chase Kew, Jiaxing Wu, Douglas Eck, Dale Schuurmans, Aleksandra Faust
We then train agents to minimize the difference between the attention weights that they apply to the environment at each timestep, and the attention of other agents.
1 code implementation • 2 Mar 2021 • Izzeddin Gur, Natasha Jaques, Kevin Malta, Manoj Tiwari, Honglak Lee, Aleksandra Faust
The regret objective trains the adversary to design a curriculum of environments that are "just-the-right-challenge" for the navigator agents; our results show that over time, the adversary learns to generate increasingly complex web navigation tasks.
no code implementations • 5 Feb 2021 • Srivatsan Krishnan, Zishen Wan, Kshitij Bhardwaj, Paul Whatmough, Aleksandra Faust, Sabrina Neuman, Gu-Yeon Wei, David Brooks, Vijay Janapa Reddi
Balancing a computing system for a UAV requires considering both the cyber (e.g., sensor rate, compute performance) and physical (e.g., payload weight) characteristics that affect overall performance.
5 code implementations • ICLR 2021 • John D. Co-Reyes, Yingjie Miao, Daiyi Peng, Esteban Real, Sergey Levine, Quoc V. Le, Honglak Lee, Aleksandra Faust
Learning from scratch on simple classical control and gridworld tasks, our method rediscovers the temporal-difference (TD) algorithm.
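For readers unfamiliar with the temporal-difference (TD) algorithm mentioned above, here is a minimal tabular TD(0) value update. It is a textbook sketch only, not the representation or search procedure used by the paper; the function name, the dictionary-based value table, and the default `alpha` and `gamma` are illustrative assumptions.

```python
def td0_update(values, state, reward, next_state, done, alpha=0.1, gamma=0.99):
    """Apply one tabular TD(0) update in place and return the TD error.

    values: dict mapping state -> current value estimate (hypothetical table).
    done:   True if next_state is terminal, in which case nothing is bootstrapped.
    """
    bootstrap = 0.0 if done else values[next_state]
    # TD error: one-step bootstrapped target minus the current estimate.
    td_error = reward + gamma * bootstrap - values[state]
    # Move the estimate a fraction alpha toward the target.
    values[state] += alpha * td_error
    return td_error

# Usage on a two-state chain: state 0 -> state 1 -> terminal (reward 1.0).
values = {0: 0.0, 1: 0.0}
td0_update(values, 1, 1.0, None, True, alpha=0.5, gamma=0.9)
td0_update(values, 0, 0.0, 1, False, alpha=0.5, gamma=0.9)
```

After these two updates the reward observed at state 1 has started to propagate back to state 0 through the bootstrapped target.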
1 code implementation • 20 Mar 2020 • Varun Tolani, Somil Bansal, Aleksandra Faust, Claire Tomlin
Videos describing our approach and experiments, as well as a demo of HumANav are available on the project website.
no code implementations • 15 Mar 2020 • Rose E. Wang, J. Chase Kew, Dennis Lee, Tsang-Wei Edward Lee, Tingnan Zhang, Brian Ichter, Jie Tan, Aleksandra Faust
We propose hierarchical predictive planning (HPP), a model-based reinforcement learning method for decentralized multiagent rendezvous.
Model-based Reinforcement Learning • Reinforcement Learning
no code implementations • 14 Oct 2019 • J. Chase Kew, Brian Ichter, Maryam Bandari, Tsang-Wei Edward Lee, Aleksandra Faust
We present a neural network collision checking heuristic, ClearanceNet, and a planning algorithm, CN-RRT.
no code implementations • 8 Oct 2019 • Brian Ichter, Edward Schmerling, Tsang-Wei Edward Lee, Aleksandra Faust
Critical PRMs are demonstrated to achieve up to three orders of magnitude improvement over uniform sampling, while preserving the guarantees and complexity of sampling-based motion planning.
1 code implementation • 2 Oct 2019 • Srivatsan Krishnan, Maximilian Lam, Sharad Chitlangia, Zishen Wan, Gabriel Barth-Maron, Aleksandra Faust, Vijay Janapa Reddi
We believe that this is the first of many future works on enabling computationally energy-efficient and sustainable reinforcement learning.
no code implementations • 27 Sep 2019 • Xinlei Pan, Tingnan Zhang, Brian Ichter, Aleksandra Faust, Jie Tan, Sehoon Ha
Here, we propose a zero-shot imitation learning approach for training a visual navigation policy on legged robots from human (third-person perspective) demonstrations, enabling high-quality navigation and cost-effective data collection.
1 code implementation • 25 Sep 2019 • Bardienus P. Duisterhof, Srivatsan Krishnan, Jonathan J. Cruz, Colby R. Banbury, William Fu, Aleksandra Faust, Guido C. H. E. de Croon, Vijay Janapa Reddi
We present fully autonomous source seeking onboard a highly constrained nano quadcopter, by contributing application-specific system and observation feature design to enable inference of a deep-RL policy onboard a nano quadcopter.
no code implementations • 25 Sep 2019 • Yinlam Chow, Ofir Nachum, Aleksandra Faust, Edgar Duenez-Guzman, Mohammad Ghavamzadeh
We study continuous action reinforcement learning problems in which it is crucial that the agent interacts with the environment only through safe policies, i.e., policies that keep the agent in desirable situations, both during training and at convergence.
no code implementations • 10 Jul 2019 • Hao-Tien Lewis Chiang, Jasmine Hsu, Marek Fiser, Lydia Tapia, Aleksandra Faust
By combining a sampling-based planner, a Rapidly Exploring Randomized Tree (RRT), with an efficient kinodynamic motion planner obtained through machine learning, we propose an efficient solution to long-range kinodynamic motion planning.
1 code implementation • 2 Jun 2019 • Srivatsan Krishnan, Behzad Boroujerdian, William Fu, Aleksandra Faust, Vijay Janapa Reddi
We find that the trajectories on an embedded Ras-Pi are vastly different from those predicted on a high-end desktop system, resulting in up to 40% longer trajectories in one of the environments.
no code implementations • 18 May 2019 • Aleksandra Faust, Anthony Francis, Dar Mehta
Many continuous control tasks have easily formulated objectives, yet using them directly as a reward in reinforcement learning (RL) leads to suboptimal policies.
no code implementations • 25 Feb 2019 • Anthony Francis, Aleksandra Faust, Hao-Tien Lewis Chiang, Jasmine Hsu, J. Chase Kew, Marek Fiser, Tsang-Wei Edward Lee
Long-range indoor navigation requires guiding robots with noisy sensors and controls through cluttered environments along paths that span a variety of buildings.
1 code implementation • 28 Jan 2019 • Yin-Lam Chow, Ofir Nachum, Aleksandra Faust, Edgar Duenez-Guzman, Mohammad Ghavamzadeh
We formulate these problems as constrained Markov decision processes (CMDPs) and present safe policy optimization algorithms that are based on a Lyapunov approach to solve them.
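For context, a constrained Markov decision process is commonly written as an optimization over policies with a bound on an expected cumulative constraint cost. The discounted form below is a generic textbook sketch; the symbols $c$ (objective cost), $d$ (constraint cost), $d_0$ (threshold), and $\gamma$ (discount factor) are assumed notation, and the paper's exact formulation may differ.

```latex
\begin{aligned}
\min_{\pi} \quad & \mathbb{E}_{\pi}\Big[\sum_{t=0}^{\infty} \gamma^{t}\, c(s_t, a_t)\Big] \\
\text{s.t.} \quad & \mathbb{E}_{\pi}\Big[\sum_{t=0}^{\infty} \gamma^{t}\, d(s_t)\Big] \le d_0
\end{aligned}
```

A policy is feasible when it satisfies the constraint, and a safe policy optimization algorithm aims to keep every policy it produces feasible, during training as well as at convergence.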
no code implementations • ICLR 2019 • Izzeddin Gur, Ulrich Rueckert, Aleksandra Faust, Dilek Hakkani-Tur
Even though recent approaches improve the success rate on relatively simple environments with the help of human demonstrations to guide the exploration, they still fail in environments where the set of possible instructions can reach millions.
no code implementations • 26 Sep 2018 • Hao-Tien Lewis Chiang, Aleksandra Faust, Marek Fiser, Anthony Francis
The policies are trained in small, static environments with AutoRL, an evolutionary automation layer around Reinforcement Learning (RL) that searches for a deep RL reward and neural network architecture with large-scale hyper-parameter optimization.
no code implementations • 25 Sep 2018 • Aleksandra Faust, James B. Aimone, Conrad D. James, Lydia Tapia
Robots and autonomous agents often complete goal-based tasks with limited resources, relying on imperfect models and sensor measurements.
no code implementations • 30 May 2018 • Lazar Supic, Rawan Naous, Ranko Sredojevic, Aleksandra Faust, Vladimir Stojanovic
Deep neural networks (DNNs) have become the state-of-the-art technique for machine learning tasks in various applications.
no code implementations • 29 May 2018 • Hao-Tien Lewis Chiang, Aleksandra Faust, Lydia Tapia
Swept Volume (SV), the volume displaced by an object when it is moving along a trajectory, is considered a useful metric for motion planning.
no code implementations • 11 Oct 2017 • Aleksandra Faust, Oscar Ramirez, Marek Fiser, Kenneth Oslund, Anthony Francis, James Davidson, Lydia Tapia
The RL agents learn short-range, point-to-point navigation policies that capture robot dynamics and task constraints without knowledge of the large-scale topology.