Safe exploration is critical for using reinforcement learning (RL) in risk-sensitive environments.
Permutations then serve as target generation orders for training an insertion-based Transformer language model.
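To make this concrete, the mapping from a permutation to insertion-training targets can be sketched as follows. This is an illustrative helper (the function name and representation are assumptions, not the paper's API): given the target tokens and a generation-order permutation, each step emits the token to insert together with its slot among the tokens generated so far.

```python
def insertion_targets(tokens, order):
    """Convert a generation-order permutation into (token, slot) pairs.

    At each step, the slot is the token's position among the tokens
    already generated, in left-to-right surface order.
    """
    placed = []   # original indices already generated, kept sorted
    targets = []
    for idx in order:
        # Slot = number of already-placed tokens to the left of idx.
        slot = sum(1 for j in placed if j < idx)
        targets.append((tokens[idx], slot))
        placed.append(idx)
        placed.sort()
    return targets

# "a b c d" generated in the order c, a, d, b:
steps = insertion_targets(["a", "b", "c", "d"], [2, 0, 3, 1])
# → [('c', 0), ('a', 0), ('d', 2), ('b', 1)]
```

Each (token, slot) pair is one supervised target for the insertion model; a left-to-right permutation recovers ordinary autoregressive targets as a special case.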
First-order methods for quadratic programming, as implemented in solvers such as OSQP, are widely used in large-scale machine learning and embedded optimal control, where many related problems must be solved in rapid succession.
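As a minimal illustration of the first-order idea (not OSQP's ADMM algorithm itself), the sketch below solves a box-constrained quadratic program, minimizing 0.5 xᵀPx + qᵀx subject to l ≤ x ≤ u, by projected gradient descent; the cheap per-iteration cost and the ease of warm-starting from a previous solution are what make this family attractive when many related problems arrive in sequence.

```python
import numpy as np

def box_qp_projected_gradient(P, q, l, u, steps=500):
    """Minimize 0.5 x'Px + q'x subject to l <= x <= u
    via projected gradient descent (illustrative, dense, unoptimized)."""
    x = np.clip(np.zeros_like(q), l, u)
    # Step size from the largest eigenvalue of P (Lipschitz constant
    # of the gradient); assumes P is symmetric positive definite.
    t = 1.0 / np.linalg.eigvalsh(P).max()
    for _ in range(steps):
        # Gradient step on the quadratic, then project onto the box.
        x = np.clip(x - t * (P @ x + q), l, u)
    return x

# Tiny example: minimize 0.5*(x0^2 + x1^2) - x0 - x1 with 0 <= x <= 0.4.
P = np.eye(2)
q = np.array([-1.0, -1.0])
x = box_qp_projected_gradient(P, q, l=np.zeros(2), u=0.4 * np.ones(2))
# Unconstrained optimum is (1, 1); the box constraint clips it to (0.4, 0.4).
```

To warm-start a related problem, one would simply initialize `x` at the previous solution instead of zero, typically cutting the iteration count substantially.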
Corrective interventions while a robot is learning to automate a task provide an intuitive method for a human supervisor to assist the robot and convey information about desired behavior.
One strategy to recover this information is to decode both the content and location of tokens.
This work connects the context-sensitive nature of cognitive control to a method for meta-learning with context-conditioned adaptation.
Researchers and practitioners in the field of reinforcement learning (RL) frequently leverage parallel computation, which has led to a plethora of new algorithms and systems in the last few years.
Safety remains a central obstacle preventing widespread use of RL in the real world: learning new tasks in uncertain environments requires extensive exploration, but safety requires limiting exploration.
To address this, we propose IMPACT, a new distributed reinforcement learning algorithm.