Search Results for author: Kelvin Xu

Found 21 papers, 11 papers with code

LMRL Gym: Benchmarks for Multi-Turn Reinforcement Learning with Language Models

1 code implementation30 Nov 2023 Marwa Abdulhai, Isadora White, Charlie Snell, Charles Sun, Joey Hong, Yuexiang Zhai, Kelvin Xu, Sergey Levine

Developing such algorithms requires tasks that can gauge progress on algorithm design, provide accessible and reproducible evaluations for multi-turn interactions, and cover a range of task properties and challenges in improving reinforcement learning algorithms.

reinforcement-learning Text Generation

PaLM 2 Technical Report

1 code implementation17 May 2023 Rohan Anil, Andrew M. Dai, Orhan Firat, Melvin Johnson, Dmitry Lepikhin, Alexandre Passos, Siamak Shakeri, Emanuel Taropa, Paige Bailey, Zhifeng Chen, Eric Chu, Jonathan H. Clark, Laurent El Shafey, Yanping Huang, Kathy Meier-Hellstern, Gaurav Mishra, Erica Moreira, Mark Omernick, Kevin Robinson, Sebastian Ruder, Yi Tay, Kefan Xiao, Yuanzhong Xu, Yujing Zhang, Gustavo Hernandez Abrego, Junwhan Ahn, Jacob Austin, Paul Barham, Jan Botha, James Bradbury, Siddhartha Brahma, Kevin Brooks, Michele Catasta, Yong Cheng, Colin Cherry, Christopher A. Choquette-Choo, Aakanksha Chowdhery, Clément Crepy, Shachi Dave, Mostafa Dehghani, Sunipa Dev, Jacob Devlin, Mark Díaz, Nan Du, Ethan Dyer, Vlad Feinberg, Fangxiaoyu Feng, Vlad Fienber, Markus Freitag, Xavier Garcia, Sebastian Gehrmann, Lucas Gonzalez, Guy Gur-Ari, Steven Hand, Hadi Hashemi, Le Hou, Joshua Howland, Andrea Hu, Jeffrey Hui, Jeremy Hurwitz, Michael Isard, Abe Ittycheriah, Matthew Jagielski, Wenhao Jia, Kathleen Kenealy, Maxim Krikun, Sneha Kudugunta, Chang Lan, Katherine Lee, Benjamin Lee, Eric Li, Music Li, Wei Li, Yaguang Li, Jian Li, Hyeontaek Lim, Hanzhao Lin, Zhongtao Liu, Frederick Liu, Marcello Maggioni, Aroma Mahendru, Joshua Maynez, Vedant Misra, Maysam Moussalem, Zachary Nado, John Nham, Eric Ni, Andrew Nystrom, Alicia Parrish, Marie Pellat, Martin Polacek, Alex Polozov, Reiner Pope, Siyuan Qiao, Emily Reif, Bryan Richter, Parker Riley, Alex Castro Ros, Aurko Roy, Brennan Saeta, Rajkumar Samuel, Renee Shelby, Ambrose Slone, Daniel Smilkov, David R. So, Daniel Sohn, Simon Tokumine, Dasha Valter, Vijay Vasudevan, Kiran Vodrahalli, Xuezhi Wang, Pidong Wang, ZiRui Wang, Tao Wang, John Wieting, Yuhuai Wu, Kelvin Xu, Yunhan Xu, Linting Xue, Pengcheng Yin, Jiahui Yu, Qiao Zhang, Steven Zheng, Ce Zheng, Weikang Zhou, Denny Zhou, Slav Petrov, Yonghui Wu

Through extensive evaluations on English and multilingual language, and reasoning tasks, we demonstrate that PaLM 2 has significantly improved quality on downstream tasks across different model sizes, while simultaneously exhibiting faster and more efficient inference compared to PaLM.

 Ranked #1 on Question Answering on TriviaQA (using extra training data)

Language Modelling Question Answering

Dexterous Manipulation from Images: Autonomous Real-World RL via Substep Guidance

no code implementations19 Dec 2022 Kelvin Xu, Zheyuan Hu, Ria Doshi, Aaron Rovinsky, Vikash Kumar, Abhishek Gupta, Sergey Levine

In this paper, we describe a system for vision-based dexterous manipulation that provides a "programming-free" approach for users to define new tasks and enable robots with complex multi-fingered hands to learn to perform them through interaction.

reinforcement-learning Reinforcement Learning (RL)

Autonomous Reinforcement Learning: Formalism and Benchmarking

2 code implementations ICLR 2022 Archit Sharma, Kelvin Xu, Nikhil Sardana, Abhishek Gupta, Karol Hausman, Sergey Levine, Chelsea Finn

In this paper, we aim to address this discrepancy by laying out a framework for Autonomous Reinforcement Learning (ARL): reinforcement learning where the agent not only learns through its own experience, but also contends with lack of human supervision to reset between trials.

Benchmarking reinforcement-learning +1

Continual Learning of Control Primitives : Skill Discovery via Reset-Games

no code implementations NeurIPS 2020 Kelvin Xu, Siddharth Verma, Chelsea Finn, Sergey Levine

First, in real world settings, when an agent attempts a tasks and fails, the environment must somehow "reset" so that the agent can attempt the task again.

Continual Learning

Continual Learning of Control Primitives: Skill Discovery via Reset-Games

1 code implementation10 Nov 2020 Kelvin Xu, Siddharth Verma, Chelsea Finn, Sergey Levine

Reinforcement learning has the potential to automate the acquisition of behavior in complex settings, but in order for it to be successfully deployed, a number of practical challenges must be addressed.

Continual Learning

Few-Shot Intent Inference via Meta-Inverse Reinforcement Learning

no code implementations ICLR 2019 Kelvin Xu, Ellis Ratner, Anca Dragan, Sergey Levine, Chelsea Finn

A significant challenge for the practical application of reinforcement learning toreal world problems is the need to specify an oracle reward function that correctly defines a task.

reinforcement-learning Reinforcement Learning (RL)

Probabilistic Model-Agnostic Meta-Learning

1 code implementation NeurIPS 2018 Chelsea Finn, Kelvin Xu, Sergey Levine

However, a critical challenge in few-shot learning is task ambiguity: even when a powerful prior can be meta-learned from a large number of prior tasks, a small dataset for a new task can simply be too ambiguous to acquire a single model (e. g., a classifier) for that task that is accurate.

Active Learning Few-Shot Image Classification +2

Learning a Prior over Intent via Meta-Inverse Reinforcement Learning

no code implementations31 May 2018 Kelvin Xu, Ellis Ratner, Anca Dragan, Sergey Levine, Chelsea Finn

A significant challenge for the practical application of reinforcement learning in the real world is the need to specify an oracle reward function that correctly defines a task.

reinforcement-learning Reinforcement Learning (RL)

Trust-PCL: An Off-Policy Trust Region Method for Continuous Control

1 code implementation ICLR 2018 Ofir Nachum, Mohammad Norouzi, Kelvin Xu, Dale Schuurmans

When evaluated on a number of continuous control tasks, Trust-PCL improves the solution quality and sample efficiency of TRPO.

Continuous Control Reinforcement Learning (RL)

Bridging the Gap Between Value and Policy Based Reinforcement Learning

1 code implementation NeurIPS 2017 Ofir Nachum, Mohammad Norouzi, Kelvin Xu, Dale Schuurmans

We establish a new connection between value and policy based reinforcement learning (RL) based on a relationship between softmax temporal value consistency and policy optimality under entropy regularization.

Q-Learning reinforcement-learning +1

Unsupervised Perceptual Rewards for Imitation Learning

no code implementations20 Dec 2016 Pierre Sermanet, Kelvin Xu, Sergey Levine

We present a method that is able to identify key intermediate steps of a task from only a handful of demonstration sequences, and automatically identify the most discriminative features for identifying these steps.

Imitation Learning Reinforcement Learning (RL)

Theano: A Python framework for fast computation of mathematical expressions

1 code implementation9 May 2016 The Theano Development Team, Rami Al-Rfou, Guillaume Alain, Amjad Almahairi, Christof Angermueller, Dzmitry Bahdanau, Nicolas Ballas, Frédéric Bastien, Justin Bayer, Anatoly Belikov, Alexander Belopolsky, Yoshua Bengio, Arnaud Bergeron, James Bergstra, Valentin Bisson, Josh Bleecher Snyder, Nicolas Bouchard, Nicolas Boulanger-Lewandowski, Xavier Bouthillier, Alexandre de Brébisson, Olivier Breuleux, Pierre-Luc Carrier, Kyunghyun Cho, Jan Chorowski, Paul Christiano, Tim Cooijmans, Marc-Alexandre Côté, Myriam Côté, Aaron Courville, Yann N. Dauphin, Olivier Delalleau, Julien Demouth, Guillaume Desjardins, Sander Dieleman, Laurent Dinh, Mélanie Ducoffe, Vincent Dumoulin, Samira Ebrahimi Kahou, Dumitru Erhan, Ziye Fan, Orhan Firat, Mathieu Germain, Xavier Glorot, Ian Goodfellow, Matt Graham, Caglar Gulcehre, Philippe Hamel, Iban Harlouchet, Jean-Philippe Heng, Balázs Hidasi, Sina Honari, Arjun Jain, Sébastien Jean, Kai Jia, Mikhail Korobov, Vivek Kulkarni, Alex Lamb, Pascal Lamblin, Eric Larsen, César Laurent, Sean Lee, Simon Lefrancois, Simon Lemieux, Nicholas Léonard, Zhouhan Lin, Jesse A. Livezey, Cory Lorenz, Jeremiah Lowin, Qianli Ma, Pierre-Antoine Manzagol, Olivier Mastropietro, Robert T. McGibbon, Roland Memisevic, Bart van Merriënboer, Vincent Michalski, Mehdi Mirza, Alberto Orlandi, Christopher Pal, Razvan Pascanu, Mohammad Pezeshki, Colin Raffel, Daniel Renshaw, Matthew Rocklin, Adriana Romero, Markus Roth, Peter Sadowski, John Salvatier, François Savard, Jan Schlüter, John Schulman, Gabriel Schwartz, Iulian Vlad Serban, Dmitriy Serdyuk, Samira Shabanian, Étienne Simon, Sigurd Spieckermann, S. Ramana Subramanyam, Jakub Sygnowski, Jérémie Tanguay, Gijs van Tulder, Joseph Turian, Sebastian Urban, Pascal Vincent, Francesco Visin, Harm de Vries, David Warde-Farley, Dustin J. Webb, Matthew Willson, Kelvin Xu, Lijun Xue, Li Yao, Saizheng Zhang, Ying Zhang

Since its introduction, it has been one of the most used CPU and GPU mathematical compilers - especially in the machine learning community - and has shown steady performance improvements.

BIG-bench Machine Learning Clustering +2

A Controller-Recognizer Framework: How necessary is recognition for control?

no code implementations19 Nov 2015 Marcin Moczulski, Kelvin Xu, Aaron Courville, Kyunghyun Cho

Recently there has been growing interest in building active visual object recognizers, as opposed to the usual passive recognizers which classifies a given static image into a predefined set of object categories.

On Using Monolingual Corpora in Neural Machine Translation

no code implementations11 Mar 2015 Caglar Gulcehre, Orhan Firat, Kelvin Xu, Kyunghyun Cho, Loic Barrault, Huei-Chi Lin, Fethi Bougares, Holger Schwenk, Yoshua Bengio

Recent work on end-to-end neural network-based architectures for machine translation has shown promising results for En-Fr and En-De translation.

Machine Translation Translation

Show, Attend and Tell: Neural Image Caption Generation with Visual Attention

87 code implementations10 Feb 2015 Kelvin Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron Courville, Ruslan Salakhutdinov, Richard Zemel, Yoshua Bengio

Inspired by recent work in machine translation and object detection, we introduce an attention based model that automatically learns to describe the content of images.

Image Captioning Translation

Cannot find the paper you are looking for? You can Submit a new open access paper.