Search Results for author: Xinyang Geng

Found 11 papers, 4 papers with code

Conservative Objective Models for Effective Offline Model-Based Optimization

1 code implementation 14 Jul 2021 Brandon Trabucco, Aviral Kumar, Xinyang Geng, Sergey Levine

Computational design problems arise in a number of settings, from synthetic biology to computer architectures.

Variable-Shot Adaptation for Incremental Meta-Learning

no code implementations 1 Jan 2021 Tianhe Yu, Xinyang Geng, Chelsea Finn, Sergey Levine

Few-shot meta-learning methods consider the problem of learning new tasks from a small, fixed number of examples, by meta-learning across static data from a set of previous tasks.

Meta-Learning, Zero-Shot Learning

Design-Bench: Benchmarks for Data-Driven Offline Model-Based Optimization

no code implementations 1 Jan 2021 Brandon Trabucco, Aviral Kumar, Xinyang Geng, Sergey Levine

To address this problem, we present Design-Bench, a benchmark suite of offline MBO tasks with a unified evaluation protocol and reference implementations of recent methods.

Variable-Shot Adaptation for Online Meta-Learning

no code implementations 14 Dec 2020 Tianhe Yu, Xinyang Geng, Chelsea Finn, Sergey Levine

Few-shot meta-learning methods consider the problem of learning new tasks from a small, fixed number of examples, by meta-learning across static data from a set of previous tasks.

Meta-Learning, Zero-Shot Learning

Meta-Reinforcement Learning Robust to Distributional Shift via Model Identification and Experience Relabeling

no code implementations 12 Jun 2020 Russell Mendonca, Xinyang Geng, Chelsea Finn, Sergey Levine

Our method is based on a simple insight: we recognize that dynamics models can be adapted efficiently and consistently with off-policy data, more easily than policies and value functions.

Meta Reinforcement Learning

Rewriting History with Inverse RL: Hindsight Inference for Policy Improvement

1 code implementation NeurIPS 2020 Benjamin Eysenbach, Xinyang Geng, Sergey Levine, Ruslan Salakhutdinov

In this paper, we show that hindsight relabeling is inverse RL, an observation that suggests that we can use inverse RL in tandem for RL algorithms to efficiently solve many tasks.

Consistent Meta-Reinforcement Learning via Model Identification and Experience Relabeling

no code implementations 25 Sep 2019 Russell Mendonca, Xinyang Geng, Chelsea Finn, Sergey Levine

Reinforcement learning algorithms can acquire policies for complex tasks automatically, however the number of samples required to learn a diverse set of skills can be prohibitively large.

Meta Reinforcement Learning

Dynamical Distance Learning for Semi-Supervised and Unsupervised Skill Discovery

no code implementations ICLR 2020 Kristian Hartikainen, Xinyang Geng, Tuomas Haarnoja, Sergey Levine

We show that dynamical distances can be used in a semi-supervised regime, where unsupervised interaction with the environment is used to learn the dynamical distances, while a small amount of preference supervision is used to determine the task goal, without any manually engineered reward function or goal examples.

Automatic Goal Generation for Reinforcement Learning Agents

1 code implementation ICML 2018 Carlos Florensa, David Held, Xinyang Geng, Pieter Abbeel

Instead, we propose a method that allows an agent to automatically discover the range of tasks that it is capable of performing.

Real-Time User-Guided Image Colorization with Learned Deep Priors

3 code implementations 8 May 2017 Richard Zhang, Jun-Yan Zhu, Phillip Isola, Xinyang Geng, Angela S. Lin, Tianhe Yu, Alexei A. Efros

The system directly maps a grayscale image, along with sparse, local user "hints" to an output colorization with a Convolutional Neural Network (CNN).

Colorization

Deep Reinforcement Learning for Tensegrity Robot Locomotion

no code implementations 28 Sep 2016 Marvin Zhang, Xinyang Geng, Jonathan Bruce, Ken Caluwaerts, Massimo Vespignani, Vytas SunSpiral, Pieter Abbeel, Sergey Levine

We evaluate our method with real-world and simulated experiments on the SUPERball tensegrity robot, showing that the learned policies generalize to changes in system parameters, unreliable sensor measurements, and variation in environmental conditions, including varied terrains and a range of different gravities.
