Search Results for author: Mengjiao Yang

Found 14 papers, 10 papers with code

Chain of Thought Imitation with Procedure Cloning

1 code implementation22 May 2022 Mengjiao Yang, Dale Schuurmans, Pieter Abbeel, Ofir Nachum

Imitation learning aims to extract high-performance policies from logged demonstrations of expert behavior.

Imitation Learning

CHAI: A CHatbot AI for Task-Oriented Dialogue with Offline Reinforcement Learning

no code implementations18 Apr 2022 Siddharth Verma, Justin Fu, Mengjiao Yang, Sergey Levine

Conventionally, generation of natural language for dialogue agents may be viewed as a statistical learning problem: determine the patterns in human-provided data and generate appropriate responses with similar statistical properties.

Chatbot Offline RL +1

TRAIL: Near-Optimal Imitation Learning with Suboptimal Data

1 code implementation ICLR 2022 Mengjiao Yang, Sergey Levine, Ofir Nachum

In this work, we answer this question affirmatively and present training objectives that use offline datasets to learn a factored transition model whose structure enables the extraction of a latent action space.

Imitation Learning

Combiner: Full Attention Transformer with Sparse Computation Cost

1 code implementation NeurIPS 2021 Hongyu Ren, Hanjun Dai, Zihang Dai, Mengjiao Yang, Jure Leskovec, Dale Schuurmans, Bo Dai

However, the key limitation of transformers is their quadratic memory and time complexity $\mathcal{O}(L^2)$ with respect to the sequence length in attention layers, which restricts application in extremely long sequences.

Image Generation Language Modelling

Provable Representation Learning for Imitation with Contrastive Fourier Features

1 code implementation NeurIPS 2021 Ofir Nachum, Mengjiao Yang

In imitation learning, it is common to learn a behavior policy to match an unknown target policy via max-likelihood training on a collected set of target demonstrations.

Atari Games Contrastive Learning +2

Benchmarks for Deep Off-Policy Evaluation

3 code implementations ICLR 2021 Justin Fu, Mohammad Norouzi, Ofir Nachum, George Tucker, Ziyu Wang, Alexander Novikov, Mengjiao Yang, Michael R. Zhang, Yutian Chen, Aviral Kumar, Cosmin Paduraru, Sergey Levine, Tom Le Paine

Off-policy evaluation (OPE) holds the promise of being able to leverage large, offline datasets for both evaluating and selecting complex policies for decision making.

Continuous Control Decision Making +1

Towards Automatic Evaluation of Dialog Systems: A Model-Free Off-Policy Evaluation Approach

1 code implementation EMNLP 2021 Haoming Jiang, Bo Dai, Mengjiao Yang, Tuo Zhao, Wei Wei

An ideal environment for evaluating dialog systems, also known as the Turing test, needs to involve human interaction, which is usually not affordable for large-scale experiments.

Model-based Reinforcement Learning reinforcement-learning +1

Representation Matters: Offline Pretraining for Sequential Decision Making

no code implementations ICLR Workshop SSL-RL 2021 Mengjiao Yang, Ofir Nachum

The recent success of supervised learning methods on ever larger offline datasets has spurred interest in the reinforcement learning (RL) field to investigate whether the same paradigms can be translated to RL algorithms.

Decision Making Imitation Learning +1

Offline Policy Selection under Uncertainty

1 code implementation12 Dec 2020 Mengjiao Yang, Bo Dai, Ofir Nachum, George Tucker, Dale Schuurmans

More importantly, we show how the belief distribution estimated by BayesDICE may be used to rank policies with respect to any arbitrary downstream policy selection metric, and we empirically demonstrate that this selection procedure significantly outperforms existing approaches, such as ranking policies according to mean or high-confidence lower bound value estimates.

Off-Policy Evaluation via the Regularized Lagrangian

no code implementations NeurIPS 2020 Mengjiao Yang, Ofir Nachum, Bo Dai, Lihong Li, Dale Schuurmans

The recently proposed distribution correction estimation (DICE) family of estimators has advanced the state of the art in off-policy evaluation from behavior-agnostic data.

Energy-Based Processes for Exchangeable Data

1 code implementation ICML 2020 Mengjiao Yang, Bo Dai, Hanjun Dai, Dale Schuurmans

Recently there has been growing interest in modeling sets with exchangeability such as point clouds.

Denoising Point Cloud Generation

Benchmarking Attribution Methods with Relative Feature Importance

2 code implementations23 Jul 2019 Mengjiao Yang, Been Kim

Despite active development, quantitative evaluation of feature attribution methods remains difficult due to the lack of ground truth: we do not know which input features are in fact important to a model.

Feature Importance

GraphIt - A High-Performance DSL for Graph Analytics

4 code implementations2 May 2018 Yunming Zhang, Mengjiao Yang, Riyadh Baghdadi, Shoaib Kamil, Julian Shun, Saman Amarasinghe

This paper introduces GraphIt, a new DSL for graph computations that generates fast implementations for algorithms with different performance characteristics running on graphs with different sizes and structures.

Programming Languages

Cannot find the paper you are looking for? You can Submit a new open access paper.