Search Results for author: Yu Bai

Found 38 papers, 5 papers with code

Sample-Efficient Learning of Correlated Equilibria in Extensive-Form Games

no code implementations • 15 May 2022 • Ziang Song, Song Mei, Yu Bai

We then design an uncoupled no-regret algorithm that finds an $\varepsilon$-approximate $K$-EFCE within $\widetilde{\mathcal{O}}(\max_{i}X_iA_i^{K}/\varepsilon^2)$ iterations in the full feedback setting, where $X_i$ and $A_i$ are the number of information sets and actions for the $i$-th player.

Non-autoregressive Translation with Dependency-Aware Decoder

1 code implementation • 30 Mar 2022 • Jiaao Zhan, Qian Chen, Boxing Chen, Wen Wang, Yu Bai, Yang Gao

In this paper, we propose a novel and general approach to enhance the target dependency within the NAT decoder from two perspectives: decoder input and decoder self-attention.

Translation

Efficient and Differentiable Conformal Prediction with General Function Classes

no code implementations • ICLR 2022 • Yu Bai, Song Mei, Huan Wang, Yingbo Zhou, Caiming Xiong

Experiments show that our algorithm is able to learn valid prediction sets and improve the efficiency significantly over existing approaches in several applications such as prediction intervals with improved length, minimum-volume prediction sets for multi-output regression, and label prediction sets for image classification.

Image Classification · Prediction Intervals
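For context, the "valid prediction sets" above come from conformal prediction; below is a minimal split-conformal sketch (the absolute-residual score and coverage level are illustrative assumptions, not the paper's learned function class):

```python
import numpy as np

def split_conformal_interval(residuals_cal, y_pred_test, alpha=0.1):
    """Build prediction intervals with roughly (1 - alpha) marginal coverage
    from residuals |y - y_hat| computed on a held-out calibration split."""
    s = np.sort(np.asarray(residuals_cal, dtype=float))
    n = len(s)
    # Finite-sample-corrected quantile: the ceil((n+1)(1-alpha))-th smallest residual.
    k = int(np.ceil((n + 1) * (1 - alpha))) - 1
    q = s[min(k, n - 1)]
    return y_pred_test - q, y_pred_test + q

# Usage sketch: lower, upper = split_conformal_interval(np.abs(y_cal - f(x_cal)), f(x_test))
```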

Privacy protection based on mask template

no code implementations • 13 Feb 2022 • Hao Wang, Yu Bai, Guangmin Sun, Jie Liu

Powerful recognition algorithms are widely used on the Internet and in important medical systems, which poses a serious threat to personal privacy.

Near-Optimal Learning of Extensive-Form Games with Imperfect Information

no code implementations • 3 Feb 2022 • Yu Bai, Chi Jin, Song Mei, Tiancheng Yu

This improves upon the best known sample complexity of $\widetilde{\mathcal{O}}((X^2A+Y^2B)/\varepsilon^2)$ by a factor of $\widetilde{\mathcal{O}}(\max\{X, Y\})$, and matches the information-theoretic lower bound up to logarithmic factors.

Analyzing Micro-Founded General Equilibrium Models with Many Agents using Deep Reinforcement Learning

no code implementations • 3 Jan 2022 • Michael Curry, Alexander Trott, Soham Phade, Yu Bai, Stephan Zheng

We validate the learned solutions are $\epsilon$-meta-equilibria through best-response analyses, show that they align with economic intuitions, and show our approach can learn a spectrum of qualitatively distinct $\epsilon$-meta-equilibria in open RBC models.

Multi-agent Reinforcement Learning · Reinforcement Learning

Unifying Cross-lingual Summarization and Machine Translation with Compression Rate

1 code implementation • 15 Oct 2021 • Yu Bai, Heyan Huang, Kai Fan, Yang Gao, Yiming Zhu, Jiaao Zhan, Zewen Chi, Boxing Chen

By introducing the compression rate, the information ratio between the source and the target text, we regard the MT task as a special CLS task with a compression rate of 100%.

Data Augmentation · Machine Translation +1
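As a rough illustration of the quantity above, the compression rate can be read as a target-to-source length ratio; the token-level proxy below is an assumption, not necessarily the paper's exact definition of "information ratio":

```python
def compression_rate(source_tokens, target_tokens):
    """Token-level proxy for the compression rate: how much of the source's
    content the target retains. Machine translation is then the special case
    of roughly 100%, while cross-lingual summarization sits well below 100%."""
    return 100.0 * len(target_tokens) / max(len(source_tokens), 1)

# Example: a translation pair has a rate near 100%.
mt_rate = compression_rate("the cat sat on the mat".split(),
                           "le chat est assis sur le tapis".split())
```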

When Can We Learn General-Sum Markov Games with a Large Number of Players Sample-Efficiently?

no code implementations • ICLR 2022 • Ziang Song, Song Mei, Yu Bai

First, we design algorithms for learning an $\epsilon$-Coarse Correlated Equilibrium (CCE) in $\widetilde{\mathcal{O}}(H^5S\max_{i\le m} A_i / \epsilon^2)$ episodes, and an $\epsilon$-Correlated Equilibrium (CE) in $\widetilde{\mathcal{O}}(H^6S\max_{i\le m} A_i^2 / \epsilon^2)$ episodes.

Multi-agent Reinforcement Learning

Local Calibration: Metrics and Recalibration

no code implementations • 29 Sep 2021 • Rachel Luo, Aadyot Bhatnagar, Yu Bai, Shengjia Zhao, Huan Wang, Caiming Xiong, Silvio Savarese, Stefano Ermon, Edward Schmerling, Marco Pavone

In this work, we propose the local calibration error (LCE) to span the gap between average and individual reliability.

Decision Making · Fairness
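As a sketch of the "local" idea, reliability can be measured as a kernel-weighted gap between confidence and accuracy around a query point; the Gaussian kernel and feature space below are illustrative assumptions, not the paper's LCE definition:

```python
import numpy as np

def local_calibration_gap(features, confidences, correct, x0, bandwidth=1.0):
    """Kernel-weighted |confidence - accuracy| near x0: examples are weighted
    by similarity to the query point, so the estimate is local rather than
    averaged over the whole dataset (which would be average calibration)."""
    features = np.asarray(features, dtype=float)
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)   # 1 if prediction was right
    d2 = np.sum((features - np.asarray(x0, dtype=float)) ** 2, axis=1)
    w = np.exp(-d2 / (2.0 * bandwidth ** 2))
    w /= w.sum()
    return abs(float(w @ confidences) - float(w @ correct))
```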

Cross-Lingual Language Model Meta-Pretraining

no code implementations • 23 Sep 2021 • Zewen Chi, Heyan Huang, Luyang Liu, Yu Bai, Xian-Ling Mao

The success of pretrained cross-lingual language models relies on two essential abilities, i.e., generalization ability for learning downstream tasks in a source language, and cross-lingual transferability for transferring the task knowledge to other languages.

Cross-Lingual Transfer · Language Modelling

Understanding the Under-Coverage Bias in Uncertainty Estimation

no code implementations • NeurIPS 2021 • Yu Bai, Song Mei, Huan Wang, Caiming Xiong

Estimating the data uncertainty in regression tasks is often done by learning a quantile function or a prediction interval of the true label conditioned on the input.
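For reference, the quantile function mentioned above is typically estimated by minimizing the pinball loss; a minimal sketch of that setup (illustrative, not the paper's analysis):

```python
import numpy as np

def pinball_loss(y_true, y_pred, tau):
    """Pinball (quantile) loss at level tau: its minimizer over predictors is
    the conditional tau-quantile of y given x, e.g. tau = 0.95 for the upper
    endpoint of a 90% central prediction interval."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.mean(np.maximum(tau * diff, (tau - 1.0) * diff)))
```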

Policy Finetuning: Bridging Sample-Efficient Offline and Online Reinforcement Learning

no code implementations • NeurIPS 2021 • Tengyang Xie, Nan Jiang, Huan Wang, Caiming Xiong, Yu Bai

This offline result is the first that matches the sample complexity lower bound in this setting, and resolves a recent open question in offline RL.

Offline RL · Reinforcement Learning

Cross-Lingual Abstractive Summarization with Limited Parallel Resources

1 code implementation • ACL 2021 • Yu Bai, Yang Gao, Heyan Huang

Employing one unified decoder to generate the sequential concatenation of monolingual and cross-lingual summaries, MCLAS makes the monolingual summarization task a prerequisite of the cross-lingual summarization (CLS) task.

Abstractive Text Summarization · Cross-Lingual Abstractive Summarization +1

Multi-modal Trajectory Prediction for Autonomous Driving with Semantic Map and Dynamic Graph Attention Network

no code implementations • 30 Mar 2021 • Bo Dong, Hao Liu, Yu Bai, Jinbiao Lin, Zhuoran Xu, Xinyu Xu, Qi Kong

Predicting future trajectories of surrounding obstacles is a crucial task for autonomous driving cars to achieve a high degree of road safety.

Autonomous Driving · Graph Attention +1

Exact Gap between Generalization Error and Uniform Convergence in Random Feature Models

no code implementations • 8 Mar 2021 • Zitong Yang, Yu Bai, Song Mei

We show that, in the setting where the classical uniform convergence bound is vacuous (diverges to $\infty$), uniform convergence over the interpolators still gives a non-trivial bound of the test error of interpolating solutions.

Sample-Efficient Learning of Stackelberg Equilibria in General-Sum Games

no code implementations • NeurIPS 2021 • Yu Bai, Chi Jin, Huan Wang, Caiming Xiong

Real world applications such as economics and policy making often involve solving multi-agent games with two unique features: (1) The agents are inherently asymmetric and partitioned into leaders and followers; (2) The agents have different reward functions, thus the game is general-sum.

Localized Calibration: Metrics and Recalibration

no code implementations • 22 Feb 2021 • Rachel Luo, Aadyot Bhatnagar, Huan Wang, Caiming Xiong, Silvio Savarese, Yu Bai, Shengjia Zhao, Stefano Ermon

Probabilistic classifiers output confidence scores along with their predictions, and these confidence scores must be well-calibrated (i.e., reflect the true probability of an event) to be meaningful and useful for downstream tasks.

Decision Making

Don't Just Blame Over-parametrization for Over-confidence: Theoretical Analysis of Calibration in Binary Classification

no code implementations • 15 Feb 2021 • Yu Bai, Song Mei, Huan Wang, Caiming Xiong

Modern machine learning models with high accuracy are often miscalibrated -- the predicted top probability does not reflect the actual accuracy, and tends to be over-confident.

Near-Optimal Offline Reinforcement Learning via Double Variance Reduction

no code implementations • NeurIPS 2021 • Ming Yin, Yu Bai, Yu-Xiang Wang

Our main result shows that OPDVR provably identifies an $\epsilon$-optimal policy with $\widetilde{O}(H^2/d_m\epsilon^2)$ episodes of offline data in the finite-horizon stationary transition setting, where $H$ is the horizon length and $d_m$ is the minimal marginal state-action distribution induced by the behavior policy.

Offline RL · Reinforcement Learning

Improved Uncertainty Post-Calibration via Rank Preserving Transforms

no code implementations • 1 Jan 2021 • Yu Bai, Tengyu Ma, Huan Wang, Caiming Xiong

In this paper, we propose Neural Rank Preserving Transforms (NRPT), a new post-calibration method that adjusts the output probabilities of a trained classifier using a calibrator of higher capacity, while maintaining its prediction accuracy.

Text Classification
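The key constraint highlighted above is that the calibration map must not change the classifier's ranking of labels; a minimal sketch of one such rank-preserving map (plain temperature scaling of the logits, which is an assumption and far simpler than the paper's learned transforms):

```python
import numpy as np

def rank_preserving_calibrate(logits, temperature=2.0):
    """Apply a strictly increasing transform (division by a temperature) to
    the logits before the softmax. Monotonicity preserves the argmax, so
    prediction accuracy is unchanged while confidence scores are rescaled."""
    z = np.asarray(logits, dtype=float) / temperature
    z -= z.max(axis=-1, keepdims=True)   # numerical stability
    p = np.exp(z)
    return p / p.sum(axis=-1, keepdims=True)
```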

How Important is the Train-Validation Split in Meta-Learning?

no code implementations • 12 Oct 2020 • Yu Bai, Minshuo Chen, Pan Zhou, Tuo Zhao, Jason D. Lee, Sham Kakade, Huan Wang, Caiming Xiong

A common practice in meta-learning is to perform a train-validation split (\emph{train-val method}) where the prior adapts to the task on one split of the data, and the resulting predictor is evaluated on another split.

Meta-Learning

A Sharp Analysis of Model-based Reinforcement Learning with Self-Play

no code implementations • 4 Oct 2020 • Qinghua Liu, Tiancheng Yu, Yu Bai, Chi Jin

However, for multi-agent reinforcement learning in Markov games, the current best known sample complexity for model-based algorithms is rather suboptimal and compares unfavorably against recent model-free approaches.

Model-based Reinforcement Learning · Multi-agent Reinforcement Learning +1

Near-Optimal Provable Uniform Convergence in Offline Policy Evaluation for Reinforcement Learning

no code implementations • 7 Jul 2020 • Ming Yin, Yu Bai, Yu-Xiang Wang

The problem of Offline Policy Evaluation (OPE) in Reinforcement Learning (RL) is a critical step towards applying RL in real-life applications.

Offline RL · Reinforcement Learning

Towards Understanding Hierarchical Learning: Benefits of Neural Representations

no code implementations • NeurIPS 2020 • Minshuo Chen, Yu Bai, Jason D. Lee, Tuo Zhao, Huan Wang, Caiming Xiong, Richard Socher

When the trainable network is the quadratic Taylor model of a wide two-layer network, we show that neural representation can achieve improved sample complexities compared with the raw input: For learning a low-rank degree-$p$ polynomial ($p \geq 4$) in $d$ dimension, neural representation requires only $\tilde{O}(d^{\lceil p/2 \rceil})$ samples, while the best-known sample complexity upper bound for the raw input is $\tilde{O}(d^{p-1})$.

Near-Optimal Reinforcement Learning with Self-Play

no code implementations • NeurIPS 2020 • Yu Bai, Chi Jin, Tiancheng Yu

This paper considers the problem of designing optimal algorithms for reinforcement learning in two-player zero-sum games.

Q-Learning · Reinforcement Learning

Provable Self-Play Algorithms for Competitive Reinforcement Learning

no code implementations • ICML 2020 • Yu Bai, Chi Jin

We introduce a self-play algorithm---Value Iteration with Upper/Lower Confidence Bound (VI-ULCB)---and show that it achieves regret $\tilde{\mathcal{O}}(\sqrt{T})$ after playing $T$ steps of the game, where the regret is measured by the agent's performance against a \emph{fully adversarial} opponent who can exploit the agent's strategy at \emph{any} step.

Reinforcement Learning

Taylorized Training: Towards Better Approximation of Neural Network Training at Finite Width

no code implementations • 10 Feb 2020 • Yu Bai, Ben Krause, Huan Wang, Caiming Xiong, Richard Socher

We propose \emph{Taylorized training} as an initiative towards better understanding neural network training at finite width.

Directed-Weighting Group Lasso for Eltwise Blocked CNN Pruning

no code implementations • 21 Oct 2019 • Ke Zhan, Shimiao Jiang, Yu Bai, Yi Li, Xu Liu, Zhuoran Xu

The Eltwise layer is a commonly used structure in multi-branch deep learning networks.

Beyond Linearization: On Quadratic and Higher-Order Approximation of Wide Neural Networks

no code implementations • ICLR 2020 • Yu Bai, Jason D. Lee

Recent theoretical work has established connections between over-parametrized neural networks and linearized models governed by the Neural Tangent Kernel (NTK).

Provably Efficient Q-Learning with Low Switching Cost

no code implementations • NeurIPS 2019 • Yu Bai, Tengyang Xie, Nan Jiang, Yu-Xiang Wang

We take initial steps in studying PAC-MDP algorithms with limited adaptivity, that is, algorithms that change its exploration policy as infrequently as possible during regret minimization.

Q-Learning

Proximal algorithms for constrained composite optimization, with applications to solving low-rank SDPs

no code implementations • 1 Mar 2019 • Yu Bai, John Duchi, Song Mei

We study a family of (potentially non-convex) constrained optimization problems with convex composite structure.

Subgradient Descent Learns Orthogonal Dictionaries

1 code implementation • ICLR 2019 • Yu Bai, Qijia Jiang, Ju Sun

This paper concerns dictionary learning, i.e., sparse coding, a fundamental representation learning problem.

Dictionary Learning · Representation Learning

ProxQuant: Quantized Neural Networks via Proximal Operators

1 code implementation • ICLR 2019 • Yu Bai, Yu-Xiang Wang, Edo Liberty

To make deep neural networks feasible in resource-constrained environments (such as mobile devices), it is beneficial to quantize models by using low-precision weights.

Quantization
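As a sketch of the proximal idea named in the title, each optimizer step can be followed by a prox step that pulls weights partway toward the nearest quantized value (binary {-1, +1} here; the exact regularizer and annealing schedule in ProxQuant may differ):

```python
import numpy as np

def prox_toward_binary(w, lam):
    """Prox step for a penalty on the distance to {-1, +1}: move each weight
    by at most lam toward its nearest binary value, snapping exactly when it
    is already within lam. Annealing lam upward gradually quantizes the net."""
    w = np.asarray(w, dtype=float)
    target = np.sign(w)
    target[target == 0] = 1.0
    return w + np.clip(target - w, -lam, lam)

# Training-step sketch: w = prox_toward_binary(w - lr * grad_loss(w), lam_t)
```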

Approximability of Discriminators Implies Diversity in GANs

no code implementations • ICLR 2019 • Yu Bai, Tengyu Ma, Andrej Risteski

Our preliminary experiments show that on synthetic datasets the test IPM is well correlated with KL divergence or the Wasserstein distance, indicating that the lack of diversity in GANs may be caused by the sub-optimality in optimization instead of statistical inefficiency.

The Landscape of Empirical Risk for Non-convex Losses

no code implementations • 22 Jul 2016 • Song Mei, Yu Bai, Andrea Montanari

We establish uniform convergence of the gradient and Hessian of the empirical risk to their population counterparts, as soon as the number of samples becomes larger than the number of unknown parameters (modulo logarithmic factors).

General Classification
