Search Results for author: Yuexiang Zhai

Found 18 papers, 8 papers with code

Is Offline Decision Making Possible with Only Few Samples? Reliable Decisions in Data-Starved Bandits via Trust Region Enhancement

no code implementations • 24 Feb 2024 • Ruiqi Zhang, Yuexiang Zhai, Andrea Zanette

Surprisingly, in this work we demonstrate that even in such a data-starved setting, it may still be possible to find a policy competitive with the optimal one.

Decision Making Multi-Armed Bandits

Eyes Wide Shut? Exploring the Visual Shortcomings of Multimodal LLMs

1 code implementation • 11 Jan 2024 • Shengbang Tong, Zhuang Liu, Yuexiang Zhai, Yi Ma, Yann LeCun, Saining Xie

To understand the roots of these errors, we explore the gap between the visual embedding space of CLIP and vision-only self-supervised learning.

Representation Learning Self-Supervised Learning +1
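The paper's probe of this gap can be pictured as a search for "CLIP-blind" image pairs: pairs whose CLIP embeddings are nearly identical even though a vision-only self-supervised encoder (e.g. DINO) clearly separates them. A minimal sketch of that comparison, where the embedding lists are placeholders rather than the paper's actual pipeline:

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(a * a for a in v))
    return dot / (nu * nv)

def clip_blind_pairs(clip_embs, ssl_embs, clip_thresh=0.95, ssl_thresh=0.6):
    """Return index pairs that CLIP sees as near-duplicates but a
    vision-only SSL encoder clearly separates."""
    pairs = []
    n = len(clip_embs)
    for i in range(n):
        for j in range(i + 1, n):
            if (cosine(clip_embs[i], clip_embs[j]) > clip_thresh
                    and cosine(ssl_embs[i], ssl_embs[j]) < ssl_thresh):
                pairs.append((i, j))
    return pairs
```

Such pairs expose visual distinctions that the language-supervised embedding space collapses; the thresholds here are illustrative values, not the paper's.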

LMRL Gym: Benchmarks for Multi-Turn Reinforcement Learning with Language Models

1 code implementation • 30 Nov 2023 • Marwa Abdulhai, Isadora White, Charlie Snell, Charles Sun, Joey Hong, Yuexiang Zhai, Kelvin Xu, Sergey Levine

Developing such algorithms requires tasks that can gauge progress on algorithm design, provide accessible and reproducible evaluations for multi-turn interactions, and cover a range of task properties and challenges in improving reinforcement learning algorithms.

reinforcement-learning Text Generation

White-Box Transformers via Sparse Rate Reduction: Compression Is All There Is?

1 code implementation • 22 Nov 2023 • Yaodong Yu, Sam Buchanan, Druv Pai, Tianzhe Chu, Ziyang Wu, Shengbang Tong, Hao Bai, Yuexiang Zhai, Benjamin D. Haeffele, Yi Ma

This leads to a family of white-box transformer-like deep network architectures, named CRATE, which are mathematically fully interpretable.

Data Compression Denoising +1

RLIF: Interactive Imitation Learning as Reinforcement Learning

no code implementations • 21 Nov 2023 • Jianlan Luo, Perry Dong, Yuexiang Zhai, Yi Ma, Sergey Levine

We also provide a unified framework to analyze our RL method and DAgger, for which we present an asymptotic analysis of the suboptimality gap for both methods, as well as a non-asymptotic sample complexity bound for our method.

Continuous Control Imitation Learning +1

Investigating the Catastrophic Forgetting in Multimodal Large Language Models

no code implementations • 19 Sep 2023 • Yuexiang Zhai, Shengbang Tong, Xiao Li, Mu Cai, Qing Qu, Yong Jae Lee, Yi Ma

However, catastrophic forgetting, a notorious phenomenon in which a fine-tuned model fails to retain performance comparable to the pre-trained model, remains an inherent problem in multimodal LLMs (MLLMs).

Image Classification Language Modelling +1

Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning

2 code implementations • NeurIPS 2023 • Mitsuhiko Nakamoto, Yuexiang Zhai, Anikait Singh, Max Sobol Mark, Yi Ma, Chelsea Finn, Aviral Kumar, Sergey Levine

Our approach, calibrated Q-learning (Cal-QL), accomplishes this by learning a conservative value function initialization that underestimates the value of the learned policy from offline data, while also being calibrated, in the sense that the learned Q-values are at a reasonable scale.

Offline RL Q-Learning +1
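The calibration idea can be illustrated on the conservative (CQL-style) regularizer: rather than pushing Q-values of policy actions down without bound, the penalty is floored at a reference value V_ref(s) (e.g. the behavior policy's return), so learned Q-values stay at a reasonable scale. A minimal per-sample sketch, with `q_pi`, `q_data`, and `v_ref` as hypothetical scalar values rather than the paper's full objective:

```python
def cql_penalty(q_pi, q_data):
    # standard conservative penalty: push down Q at policy actions,
    # push up Q at dataset actions -- can drive Q-values arbitrarily low
    return q_pi - q_data

def cal_ql_penalty(q_pi, q_data, v_ref):
    # calibrated variant: once Q(s, a~pi) falls below the reference
    # value V_ref(s), the penalty stops pushing it further down
    return max(q_pi, v_ref) - q_data
```

Once `q_pi` drops below `v_ref`, the calibrated penalty no longer depends on it, so the conservative gradient pressure vanishes and the Q-function is not driven to unreasonably small values before online fine-tuning.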

Closed-Loop Transcription via Convolutional Sparse Coding

no code implementations • 18 Feb 2023 • Xili Dai, Ke Chen, Shengbang Tong, Jingyuan Zhang, Xingjian Gao, Mingyang Li, Druv Pai, Yuexiang Zhai, Xiaojun Yuan, Heung-Yeung Shum, Lionel M. Ni, Yi Ma

Our method is arguably the first to demonstrate that a concatenation of multiple convolutional sparse coding/decoding layers leads to an interpretable and effective autoencoder for modeling the distribution of large-scale natural image datasets.

Rolling Shutter Correction

Understanding the Complexity Gains of Single-Task RL with a Curriculum

no code implementations • 24 Dec 2022 • Qiyang Li, Yuexiang Zhai, Yi Ma, Sergey Levine

Under mild regularity conditions on the curriculum, we show that sequentially solving each task in the multi-task RL problem is more computationally efficient than solving the original single-task problem, without any explicit exploration bonuses or other exploration strategies.

Reinforcement Learning (RL)

Unpacking Reward Shaping: Understanding the Benefits of Reward Engineering on Sample Complexity

no code implementations • 18 Oct 2022 • Abhishek Gupta, Aldo Pacchiano, Yuexiang Zhai, Sham M. Kakade, Sergey Levine

Reinforcement learning provides an automated framework for learning behaviors from high-level reward specifications, but in practice the choice of reward function can be crucial for good results: while in principle the reward only needs to specify what the task is, in reality practitioners often need to design more detailed rewards that give the agent hints about how the task should be completed.

reinforcement-learning Reinforcement Learning (RL)

Computational Benefits of Intermediate Rewards for Goal-Reaching Policy Learning

1 code implementation • 8 Jul 2021 • Yuexiang Zhai, Christina Baek, Zhengyuan Zhou, Jiantao Jiao, Yi Ma

In both OWSP and OWMP settings, we demonstrate that adding intermediate rewards to subgoals is more computationally efficient than only rewarding the agent once it completes the goal of reaching a terminal state.

Hierarchical Reinforcement Learning Q-Learning +1
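The computational benefit can be seen even in a toy deterministic chain: with a terminal-only reward, value information propagates backward one state per sweep of value iteration, whereas a bonus on an intermediate subgoal gives the start state a nonzero value much sooner. A small sketch (a generic value-iteration illustration of the idea, not the paper's OWSP/OWMP construction):

```python
def sweeps_until_start_positive(n, subgoals=(), gamma=0.9):
    """Chain MDP with states 0..n; the only action moves right.
    Reward 1 on entering terminal state n, plus a 0.5 bonus on
    entering any subgoal state. Count ascending-order value-iteration
    sweeps until the start state's value becomes positive."""
    v = [0.0] * (n + 1)
    sweeps = 0
    while v[0] <= 0:
        sweeps += 1
        for s in range(n):  # ascending sweep: values move back one state/sweep
            r = 1.0 if s + 1 == n else 0.0
            if s + 1 in subgoals:
                r += 0.5
            v[s] = r + gamma * v[s + 1]
    return sweeps
```

On a length-10 chain, the terminal-only reward needs 10 sweeps before the start state sees any value, while a single subgoal at the midpoint halves that, mirroring the claimed efficiency gain from intermediate rewards.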

Convolutional Normalization: Improving Deep Convolutional Network Robustness and Training

1 code implementation • NeurIPS 2021 • Sheng Liu, Xiao Li, Yuexiang Zhai, Chong You, Zhihui Zhu, Carlos Fernandez-Granda, Qing Qu

Furthermore, we show that our ConvNorm can reduce the layerwise spectral norm of the weight matrices and hence improve the Lipschitzness of the network, leading to easier training and improved robustness for deep ConvNets.

Generative Adversarial Network
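The link between layerwise spectral norms and Lipschitzness is easy to check numerically: for a composition of linear maps with 1-Lipschitz activations, the product of the layers' spectral norms upper-bounds the network's Lipschitz constant. A quick illustration with random matrices standing in for (unrolled) convolution operators; this is a generic sanity check, not the ConvNorm procedure itself:

```python
import numpy as np

rng = np.random.default_rng(0)
layers = [rng.standard_normal((16, 16)) / 4.0 for _ in range(3)]

def forward(x):
    for w in layers:
        x = np.maximum(w @ x, 0.0)  # ReLU is 1-Lipschitz
    return x

# product of layerwise spectral norms = upper bound on the Lipschitz constant
lip_bound = np.prod([np.linalg.norm(w, 2) for w in layers])

# an empirical difference quotient never exceeds the bound
x, y = rng.standard_normal(16), rng.standard_normal(16)
ratio = np.linalg.norm(forward(x) - forward(y)) / np.linalg.norm(x - y)
```

Shrinking any layer's spectral norm shrinks `lip_bound` directly, which is why controlling layerwise spectral norms (as ConvNorm does for convolutions) tightens the network's Lipschitz behavior.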

Measuring GAN Training in Real Time

no code implementations • 1 Jan 2021 • Yuexiang Zhai, Bai Jiang, Yi Ma, Hao Chen

Generative Adversarial Networks (GANs) are popular generative models of images.

Analysis of the Optimization Landscapes for Overcomplete Representation Learning

no code implementations • 5 Dec 2019 • Qing Qu, Yuexiang Zhai, Xiao Li, Yuqian Zhang, Zhihui Zhu

In this work, we show these problems can be formulated as $\ell^4$-norm optimization problems with spherical constraint, and study the geometric properties of their nonconvex optimization landscapes.

Representation Learning

Complete Dictionary Learning via $\ell^4$-Norm Maximization over the Orthogonal Group

no code implementations • 6 Jun 2019 • Yuexiang Zhai, Zitong Yang, Zhenyu Liao, John Wright, Yi Ma

Most existing methods solve for the dictionary (and the sparse representations) via heuristic algorithms, usually without theoretical guarantees of either optimality or complexity.

Dictionary Learning

Learning to Reconstruct 3D Manhattan Wireframes from a Single Image

2 code implementations • ICCV 2019 • Yichao Zhou, Haozhi Qi, Yuexiang Zhai, Qi Sun, Zhili Chen, Li-Yi Wei, Yi Ma

In this paper, we propose a method to obtain a compact and accurate 3D wireframe representation from a single image by effectively exploiting global structural regularities.
