Search Results for author: Ruijie Zheng

Found 23 papers, 11 papers with code

Provably Learning from Language Feedback

no code implementations • 12 Jun 2025 • Wanqiao Xu, Allen Nie, Ruijie Zheng, Aditya Modi, Adith Swaminathan, Ching-An Cheng

Interactively learning from observation and language feedback is an increasingly studied area driven by the emergence of large language model (LLM) agents.

Large Language Model

FLARE: Robot Learning with Implicit World Modeling

no code implementations • 21 May 2025 • Ruijie Zheng, Jing Wang, Scott Reed, Johan Bjorck, Yu Fang, Fengyuan Hu, Joel Jang, Kaushil Kundalia, Zongyu Lin, Loic Magne, Avnish Narayan, You Liang Tan, Guanzhi Wang, Qi Wang, Jiannan Xiang, Yinzhen Xu, Seonghyeon Ye, Jan Kautz, Furong Huang, Yuke Zhu, Linxi Fan

We introduce Future LAtent REpresentation Alignment (FLARE), a novel framework that integrates predictive latent world modeling into robot policy learning.

Imitation Learning, Vision-Language-Action

DreamGen: Unlocking Generalization in Robot Learning through Video World Models

1 code implementation • 19 May 2025 • Joel Jang, Seonghyeon Ye, Zongyu Lin, Jiannan Xiang, Johan Bjorck, Yu Fang, Fengyuan Hu, Spencer Huang, Kaushil Kundalia, Yen-Chen Lin, Loic Magne, Ajay Mandlekar, Avnish Narayan, You Liang Tan, Guanzhi Wang, Jing Wang, Qi Wang, Yinzhen Xu, Xiaohui Zeng, Kaiyuan Zheng, Ruijie Zheng, Ming-Yu Liu, Luke Zettlemoyer, Dieter Fox, Jan Kautz, Scott Reed, Yuke Zhu, Linxi Fan

We introduce DreamGen, a simple yet highly effective four-stage pipeline for training robot policies that generalize across behaviors and environments through neural trajectories: synthetic robot data generated from video world models.

Video Generation

TREND: Tri-teaching for Robust Preference-based Reinforcement Learning with Demonstrations

no code implementations • 9 May 2025 • Shuaiyi Huang, Mara Levy, Anubhav Gupta, Daniel Ekpo, Ruijie Zheng, Abhinav Shrivastava

To address this challenge, we propose TREND, a novel framework that integrates few-shot expert demonstrations with a tri-teaching strategy for effective noise mitigation.

TraceVLA: Visual Trace Prompting Enhances Spatial-Temporal Awareness for Generalist Robotic Policies

no code implementations • 13 Dec 2024 • Ruijie Zheng, Yongyuan Liang, Shuaiyi Huang, Jianfeng Gao, Hal Daumé III, Andrey Kolobov, Furong Huang, Jianwei Yang

Although large vision-language-action (VLA) models pretrained on extensive robot datasets offer promising generalist policies for robotic learning, they still struggle with spatial-temporal dynamics in interactive robotics, making them less effective in handling complex tasks, such as manipulation.

Ranked #7 on Robot Manipulation on SimplerEnv-Google Robot (using extra training data)

Robot Manipulation, Vision-Language-Action

ACE: Off-Policy Actor-Critic with Causality-Aware Entropy Regularization

no code implementations • 22 Feb 2024 • Tianying Ji, Yongyuan Liang, Yan Zeng, Yu Luo, Guowei Xu, Jiawei Guo, Ruijie Zheng, Furong Huang, Fuchun Sun, Huazhe Xu

The varying significance of distinct primitive behaviors during the policy learning process has been overlooked by prior model-free RL algorithms.

Continuous Control

PRISE: LLM-Style Sequence Compression for Learning Temporal Action Abstractions in Control

1 code implementation • 16 Feb 2024 • Ruijie Zheng, Ching-An Cheng, Hal Daumé III, Furong Huang, Andrey Kolobov

To do so, we bring a subtle but critical component of LLM training pipelines -- input tokenization via byte pair encoding (BPE) -- to the seemingly distant task of learning skills of variable time span in continuous control domains.

Continuous Control

Progressively Efficient Learning

no code implementations • 13 Oct 2023 • Ruijie Zheng, Khanh Nguyen, Hal Daumé III, Furong Huang, Karthik Narasimhan

By equipping a learning agent with an abstract, dynamic language and an intrinsic motivation to learn with minimal communication effort, CEIL leads to emergence of a human-like pattern where the learner and the teacher communicate progressively efficiently by exchanging increasingly more abstract intentions.

Imitation Learning, Minecraft

COPlanner: Plan to Roll Out Conservatively but to Explore Optimistically for Model-Based RL

no code implementations • 11 Oct 2023 • Xiyao Wang, Ruijie Zheng, Yanchao Sun, Ruonan Jia, Wichayaporn Wongkamjan, Huazhe Xu, Furong Huang

In this paper, we propose COPlanner, a planning-driven framework for model-based methods to address the inaccurately learned dynamics model problem with conservative model rollouts and optimistic environment exploration.

Continuous Control

Game-Theoretic Robust Reinforcement Learning Handles Temporally-Coupled Perturbations

no code implementations • 22 Jul 2023 • Yongyuan Liang, Yanchao Sun, Ruijie Zheng, Xiangyu Liu, Benjamin Eysenbach, Tuomas Sandholm, Furong Huang, Stephen McAleer

To tackle this challenge, we propose GRAD, a novel game-theoretic approach that treats the temporally-coupled robust RL problem as a partially observable two-player zero-sum game.

Continuous Control

Is Imitation All You Need? Generalized Decision-Making with Dual-Phase Training

1 code implementation • ICCV 2023 • Yao Wei, Yanchao Sun, Ruijie Zheng, Sai Vemprala, Rogerio Bonatti, Shuhang Chen, Ratnesh Madaan, Zhongjie Ba, Ashish Kapoor, Shuang Ma

We introduce DualMind, a generalist agent designed to tackle various decision-making tasks that addresses challenges posed by current methods, such as overfitting behaviors and dependence on task-specific fine-tuning.

All, Decision Making

TACO: Temporal Latent Action-Driven Contrastive Loss for Visual Reinforcement Learning

1 code implementation • 22 Jun 2023 • Ruijie Zheng, Xiyao Wang, Yanchao Sun, Shuang Ma, Jieyu Zhao, Huazhe Xu, Hal Daumé III, Furong Huang

Despite recent progress in reinforcement learning (RL) from raw pixel data, sample inefficiency continues to present a substantial obstacle.

Continuous Control

Is Model Ensemble Necessary? Model-based RL via a Single Model with Lipschitz Regularized Value Function

no code implementations • 2 Feb 2023 • Ruijie Zheng, Xiyao Wang, Huazhe Xu, Furong Huang

To test this hypothesis, we devise two practical robust training mechanisms through computing the adversarial noise and regularizing the value network's spectral norm to directly regularize the Lipschitz condition of the value functions.

Model-based Reinforcement Learning

Efficient Adversarial Training without Attacking: Worst-Case-Aware Robust Reinforcement Learning

1 code implementation • 12 Oct 2022 • Yongyuan Liang, Yanchao Sun, Ruijie Zheng, Furong Huang

Recent studies reveal that a well-trained deep reinforcement learning (RL) policy can be particularly vulnerable to adversarial perturbations on input observations.

Deep Reinforcement Learning, Reinforcement Learning

Certifiably Robust Policy Learning against Adversarial Communication in Multi-agent Systems

no code implementations • 21 Jun 2022 • Yanchao Sun, Ruijie Zheng, Parisa Hassanzadeh, Yongyuan Liang, Soheil Feizi, Sumitra Ganesh, Furong Huang

Communication is important in many multi-agent reinforcement learning (MARL) problems for agents to share information and make good decisions.

Multi-agent Reinforcement Learning

Transfer RL across Observation Feature Spaces via Model-Based Regularization

no code implementations • ICLR 2022 • Yanchao Sun, Ruijie Zheng, Xiyao Wang, Andrew Cohen, Furong Huang

In many reinforcement learning (RL) applications, the observation space is specified by human developers and restricted by physical realizations, and may thus be subject to dramatic changes over time (e.g., an increased number of observable features).

Reinforcement Learning (RL)

Who Is the Strongest Enemy? Towards Optimal and Efficient Evasion Attacks in Deep RL

1 code implementation • ICLR 2022 • Yanchao Sun, Ruijie Zheng, Yongyuan Liang, Furong Huang

Existing works on adversarial RL either use heuristics-based methods that may not find the strongest adversary, or directly train an RL-based adversary by treating the agent as a part of the environment, which can find the optimal adversary but may become intractable in a large state space.

MuJoCo, Reinforcement Learning (RL)

Cortical Features for Defense Against Adversarial Audio Attacks

1 code implementation • 30 Jan 2021 • Ilya Kavalerov, Ruijie Zheng, Wojciech Czaja, Rama Chellappa

We propose using a computational model of the auditory cortex as a defense against adversarial attacks on audio.
