Search Results for author: Ruihan Yang

Found 38 papers, 18 papers with code

The Lighthouse of Language: Enhancing LLM Agents via Critique-Guided Improvement

no code implementations 20 Mar 2025 Ruihan Yang, Fanghua Ye, Jian Li, Siyu Yuan, Yikai Zhang, Zhaopeng Tu, Xiaolong Li, Deqing Yang

In this work, we introduce Critique-Guided Improvement (CGI), a novel two-player framework comprising an actor model that explores an environment and a critic model that generates detailed natural language feedback.

WildLMa: Long Horizon Loco-Manipulation in the Wild

no code implementations 22 Nov 2024 Ri-Zhao Qiu, Yuchen Song, Xuanbin Peng, Sai Aneesh Suryadevara, Ge Yang, Minghuan Liu, Mazeyu Ji, Chengzhe Jia, Ruihan Yang, Xueyan Zou, Xiaolong Wang

'In-the-wild' mobile manipulation aims to deploy robots in diverse real-world environments, which requires the robot to (1) have skills that generalize across object configurations; (2) be capable of long-horizon task execution in diverse environments; and (3) perform complex manipulation beyond pick-and-place.

Imitation Learning

LoGU: Long-form Generation with Uncertainty Expressions

1 code implementation 18 Oct 2024 Ruihan Yang, Caiqi Zhang, Zhisong Zhang, Xinting Huang, Sen Yang, Nigel Collier, Dong Yu, Deqing Yang

To tackle these challenges, we propose a refinement-based data collection framework and a two-stage training pipeline.

Form Instruction Following

Atomic Calibration of LLMs in Long-Form Generations

1 code implementation 17 Oct 2024 Caiqi Zhang, Ruihan Yang, Zhisong Zhang, Xinting Huang, Sen Yang, Dong Yu, Nigel Collier

Existing research on LLM calibration has primarily focused on short-form tasks, providing a single confidence score at the response level (macro calibration).

Form

ACE: A Cross-Platform Visual-Exoskeletons System for Low-Cost Dexterous Teleoperation

no code implementations 21 Aug 2024 Shiqi Yang, Minghuan Liu, Yuzhe Qin, Runyu Ding, Jialong Li, Xuxin Cheng, Ruihan Yang, Sha Yi, Xiaolong Wang

Compared to previous systems, which often require hardware customization according to different robots, our single system can generalize to humanoid hands, arm-hands, arm-gripper, and quadruped-gripper systems with high-precision teleoperation.

Imitation Learning

Bunny-VisionPro: Real-Time Bimanual Dexterous Teleoperation for Imitation Learning

no code implementations 3 Jul 2024 Runyu Ding, Yuzhe Qin, Jiyue Zhu, Chengzhe Jia, Shiqi Yang, Ruihan Yang, Xiaojuan Qi, Xiaolong Wang

Our system's ability to handle bimanual manipulations while prioritizing safety and real-time performance makes it a powerful tool for advancing dexterous manipulation and imitation learning.

Imitation Learning

SelfGoal: Your Language Agents Already Know How to Achieve High-level Goals

no code implementations 7 Jun 2024 Ruihan Yang, Jiangjie Chen, Yikai Zhang, Siyu Yuan, Aili Chen, Kyle Richardson, Yanghua Xiao, Deqing Yang

Language agents powered by large language models (LLMs) are increasingly valuable as decision-making tools in domains such as gaming and programming.

Decision Making

Fast Samplers for Inverse Problems in Iterative Refinement Models

2 code implementations 27 May 2024 Kushagra Pandey, Ruihan Yang, Stephan Mandt

Constructing fast samplers for unconditional diffusion and flow-matching models has received much attention recently; however, existing methods for solving inverse problems, such as super-resolution, inpainting, or deblurring, still require hundreds to thousands of iterative steps to obtain high-quality results.

Deblurring Image Restoration +1

From Persona to Personalization: A Survey on Role-Playing Language Agents

no code implementations 28 Apr 2024 Jiangjie Chen, Xintao Wang, Rui Xu, Siyu Yuan, Yikai Zhang, Wei Shi, Jian Xie, Shuang Li, Ruihan Yang, Tinghui Zhu, Aili Chen, Nianqi Li, Lida Chen, Caiyu Hu, Siye Wu, Scott Ren, Ziquan Fu, Yanghua Xiao

Through this work, we aim to establish a clear taxonomy of RPLA research and applications, facilitate future research in this critical and ever-evolving field, and pave the way for a future where humans and RPLAs coexist in harmony.

In-Context Learning Instruction Following

Visual Whole-Body Control for Legged Loco-Manipulation

no code implementations 25 Mar 2024 Minghuan Liu, Zixuan Chen, Xuxin Cheng, Yandong Ji, Ri-Zhao Qiu, Ruihan Yang, Xiaolong Wang

We propose a framework that can conduct the whole-body control autonomously with visual observations.

Position

Learning Generalizable Feature Fields for Mobile Manipulation

no code implementations 12 Mar 2024 Ri-Zhao Qiu, Yafei Hu, Yuchen Song, Ge Yang, Yang Fu, Jianglong Ye, Jiteng Mu, Ruihan Yang, Nikolay Atanasov, Sebastian Scherer, Xiaolong Wang

An open problem in mobile manipulation is how to represent objects and scenes in a unified manner so that robots can use both for navigation and manipulation.

Novel View Synthesis

Expressive Whole-Body Control for Humanoid Robots

no code implementations 26 Feb 2024 Xuxin Cheng, Yandong Ji, Junming Chen, Ruihan Yang, Ge Yang, Xiaolong Wang

Can we enable humanoid robots to generate rich, diverse, and expressive motions in the real world?

Imitation Learning

GumbelSoft: Diversified Language Model Watermarking via the GumbelMax-trick

1 code implementation 20 Feb 2024 Jiayi Fu, Xuandong Zhao, Ruihan Yang, Yuansen Zhang, Jiangjie Chen, Yanghua Xiao

Large language models (LLMs) excel at generating human-like text, but also raise concerns about misuse in fake news and academic dishonesty.

Diversity Language Modeling +1

Harmonic Mobile Manipulation

no code implementations 11 Dec 2023 Ruihan Yang, Yejin Kim, Rose Hendrix, Aniruddha Kembhavi, Xiaolong Wang, Kiana Ehsani

Recent advancements in robotics have enabled robots to navigate complex scenes or manipulate diverse objects independently.

Navigate

Precipitation Downscaling with Spatiotemporal Video Diffusion

no code implementations 11 Dec 2023 Prakhar Srivastava, Ruihan Yang, Gavin Kerrigan, Gideon Dresdner, Jeremy McGibbon, Christopher Bretherton, Stephan Mandt

In climate science and meteorology, high-resolution local precipitation (rain and snowfall) predictions are limited by the computational costs of simulation-based methods.

Optical Flow Estimation Super-Resolution

CMMD: Contrastive Multi-Modal Diffusion for Video-Audio Conditional Modeling

no code implementations 8 Dec 2023 Ruihan Yang, Hannes Gamper, Sebastian Braun

We introduce a multi-modal diffusion model tailored for the bi-directional conditional generation of video and audio.

Audio Generation

Generalized Animal Imitator: Agile Locomotion with Versatile Motion Prior

no code implementations 2 Oct 2023 Ruihan Yang, Zhuoqun Chen, Jianhan Ma, Chongyi Zheng, Yiyu Chen, Quan Nguyen, Xiaolong Wang

This paper introduces the Versatile Instructable Motion prior (VIM) - a Reinforcement Learning framework designed to incorporate a range of agile locomotion tasks suitable for advanced robotic applications.

Neural Volumetric Memory for Visual Locomotion Control

no code implementations CVPR 2023 Ruihan Yang, Ge Yang, Xiaolong Wang

To solve this problem, we follow the paradigm in computer vision that explicitly models the 3D geometry of the scene and propose Neural Volumetric Memory (NVM), a geometric memory architecture that explicitly accounts for the SE(3) equivariance of the 3D world.

3D geometry

Lossy Image Compression with Conditional Diffusion Models

1 code implementation NeurIPS 2023 Ruihan Yang, Stephan Mandt

This paper outlines an end-to-end optimized lossy image compression framework using diffusion generative models.

Decoder Image Compression +1

Diffusion Probabilistic Modeling for Video Generation

1 code implementation 16 Mar 2022 Ruihan Yang, Prakhar Srivastava, Stephan Mandt

Denoising diffusion probabilistic models are a promising new class of generative models that mark a milestone in high-quality image generation.

Denoising Image Generation +2

SC2 Benchmark: Supervised Compression for Split Computing

1 code implementation 16 Mar 2022 Yoshitomo Matsubara, Ruihan Yang, Marco Levorato, Stephan Mandt

With the increasing demand for deep learning models on mobile devices, splitting neural network computation between the device and a more powerful edge server has become an attractive solution.

Data Compression Edge-computing +2

Vision-Guided Quadrupedal Locomotion in the Wild with Multi-Modal Delay Randomization

1 code implementation 29 Sep 2021 Chieko Sarah Imai, Minghao Zhang, Yuchen Zhang, Marcin Kierebinski, Ruihan Yang, Yuzhe Qin, Xiaolong Wang

While Reinforcement Learning (RL) provides a promising paradigm for agile locomotion skills with vision inputs in simulation, it is still very challenging to deploy the RL policy in the real world.

Reinforcement Learning (RL)

Supervised Compression for Resource-Constrained Edge Computing Systems

2 code implementations 21 Aug 2021 Yoshitomo Matsubara, Ruihan Yang, Marco Levorato, Stephan Mandt

There has been much interest in deploying deep learning algorithms on low-powered devices, including smartphones, drones, and medical sensors.

Data Compression Edge-computing +2

DexMV: Imitation Learning for Dexterous Manipulation from Human Videos

1 code implementation 12 Aug 2021 Yuzhe Qin, Yueh-Hua Wu, Shaowei Liu, Hanwen Jiang, Ruihan Yang, Yang Fu, Xiaolong Wang

While significant progress has been made on understanding hand-object interactions in computer vision, it is still very challenging for robots to perform complex dexterous manipulation.

Imitation Learning motion retargeting +1

Insights from Generative Modeling for Neural Video Compression

1 code implementation 28 Jul 2021 Ruihan Yang, Yibo Yang, Joseph Marino, Stephan Mandt

While recent machine learning research has revealed connections between deep generative models such as VAEs and rate-distortion losses used in learned compression, most of this work has focused on images.

Video Compression

Learning Vision-Guided Quadrupedal Locomotion End-to-End with Cross-Modal Transformers

1 code implementation ICLR 2022 Ruihan Yang, Minghao Zhang, Nicklas Hansen, Huazhe Xu, Xiaolong Wang

Our key insight is that proprioceptive states only offer contact measurements for immediate reaction, whereas an agent equipped with visual sensory observations can learn to proactively maneuver environments with obstacles and uneven terrain by anticipating changes in the environment many steps ahead.

Reinforcement Learning (RL)

Scale Space Flow with Autoregressive Priors

no code implementations ICLR Workshop Neural_Compression 2021 Ruihan Yang, Yibo Yang, Joseph Marino, Stephan Mandt

There has been a recent surge of interest in neural video compression models that combine data-driven dimensionality reduction with learned entropy coding.

Dimensionality Reduction Open-Ended Question Answering +1

Generative Video Compression as Hierarchical Variational Inference

no code implementations AABI Symposium 2021 Ruihan Yang, Yibo Yang, Joseph Marino, Stephan Mandt

Recent work by Marino et al. (2020) showed improved performance in sequential density estimation by combining masked autoregressive flows with hierarchical latent variable models.

Density Estimation Variational Inference +1

Hierarchical Autoregressive Modeling for Neural Video Compression

3 code implementations ICLR 2021 Ruihan Yang, Yibo Yang, Joseph Marino, Stephan Mandt

Recent work by Marino et al. (2020) showed improved performance in sequential density estimation by combining masked autoregressive flows with hierarchical latent variable models.

Density Estimation Video Compression

PIANOTREE VAE: Structured Representation Learning for Polyphonic Music

2 code implementations 17 Aug 2020 Ziyu Wang, Yiyi Zhang, Yixiao Zhang, Junyan Jiang, Ruihan Yang, Junbo Zhao, Gus Xia

The dominant approach for music representation learning involves the deep unsupervised model family variational autoencoder (VAE).

Music Generation Representation Learning

Suphx: Mastering Mahjong with Deep Reinforcement Learning

no code implementations 30 Mar 2020 Junjie Li, Sotetsu Koyamada, Qiwei Ye, Guoqing Liu, Chao Wang, Ruihan Yang, Li Zhao, Tao Qin, Tie-Yan Liu, Hsiao-Wuen Hon

Artificial Intelligence (AI) has achieved great success in many domains, and game AI is widely regarded as its beachhead since the dawn of AI.

Deep Reinforcement Learning reinforcement-learning +1

Multi-Task Reinforcement Learning with Soft Modularization

1 code implementation NeurIPS 2020 Ruihan Yang, Huazhe Xu, Yi Wu, Xiaolong Wang

While training multiple tasks jointly allows the policies to share parameters across different tasks, the optimization problem becomes non-trivial: it remains unclear what parameters in the network should be reused across tasks, and how the gradients from different tasks may interfere with each other.

Meta-Learning Multi-Task Learning +3

Deep Music Analogy Via Latent Representation Disentanglement

3 code implementations 9 Jun 2019 Ruihan Yang, Dingsu Wang, Ziyu Wang, Tianyao Chen, Junyan Jiang, Gus Xia

Analogy-making is a key method for computer algorithms to generate both natural and creative music pieces.

Disentanglement Rhythm

Inspecting and Interacting with Meaningful Music Representations using VAE

no code implementations 18 Apr 2019 Ruihan Yang, Tianyao Chen, Yiyi Zhang, Gus Xia

Variational Autoencoders (VAEs) have already achieved great results on image generation and recently made promising progress on music generation.

Disentanglement Image Generation +2

MatchBench: An Evaluation of Feature Matchers

no code implementations 7 Aug 2018 Jia-Wang Bian, Ruihan Yang, Yun Liu, Le Zhang, Ming-Ming Cheng, Ian Reid, WenHai Wu

This leads to a critical gap in this field: there are no standard datasets or evaluation metrics for comparing different feature matchers fairly.
