no code implementations • 20 Mar 2025 • Ruihan Yang, Fanghua Ye, Jian Li, Siyu Yuan, Yikai Zhang, Zhaopeng Tu, Xiaolong Li, Deqing Yang
In this work, we introduce Critique-Guided Improvement (CGI), a novel two-player framework comprising an actor model that explores an environment and a critic model that generates detailed natural language feedback.
no code implementations • 22 Nov 2024 • Ri-Zhao Qiu, Yuchen Song, Xuanbin Peng, Sai Aneesh Suryadevara, Ge Yang, Minghuan Liu, Mazeyu Ji, Chengzhe Jia, Ruihan Yang, Xueyan Zou, Xiaolong Wang
'In-the-wild' mobile manipulation aims to deploy robots in diverse real-world environments, which requires the robot to (1) have skills that generalize across object configurations; (2) be capable of long-horizon task execution in diverse environments; and (3) perform complex manipulation beyond pick-and-place.
1 code implementation • 18 Oct 2024 • Ruihan Yang, Caiqi Zhang, Zhisong Zhang, Xinting Huang, Sen Yang, Nigel Collier, Dong Yu, Deqing Yang
To tackle these challenges, we propose a refinement-based data collection framework and a two-stage training pipeline.
1 code implementation • 17 Oct 2024 • Caiqi Zhang, Ruihan Yang, Zhisong Zhang, Xinting Huang, Sen Yang, Dong Yu, Nigel Collier
Existing research on LLM calibration has primarily focused on short-form tasks, providing a single confidence score at the response level (macro calibration).
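The notion of a single response-level confidence score can be made concrete with the standard binned expected calibration error (ECE). The sketch below is a generic illustration of that textbook metric, not the paper's own calibration measure:

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: |bin accuracy - bin confidence|, weighted by bin size."""
    conf = np.asarray(confidences, dtype=float)
    corr = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (conf > lo) & (conf <= hi)  # assign each sample to one bin
        if mask.any():
            ece += mask.mean() * abs(corr[mask].mean() - conf[mask].mean())
    return ece

# Toy example: two overconfident-free bins, each off by 0.05.
ece = expected_calibration_error([0.95, 0.95, 0.55, 0.55], [1, 1, 1, 0])
```

A perfectly calibrated model (confidence equal to empirical accuracy in every bin) yields an ECE of zero; long-form, claim-level calibration instead scores many such confidences per response.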
no code implementations • 21 Aug 2024 • Shiqi Yang, Minghuan Liu, Yuzhe Qin, Runyu Ding, Jialong Li, Xuxin Cheng, Ruihan Yang, Sha Yi, Xiaolong Wang
Compared to previous systems, which often require hardware customization according to different robots, our single system can generalize to humanoid hands, arm-hands, arm-gripper, and quadruped-gripper systems with high-precision teleoperation.
no code implementations • 3 Jul 2024 • Runyu Ding, Yuzhe Qin, Jiyue Zhu, Chengzhe Jia, Shiqi Yang, Ruihan Yang, Xiaojuan Qi, Xiaolong Wang
Our system's ability to handle bimanual manipulations while prioritizing safety and real-time performance makes it a powerful tool for advancing dexterous manipulation and imitation learning.
no code implementations • 7 Jun 2024 • Ruihan Yang, Jiangjie Chen, Yikai Zhang, Siyu Yuan, Aili Chen, Kyle Richardson, Yanghua Xiao, Deqing Yang
Language agents powered by large language models (LLMs) are increasingly valuable as decision-making tools in domains such as gaming and programming.
1 code implementation • 3 Jun 2024 • An-Chieh Cheng, Hongxu Yin, Yang Fu, Qiushan Guo, Ruihan Yang, Jan Kautz, Xiaolong Wang, Sifei Liu
Vision Language Models (VLMs) have demonstrated remarkable performance in 2D vision and language tasks.
2 code implementations • 27 May 2024 • Kushagra Pandey, Ruihan Yang, Stephan Mandt
Constructing fast samplers for unconditional diffusion and flow-matching models has received much attention recently; however, existing methods for solving inverse problems, such as super-resolution, inpainting, or deblurring, still require hundreds to thousands of iterative steps to obtain high-quality results.
no code implementations • 28 Apr 2024 • Jiangjie Chen, Xintao Wang, Rui Xu, Siyu Yuan, Yikai Zhang, Wei Shi, Jian Xie, Shuang Li, Ruihan Yang, Tinghui Zhu, Aili Chen, Nianqi Li, Lida Chen, Caiyu Hu, Siye Wu, Scott Ren, Ziquan Fu, Yanghua Xiao
Through this work, we aim to establish a clear taxonomy of RPLA research and applications, facilitate future research in this critical and ever-evolving field, and pave the way for a future where humans and RPLAs coexist in harmony.
no code implementations • 25 Mar 2024 • Minghuan Liu, Zixuan Chen, Xuxin Cheng, Yandong Ji, Ri-Zhao Qiu, Ruihan Yang, Xiaolong Wang
We propose a framework that conducts whole-body control autonomously from visual observations.
no code implementations • 12 Mar 2024 • Ri-Zhao Qiu, Yafei Hu, Yuchen Song, Ge Yang, Yang Fu, Jianglong Ye, Jiteng Mu, Ruihan Yang, Nikolay Atanasov, Sebastian Scherer, Xiaolong Wang
An open problem in mobile manipulation is how to represent objects and scenes in a unified manner so that robots can use both for navigation and manipulation.
no code implementations • 26 Feb 2024 • Xuxin Cheng, Yandong Ji, Junming Chen, Ruihan Yang, Ge Yang, Xiaolong Wang
Can we enable humanoid robots to generate rich, diverse, and expressive motions in the real world?
1 code implementation • 20 Feb 2024 • Jiayi Fu, Xuandong Zhao, Ruihan Yang, Yuansen Zhang, Jiangjie Chen, Yanghua Xiao
Large language models (LLMs) excel at generating human-like text, but they also raise concerns about misuse in fake news and academic dishonesty.
no code implementations • 11 Dec 2023 • Ruihan Yang, Yejin Kim, Rose Hendrix, Aniruddha Kembhavi, Xiaolong Wang, Kiana Ehsani
Recent advancements in robotics have enabled robots to navigate complex scenes or manipulate diverse objects independently.
no code implementations • 11 Dec 2023 • Prakhar Srivastava, Ruihan Yang, Gavin Kerrigan, Gideon Dresdner, Jeremy McGibbon, Christopher Bretherton, Stephan Mandt
In climate science and meteorology, high-resolution local precipitation (rain and snowfall) predictions are limited by the computational costs of simulation-based methods.
no code implementations • 8 Dec 2023 • Ruihan Yang, Hannes Gamper, Sebastian Braun
We introduce a multi-modal diffusion model tailored for the bi-directional conditional generation of video and audio.
no code implementations • 2 Oct 2023 • Ruihan Yang, Zhuoqun Chen, Jianhan Ma, Chongyi Zheng, Yiyu Chen, Quan Nguyen, Xiaolong Wang
This paper introduces the Versatile Instructable Motion prior (VIM) - a Reinforcement Learning framework designed to incorporate a range of agile locomotion tasks suitable for advanced robotic applications.
no code implementations • CVPR 2023 • Ruihan Yang, Ge Yang, Xiaolong Wang
To solve this problem, we follow the paradigm in computer vision that explicitly models the 3D geometry of the scene and propose Neural Volumetric Memory (NVM), a geometric memory architecture that explicitly accounts for the SE(3) equivariance of the 3D world.
1 code implementation • NeurIPS 2023 • Ruihan Yang, Stephan Mandt
This paper outlines an end-to-end optimized lossy image compression framework using diffusion generative models.
1 code implementation • 16 Mar 2022 • Ruihan Yang, Prakhar Srivastava, Stephan Mandt
Denoising diffusion probabilistic models are a promising new class of generative models that mark a milestone in high-quality image generation.
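For readers unfamiliar with this model class, the closed-form forward noising process of a denoising diffusion probabilistic model can be sketched in a few lines of NumPy. The linear beta schedule below is an assumed textbook default, not necessarily the configuration used in this paper:

```python
import numpy as np

def forward_diffuse(x0, t, betas, rng=np.random.default_rng(0)):
    """Sample x_t ~ q(x_t | x_0) = N(sqrt(abar_t) * x0, (1 - abar_t) * I)."""
    abar = np.cumprod(1.0 - betas)          # cumulative product of alphas
    noise = rng.standard_normal(x0.shape)
    return np.sqrt(abar[t]) * x0 + np.sqrt(1.0 - abar[t]) * noise

betas = np.linspace(1e-4, 0.02, 1000)       # assumed linear schedule
x0 = np.ones((4, 4))                        # toy "image"
xT = forward_diffuse(x0, 999, betas)        # at the last step, x_T is nearly pure noise
```

Generation then runs this process in reverse, with a learned network predicting the noise to remove at each step.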
1 code implementation • 16 Mar 2022 • Yoshitomo Matsubara, Ruihan Yang, Marco Levorato, Stephan Mandt
With the increasing demand for deep learning models on mobile devices, splitting neural network computation between the device and a more powerful edge server has become an attractive solution.
1 code implementation • 29 Sep 2021 • Chieko Sarah Imai, Minghao Zhang, Yuchen Zhang, Marcin Kierebinski, Ruihan Yang, Yuzhe Qin, Xiaolong Wang
While Reinforcement Learning (RL) provides a promising paradigm for agile locomotion skills with vision inputs in simulation, it is still very challenging to deploy the RL policy in the real world.
2 code implementations • 21 Aug 2021 • Yoshitomo Matsubara, Ruihan Yang, Marco Levorato, Stephan Mandt
There has been much interest in deploying deep learning algorithms on low-powered devices, including smartphones, drones, and medical sensors.
1 code implementation • 12 Aug 2021 • Yuzhe Qin, Yueh-Hua Wu, Shaowei Liu, Hanwen Jiang, Ruihan Yang, Yang Fu, Xiaolong Wang
While significant progress has been made on understanding hand-object interactions in computer vision, it is still very challenging for robots to perform complex dexterous manipulation.
1 code implementation • 28 Jul 2021 • Ruihan Yang, Yibo Yang, Joseph Marino, Stephan Mandt
While recent machine learning research has revealed connections between deep generative models such as VAEs and rate-distortion losses used in learned compression, most of this work has focused on images.
1 code implementation • ICLR 2022 • Ruihan Yang, Minghao Zhang, Nicklas Hansen, Huazhe Xu, Xiaolong Wang
Our key insight is that proprioceptive states only offer contact measurements for immediate reaction, whereas an agent equipped with visual sensory observations can learn to proactively maneuver environments with obstacles and uneven terrain by anticipating changes in the environment many steps ahead.
no code implementations • ICLR 2021 Neural Compression Workshop • Ruihan Yang, Yibo Yang, Joseph Marino, Stephan Mandt
There has been a recent surge of interest in neural video compression models that combine data-driven dimensionality reduction with learned entropy coding.
no code implementations • Approximate Inference (AABI) Symposium 2021 • Ruihan Yang, Yibo Yang, Joseph Marino, Stephan Mandt
Recent work by Marino et al. (2020) showed improved performance in sequential density estimation by combining masked autoregressive flows with hierarchical latent variable models.
3 code implementations • ICLR 2021 • Ruihan Yang, Yibo Yang, Joseph Marino, Stephan Mandt
Recent work by Marino et al. (2020) showed improved performance in sequential density estimation by combining masked autoregressive flows with hierarchical latent variable models.
2 code implementations • 17 Aug 2020 • Ziyu Wang, Yiyi Zhang, Yixiao Zhang, Junyan Jiang, Ruihan Yang, Junbo Zhao, Gus Xia
The dominant approach for music representation learning involves the deep unsupervised model family variational autoencoder (VAE).
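As a generic illustration of the VAE family mentioned here (not the paper's music model), the KL regularizer of the standard Gaussian ELBO has a simple closed form:

```python
import numpy as np

def gaussian_kl(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over latent dims."""
    return 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar)

# Toy numbers: a 2-D posterior close to the prior incurs a near-zero penalty.
mu = np.array([0.1, -0.1])
logvar = np.array([0.0, 0.0])
kl = gaussian_kl(mu, logvar)
```

Training a VAE balances this KL term against a reconstruction loss; representation-learning variants for music typically add structure to the latent space on top of this objective.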
no code implementations • 30 Mar 2020 • Junjie Li, Sotetsu Koyamada, Qiwei Ye, Guoqing Liu, Chao Wang, Ruihan Yang, Li Zhao, Tao Qin, Tie-Yan Liu, Hsiao-Wuen Hon
Artificial Intelligence (AI) has achieved great success in many domains, and game AI has been widely regarded as its beachhead since the dawn of AI.
1 code implementation • NeurIPS 2020 • Ruihan Yang, Huazhe Xu, Yi Wu, Xiaolong Wang
While training multiple tasks jointly allows the policies to share parameters across different tasks, the optimization problem becomes non-trivial: it remains unclear which parameters in the network should be reused across tasks, and how the gradients from different tasks may interfere with each other.
Ranked #1 on Meta-Learning on MT50
3 code implementations • 9 Jun 2019 • Ruihan Yang, Dingsu Wang, Ziyu Wang, Tianyao Chen, Junyan Jiang, Gus Xia
Analogy-making is a key method for computer algorithms to generate both natural and creative music pieces.
no code implementations • 28 May 2019 • Ruihan Yang, Qiwei Ye, Tie-Yan Liu
Based on that, we propose an end-to-end algorithm that learns an exploration policy via meta-learning.
no code implementations • 18 Apr 2019 • Ruihan Yang, Tianyao Chen, Yiyi Zhang, Gus Xia
Variational Autoencoders (VAEs) have already achieved great results on image generation and have recently made promising progress on music generation.
1 code implementation • 7 Feb 2019 • Łukasz Kidziński, Carmichael Ong, Sharada Prasanna Mohanty, Jennifer Hicks, Sean F. Carroll, Bo Zhou, Hongsheng Zeng, Fan Wang, Rongzhong Lian, Hao Tian, Wojciech Jaśkowski, Garrett Andersen, Odd Rune Lykkebø, Nihat Engin Toklu, Pranav Shyam, Rupesh Kumar Srivastava, Sergey Kolesnikov, Oleksii Hrinchuk, Anton Pechenko, Mattias Ljungström, Zhen Wang, Xu Hu, Zehong Hu, Minghui Qiu, Jun Huang, Aleksei Shpilman, Ivan Sosin, Oleg Svidchenko, Aleksandra Malysheva, Daniel Kudenko, Lance Rane, Aditya Bhatt, Zhengfei Wang, Penghui Qi, Zeyang Yu, Peng Peng, Quan Yuan, Wenxin Li, Yunsheng Tian, Ruihan Yang, Pingchuan Ma, Shauharda Khadka, Somdeb Majumdar, Zach Dwiel, Yinyin Liu, Evren Tumer, Jeremy Watson, Marcel Salathé, Sergey Levine, Scott Delp
In the NeurIPS 2018 Artificial Intelligence for Prosthetics challenge, participants were tasked with building a controller for a musculoskeletal model with a goal of matching a given time-varying velocity vector.
no code implementations • 7 Aug 2018 • Jia-Wang Bian, Ruihan Yang, Yun Liu, Le Zhang, Ming-Ming Cheng, Ian Reid, WenHai Wu
This points to a critical gap in the field: there are no standard datasets or evaluation metrics for fairly evaluating different feature matchers.