Search Results for author: Xuxin Cheng

Found 19 papers, 5 papers with code

Visual Whole-Body Control for Legged Loco-Manipulation

no code implementations • 25 Mar 2024 • Minghuan Liu, Zixuan Chen, Xuxin Cheng, Yandong Ji, Rizhao Qiu, Ruihan Yang, Xiaolong Wang

That is, the robot can control the legs and the arm at the same time to extend its workspace.

Position

Retrieval is Accurate Generation

no code implementations • 27 Feb 2024 • Bowen Cao, Deng Cai, Leyang Cui, Xuxin Cheng, Wei Bi, Yuexian Zou, Shuming Shi

To address this, we propose to initialize the training oracles using linguistic heuristics and, more importantly, bootstrap the oracles through iterative self-reinforcement.

Language Modelling Retrieval +1

Expressive Whole-Body Control for Humanoid Robots

no code implementations • 26 Feb 2024 • Xuxin Cheng, Yandong Ji, Junming Chen, Ruihan Yang, Ge Yang, Xiaolong Wang

Can we enable humanoid robots to generate rich, diverse, and expressive motions in the real world?

Imitation Learning

Embracing Language Inclusivity and Diversity in CLIP through Continual Language Learning

1 code implementation • 30 Jan 2024 • Bang Yang, Yong Dai, Xuxin Cheng, Yaowei Li, Asif Raza, Yuexian Zou

To alleviate CF raised by covariate shift and lexical overlap, we further propose a novel approach that ensures the identical distribution of all token embeddings during initialization and regularizes token embedding learning during training.

Text Retrieval

ML-LMCL: Mutual Learning and Large-Margin Contrastive Learning for Improving ASR Robustness in Spoken Language Understanding

no code implementations • 19 Nov 2023 • Xuxin Cheng, Bowen Cao, Qichen Ye, Zhihong Zhu, Hongxiang Li, Yuexian Zou

Specifically, in fine-tuning, we apply mutual learning and train two SLU models on the manual transcripts and the ASR transcripts, respectively, aiming to iteratively share knowledge between these two models.

Automatic Speech Recognition (ASR) +4

Extreme Parkour with Legged Robots

no code implementations • 25 Sep 2023 • Xuxin Cheng, Kexin Shi, Ananye Agarwal, Deepak Pathak

In this paper, we take a similar approach to developing robot parkour on a small low-cost robot with imprecise actuation and a single front-facing depth camera for perception which is low-frequency, jittery, and prone to artifacts.

G2L: Semantically Aligned and Uniform Video Grounding via Geodesic and Game Theory

1 code implementation • ICCV 2023 • Hongxiang Li, Meng Cao, Xuxin Cheng, Yaowei Li, Zhihong Zhu, Yuexian Zou

Due to two issues in video grounding, (1) the co-existence of some visual entities in both the ground truth and other moments, i.e., semantic overlapping, and (2) only a few moments in the video being annotated, i.e., the sparse annotation dilemma, vanilla contrastive learning is unable to model the correlations between temporally distant moments and learns inconsistent video representations.

Contrastive Learning Video Grounding

Unify, Align and Refine: Multi-Level Semantic Alignment for Radiology Report Generation

no code implementations • ICCV 2023 • Yaowei Li, Bang Yang, Xuxin Cheng, Zhihong Zhu, Hongxiang Li, Yuexian Zou

Automatic radiology report generation has attracted enormous research interest due to its practical value in reducing the workload of radiologists.

Sentence

Legs as Manipulator: Pushing Quadrupedal Agility Beyond Locomotion

no code implementations • 20 Mar 2023 • Xuxin Cheng, Ashish Kumar, Deepak Pathak

Locomotion has seen dramatic progress for walking or running across challenging terrains.

PoseRAC: Pose Saliency Transformer for Repetitive Action Counting

1 code implementation • 15 Mar 2023 • Ziyu Yao, Xuxin Cheng, Yuexian Zou

Moreover, we introduce a pose-level method, PoseRAC, which is based on this representation and achieves state-of-the-art performance on two new version datasets by using Pose Saliency Annotation to annotate salient poses for training.

Repetitive Action Counting

Exploiting Auxiliary Caption for Video Grounding

no code implementations • 15 Jan 2023 • Hongxiang Li, Meng Cao, Xuxin Cheng, Zhihong Zhu, Yaowei Li, Yuexian Zou

Video grounding aims to locate a moment of interest matching the given query sentence from an untrimmed video.

Contrastive Learning Dense Video Captioning +2

A Dynamic Graph Interactive Framework with Label-Semantic Injection for Spoken Language Understanding

1 code implementation • 8 Nov 2022 • Zhihong Zhu, Weiyuan Xu, Xuxin Cheng, Tengtao Song, Yuexian Zou

Multi-intent detection and slot filling joint models are gaining increasing traction since they are closer to complicated real-world scenarios.

Intent Detection slot-filling +2

Deep Whole-Body Control: Learning a Unified Policy for Manipulation and Locomotion

no code implementations • 18 Oct 2022 • Zipeng Fu, Xuxin Cheng, Deepak Pathak

The standard hierarchical control pipeline for such legged manipulators is to decouple the controller into that of manipulation and locomotion.

Automated Lane Change Strategy using Proximal Policy Optimization-based Deep Reinforcement Learning

no code implementations • 7 Feb 2020 • Fei Ye, Xuxin Cheng, Pin Wang, Ching-Yao Chan, Jiucai Zhang

The simulation results demonstrate that lane change maneuvers can be efficiently learned and executed in a safe, smooth, and efficient manner.

Autonomous Driving reinforcement-learning +1

Driving Decision and Control for Autonomous Lane Change based on Deep Reinforcement Learning

no code implementations • 23 Apr 2019 • Tianyu Shi, Pin Wang, Xuxin Cheng, Ching-Yao Chan, Ding Huang

We apply Deep Q-network (DQN) with the consideration of safety during the task for deciding whether to conduct the maneuver.

Autonomous Driving Decision Making +3
