no code implementations • 25 Mar 2024 • Minghuan Liu, Zixuan Chen, Xuxin Cheng, Yandong Ji, Rizhao Qiu, Ruihan Yang, Xiaolong Wang
That is, the robot can control the legs and the arm at the same time to extend its workspace.
no code implementations • 27 Feb 2024 • Bowen Cao, Deng Cai, Leyang Cui, Xuxin Cheng, Wei Bi, Yuexian Zou, Shuming Shi
To address this, we propose to initialize the training oracles using linguistic heuristics and, more importantly, bootstrap the oracles through iterative self-reinforcement.
no code implementations • 26 Feb 2024 • Xuxin Cheng, Yandong Ji, Junming Chen, Ruihan Yang, Ge Yang, Xiaolong Wang
Can we enable humanoid robots to generate rich, diverse, and expressive motions in the real world?
1 code implementation • 30 Jan 2024 • Bang Yang, Yong Dai, Xuxin Cheng, Yaowei Li, Asif Raza, Yuexian Zou
To alleviate CF raised by covariate shift and lexical overlap, we further propose a novel approach that ensures the identical distribution of all token embeddings during initialization and regularizes token embedding learning during training.
no code implementations • 19 Nov 2023 • Xuxin Cheng, Bowen Cao, Qichen Ye, Zhihong Zhu, Hongxiang Li, Yuexian Zou
Specifically, in fine-tuning, we apply mutual learning and train two SLU models on the manual transcripts and the ASR transcripts, respectively, aiming to iteratively share knowledge between these two models.
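As a rough illustration of the mutual-learning step described above, the sketch below computes a pair of losses for two classifiers. Each model gets its own cross-entropy on the gold label plus a KL term that pulls it toward the other model's prediction. The function names, the KL-based formulation, and the weight `alpha` are assumptions made for illustration, not details taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Negative log-likelihood of the gold label."""
    return -math.log(probs[label])


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def mutual_learning_losses(logits_manual, logits_asr, label, alpha=0.5):
    """Return (loss_manual, loss_asr) for one example.

    logits_manual: output of the model fed manual transcripts (hypothetical).
    logits_asr:    output of the model fed ASR transcripts (hypothetical).
    alpha:         assumed weight of the knowledge-sharing term.
    """
    p_manual = softmax(logits_manual)
    p_asr = softmax(logits_asr)
    # Each model is supervised by the label and by its peer's distribution.
    loss_manual = cross_entropy(p_manual, label) + alpha * kl_divergence(p_asr, p_manual)
    loss_asr = cross_entropy(p_asr, label) + alpha * kl_divergence(p_manual, p_asr)
    return loss_manual, loss_asr
```

When both models already agree, the KL terms vanish and each loss reduces to plain cross-entropy; the sharing term only contributes where the two predictions differ.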
1 code implementation • 13 Oct 2023 • Qichen Ye, Junling Liu, Dading Chong, Peilin Zhou, Yining Hua, Fenglin Liu, Meng Cao, ZiMing Wang, Xuxin Cheng, Zhu Lei, Zhenhua Guo
In the CPT and SFT phases, Qilin-Med achieved 38.4% and 40.0% accuracy on the CMExam test set, respectively.
no code implementations • 25 Sep 2023 • Xuxin Cheng, Kexin Shi, Ananye Agarwal, Deepak Pathak
In this paper, we take a similar approach to developing robot parkour on a small, low-cost robot with imprecise actuation and a single front-facing depth camera for perception, which is low-frequency, jittery, and prone to artifacts.
1 code implementation • ICCV 2023 • Hongxiang Li, Meng Cao, Xuxin Cheng, Yaowei Li, Zhihong Zhu, Yuexian Zou
Video grounding suffers from two issues: (1) some visual entities co-exist in both the ground-truth moment and other moments, i.e., semantic overlapping; (2) only a few moments in the video are annotated, i.e., the sparse annotation dilemma. As a result, vanilla contrastive learning is unable to model the correlations between temporally distant moments and learns inconsistent video representations.
no code implementations • 5 Jun 2023 • Qianqian Dong, Zhiying Huang, Qiao Tian, Chen Xu, Tom Ko, Yunlong Zhao, Siyuan Feng, Tang Li, Kexin Wang, Xuxin Cheng, Fengpeng Yue, Ye Bai, Xi Chen, Lu Lu, Zejun Ma, Yuping Wang, Mingxuan Wang, Yuxuan Wang
For the speech synthesis part, we adopt the existing VALL-E X approach and build a unit-based audio language model.
no code implementations • ICCV 2023 • Yaowei Li, Bang Yang, Xuxin Cheng, Zhihong Zhu, Hongxiang Li, Yuexian Zou
Automatic radiology report generation has attracted enormous research interest due to its practical value in reducing the workload of radiologists.
no code implementations • 20 Mar 2023 • Xuxin Cheng, Ashish Kumar, Deepak Pathak
Locomotion has seen dramatic progress for walking or running across challenging terrains.
1 code implementation • 15 Mar 2023 • Ziyu Yao, Xuxin Cheng, Yuexian Zou
Moreover, we introduce a pose-level method, PoseRAC, which is based on this representation and achieves state-of-the-art performance on two new version datasets by using Pose Saliency Annotation to annotate salient poses for training.
Ranked #1 on Repetitive Action Counting on RepCount
no code implementations • 15 Jan 2023 • Hongxiang Li, Meng Cao, Xuxin Cheng, Zhihong Zhu, Yaowei Li, Yuexian Zou
Video grounding aims to locate a moment of interest matching the given query sentence from an untrimmed video.
no code implementations • 7 Dec 2022 • Xuxin Cheng, Qianqian Dong, Fengpeng Yue, Tom Ko, Mingxuan Wang, Yuexian Zou
How to solve the data scarcity problem for end-to-end speech-to-text translation (ST)?
1 code implementation • 8 Nov 2022 • Zhihong Zhu, Weiyuan Xu, Xuxin Cheng, Tengtao Song, Yuexian Zou
Multi-intent detection and slot filling joint models are gaining increasing traction since they are closer to complicated real-world scenarios.
no code implementations • 18 Oct 2022 • Zipeng Fu, Xuxin Cheng, Deepak Pathak
The standard hierarchical control pipeline for such legged manipulators is to decouple the controller into that of manipulation and locomotion.
no code implementations • 26 Mar 2021 • Zhongyu Li, Xuxin Cheng, Xue Bin Peng, Pieter Abbeel, Sergey Levine, Glen Berseth, Koushil Sreenath
Developing robust walking controllers for bipedal robots is a challenging endeavor.
no code implementations • 7 Feb 2020 • Fei Ye, Xuxin Cheng, Pin Wang, Ching-Yao Chan, Jiucai Zhang
The simulation results demonstrate that lane change maneuvers can be learned efficiently and executed in a safe, smooth, and efficient manner.
no code implementations • 23 Apr 2019 • Tianyu Shi, Pin Wang, Xuxin Cheng, Ching-Yao Chan, Ding Huang
We apply a Deep Q-Network (DQN), with safety taken into consideration during the task, to decide whether to conduct the maneuver.
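One way to picture a DQN decision with a safety consideration is to filter the candidate actions through a safety check before picking the one with the highest Q-value. The sketch below does this for a two-action lane-change decision. The action names, the gap-based safety rule, the 10 m threshold, and the function signatures are all hypothetical; the source does not say how safety enters the decision.

```python
import random

# Hypothetical discrete action set for the lane-change decision.
ACTIONS = ("keep_lane", "change_lane")


def is_safe(action, gap_ahead_m, gap_behind_m, min_gap_m=10.0):
    """Assumed safety rule: a lane change needs enough room in the target lane."""
    if action == "keep_lane":
        return True
    return gap_ahead_m >= min_gap_m and gap_behind_m >= min_gap_m


def select_action(q_values, gap_ahead_m, gap_behind_m, epsilon=0.0, rng=random):
    """Epsilon-greedy choice restricted to actions that pass the safety rule.

    q_values: dict mapping each action name to its estimated Q-value,
              e.g. the output of a trained Q-network for the current state.
    """
    safe_actions = [a for a in ACTIONS if is_safe(a, gap_ahead_m, gap_behind_m)]
    if rng.random() < epsilon:
        # Explore, but only among safe actions.
        return rng.choice(safe_actions)
    # Exploit: highest Q-value among safe actions.
    return max(safe_actions, key=lambda a: q_values[a])
```

Because `keep_lane` always passes the check in this sketch, the set of safe actions is never empty, so the agent falls back to staying in its lane whenever the gaps are too small.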