no code implementations • 20 Oct 2024 • Xiaowei Chi, Hengyuan Zhang, Chun-Kai Fan, Xingqun Qi, Rongyu Zhang, Anthony Chen, Chi-Min Chan, Wei Xue, Wenhan Luo, Shanghang Zhang, Yike Guo
Yet, applying the world model for accurate video prediction is quite challenging due to the complex and dynamic intentions of the various scenes in practice.
no code implementations • 16 Sep 2024 • Peng Li, Wangguandong Zheng, YuAn Liu, Tao Yu, Yangguang Li, Xingqun Qi, Mengfei Li, Xiaowei Chi, Siyu Xia, Wei Xue, Wenhan Luo, Qifeng Liu, Yike Guo
Detailed and photorealistic 3D human modeling is essential for various applications and has seen tremendous progress.
no code implementations • 3 Sep 2024 • Chuanyang Ma, Jiangtao Li, Xingqun Qi, Muyi Sun, Huiling Zhou
In this paper, we introduce a systematic framework, Pest Manager, for precise pest counting and identification within the invisible grain pile environment.
1 code implementation • 30 Jul 2024 • Xiaowei Chi, Yatian Wang, Aosong Cheng, Pengjun Fang, Zeyue Tian, Yingqing He, Zhaoyang Liu, Xingqun Qi, Jiahao Pan, Rongyu Zhang, Mengfei Li, Ruibin Yuan, Yanbing Jiang, Wei Xue, Wenhan Luo, Qifeng Chen, Shanghang Zhang, Qifeng Liu, Yike Guo
To fill this gap, we present MMTrail, a large-scale multi-modality video-language dataset incorporating more than 20M trailer clips with visual captions, and 2M high-quality clips with multimodal captions.
no code implementations • 27 May 2024 • Xingqun Qi, Hengyuan Zhang, Yatian Wang, Jiahao Pan, Chen Liu, Peng Li, Xiaowei Chi, Mengfei Li, Wei Xue, Shanghang Zhang, Wenhan Luo, Qifeng Liu, Yike Guo
Here, we construct the audio ControlNet through a trainable copy of our pre-trained diffusion model.
no code implementations • 19 May 2024 • Peng Li, YuAn Liu, Xiaoxiao Long, Feihu Zhang, Cheng Lin, Mengfei Li, Xingqun Qi, Shanghang Zhang, Wenhan Luo, Ping Tan, Wenping Wang, Qifeng Liu, Yike Guo
Specifically, these methods assume that the input images should comply with a predefined camera type, e.g., a perspective camera with a fixed focal length, leading to distorted shapes when the assumption fails.
1 code implementation • 29 Nov 2023 • Xiaowei Chi, Rongyu Zhang, Zhengkai Jiang, Yijiang Liu, Yatian Wang, Xingqun Qi, Wenhan Luo, Peng Gao, Shanghang Zhang, Qifeng Liu, Yike Guo
Moreover, to further enhance the effectiveness of $M^{3}Adapter$ while preserving the coherence of semantic context comprehension, we introduce a two-stage $M^{3}FT$ fine-tuning strategy.
no code implementations • CVPR 2024 • Xingqun Qi, Jiahao Pan, Peng Li, Ruibin Yuan, Xiaowei Chi, Mengfei Li, Wenhan Luo, Wei Xue, Shanghang Zhang, Qifeng Liu, Yike Guo
In addition, the lack of large-scale available datasets with emotional transition speech and corresponding 3D human gestures also limits progress on this task.
no code implementations • 31 Jul 2023 • Chen Liu, Peike Li, Xingqun Qi, Hu Zhang, Lincheng Li, Dadong Wang, Xin Yu
However, we observed that prior methods are prone to segmenting a certain salient object in a video regardless of the audio information.
1 code implementation • 30 May 2023 • Xingqun Qi, Chen Liu, Lincheng Li, Jie Hou, Haoran Xin, Xin Yu
In this work, we propose EmotionGesture, a novel framework for synthesizing vivid and diverse emotional co-speech 3D gestures from audio.
1 code implementation • CVPR 2023 • Xingqun Qi, Chen Liu, Muyi Sun, Lincheng Li, Changjie Fan, Xin Yu
Considering the asymmetric gestures and motions of two hands, we introduce a Spatial-Residual Memory (SRM) module to model spatial interaction between the body and each hand by residual learning.
no code implementations • 2 Nov 2022 • Hao Dang, Yuekai Zhang, Xingqun Qi, Wanting Zhou, Muyi Sun
To tackle this problem, we propose LightVessel, a Similarity Knowledge Distillation Framework, for lightweight coronary artery vessel segmentation.
1 code implementation • 26 Jul 2022 • Xingqun Qi, Zhuojie Wu, Min Ren, Muyi Sun, Caifeng Shan, Zhenan Sun
Considering the domain-invariant representative vectors in MSAN, we propose two generalizable knowledge distillation schemes for cross-domain distillation, Dual Contrastive Graph Distillation (DCGD) and Domain-Invariant Cross Distillation (DICD).
no code implementations • 6 Apr 2022 • Zhuojie Wu, Xingqun Qi, Zijian Wang, Wanting Zhou, Kun Yuan, Muyi Sun, Zhenan Sun
Furthermore, to improve the inter-coordination between the corrupted and non-corrupted regions and enhance the intra-coordination in corrupted regions, we design InCo2 Loss, a pair of similarity-based losses to constrain the feature consistency.
no code implementations • 8 Feb 2022 • Fan Ji, Muyi Sun, Xingqun Qi, Qi Li, Zhenan Sun
Furthermore, we design a novel Memory Refinement Loss (MR Loss) for feature alignment in the memory module, which enhances the accuracy of memory slots in an unsupervised manner.
no code implementations • 5 Jan 2022 • Xingqun Qi, Muyi Sun, Zijian Wang, Jiaming Liu, Qi Li, Fang Zhao, Shanghang Zhang, Caifeng Shan
To keep the generated faces structurally coordinated, the IRSG models inter-class structural relations among all facial components through graph representation learning.
Generative Adversarial Network
Graph Representation Learning
no code implementations • CVPR 2022 • Zijian Wang, Xingqun Qi, Kun Yuan, Muyi Sun
However, such methods fail to exploit the spatial correlation between the disentangled features.
no code implementations • 29 Jun 2021 • Xingqun Qi, Muyi Sun, Weining Wang, Xiaoxiao Dong, Qi Li, Caifeng Shan
To tackle these challenges, we propose a novel Semantic-Driven Generative Adversarial Network (SDGAN) which embeds global structure-level style injection and local class-level knowledge re-weighting.
no code implementations • IEEE Access 2019 • Hao Dang, Muyi Sun, Guanhong Zhang, Xingqun Qi, Xiaoguang Zhou, Qing Chang
Atrial fibrillation (AF), a common abnormal heartbeat rhythm, is a life-threatening recurrent disease that affects older adults.
Ranked #4 on Atrial Fibrillation Detection on MIT-BIH AF (using extra training data)