no code implementations • 25 Jan 2025 • Zhi-Song Liu, Chenhang He, Lei LI
To address these inefficiencies, we propose PUFM, a flow matching approach to directly map sparse point clouds to their high-fidelity dense counterparts.
no code implementations • 13 Jul 2024 • Ruihuang Li, Zhengqiang Zhang, Chenhang He, Zhiyuan Ma, Vishal M. Patel, Lei Zhang
Recent vision-language pre-training models have exhibited remarkable generalization ability in zero-shot recognition tasks.
2 code implementations • 12 Jul 2024 • Yabin Zhang, Wenjie Zhu, Chenhang He, Lei Zhang
The LAPT framework operates autonomously, requiring only ID class names as input and eliminating the need for manual intervention.
1 code implementation • 15 Jun 2024 • Guowen Zhang, Lue Fan, Chenhang He, Zhen Lei, Zhaoxiang Zhang, Lei Zhang
Inspired by the recent advances of state space models (SSMs), we present a Voxel SSM, termed as Voxel Mamba, which employs a group-free strategy to serialize the whole space of voxels into a single sequence.
no code implementations • 1 Mar 2024 • Weiwei Lin, Chenhang He, Man-Wai Mak, Jiachen Lian, Kong Aik Lee
This forces the model to learn a speaker distribution disentangled from the semantic content.
1 code implementation • 1 Jan 2024 • Chenhang He, Ruihuang Li, Guowen Zhang, Lei Zhang
In this paper, we introduce ScatterFormer, which to the best of our knowledge, is the first to directly apply attention to voxels across different windows as a single sequence.
1 code implementation • 1 Dec 2023 • Xi Yang, Chenhang He, jianqi ma, Lei Zhang
To ensure the content consistency among adjacent frames, we exploit the temporal dynamics in LR videos to guide the diffusion process by optimizing the latent sampling path with a motion-guided loss, ensuring that the generated HR video maintains a coherent and continuous visual flow.
no code implementations • 14 May 2023 • Weiwei Lin, Chenhang He, Man-Wai Mak, Youzhi Tu
Self-supervised learning (SSL) speech models such as wav2vec and HuBERT have demonstrated state-of-the-art performance on automatic speech recognition (ASR) and proved to be extremely useful in low label-resource settings.
Automatic Speech Recognition
Automatic Speech Recognition (ASR)
+2
1 code implementation • CVPR 2023 • Shuai Li, Minghan Li, Ruihuang Li, Chenhang He, Lei Zhang
The positive and negative weights of these soft anchors are dynamically adjusted during training so that they can contribute more to ``representation learning'' in the early training stage, and contribute more to ``duplicated prediction removal'' in the later stage.
1 code implementation • CVPR 2023 • Chenhang He, Ruihuang Li, Yabin Zhang, Shuai Li, Lei Zhang
Current top-performing multi-frame detectors mostly follow a Detect-and-Fuse framework, which extracts features from each frame of the sequence and fuses them to detect the objects in the current frame.
1 code implementation • CVPR 2023 • Ruihuang Li, Chenhang He, Yabin Zhang, Shuai Li, Liyi Chen, Lei Zhang
Weakly supervised instance segmentation using only bounding box annotations has recently attracted much research attention.
1 code implementation • CVPR 2023 • Ruihuang Li, Chenhang He, Shuai Li, Yabin Zhang, Lei Zhang
The representative instance segmentation methods mostly segment different object instances with a mask of the fixed resolution, e. g., 28*28 grid.
1 code implementation • 7 Jul 2022 • Yabin Zhang, Jiehong Lin, Chenhang He, Yongwei Chen, Kui Jia, Lei Zhang
In this work, we make the first attempt, to the best of our knowledge, to consider the local geometry information explicitly into the masked auto-encoding, and propose a novel Masked Surfel Prediction (MaskSurf) method.
1 code implementation • CVPR 2022 • Chenhang He, Ruihuang Li, Shuai Li, Lei Zhang
VoxSeT is built upon a voxel-based set attention (VSA) module, which reduces the self-attention in each voxel by two cross-attentions and models features in a hidden space induced by a group of latent codes.
1 code implementation • CVPR 2022 • Ruihuang Li, Shuai Li, Chenhang He, Yabin Zhang, Xu Jia, Lei Zhang
One popular solution to this challenging task is self-training, which selects high-scoring predictions on target samples as pseudo labels for training.
Ranked #9 on
Image-to-Image Translation
on SYNTHIA-to-Cityscapes
1 code implementation • CVPR 2022 • Shuai Li, Chenhang He, Ruihuang Li, Lei Zhang
Existing LA methods mostly focus on the design of pos weighting function, while the neg weight is directly derived from the pos weight.
no code implementations • 28 Jul 2021 • Chenhang He, Jianqiang Huang, Xian-Sheng Hua, Lei Zhang
Current geometry-based monocular 3D object detection models can efficiently detect objects by leveraging perspective geometry, but their performance is limited due to the absence of accurate depth information.
1 code implementation • CVPR 2020 • Chenhang He, Hui Zeng, Jianqiang Huang, Xian-Sheng Hua, Lei Zhang
The auxiliary network is jointly optimized, by two point-level supervisions, to guide the convolutional features in the backbone network to be aware of the object structure.
2 code implementations • 24 Jan 2020 • Anjany Sekuboyina, Malek E. Husseini, Amirhossein Bayat, Maximilian Löffler, Hans Liebl, Hongwei Li, Giles Tetteh, Jan Kukačka, Christian Payer, Darko Štern, Martin Urschler, Maodong Chen, Dalong Cheng, Nikolas Lessmann, Yujin Hu, Tianfu Wang, Dong Yang, Daguang Xu, Felix Ambellan, Tamaz Amiranashvili, Moritz Ehlke, Hans Lamecker, Sebastian Lehnert, Marilia Lirio, Nicolás Pérez de Olaguer, Heiko Ramm, Manish Sahu, Alexander Tack, Stefan Zachow, Tao Jiang, Xinjun Ma, Christoph Angerman, Xin Wang, Kevin Brown, Alexandre Kirszenberg, Élodie Puybareau, Di Chen, Yiwei Bai, Brandon H. Rapazzo, Timyoas Yeah, Amber Zhang, Shangliang Xu, Feng Hou, Zhiqiang He, Chan Zeng, Zheng Xiangshang, Xu Liming, Tucker J. Netherton, Raymond P. Mumme, Laurence E. Court, Zixun Huang, Chenhang He, Li-Wen Wang, Sai Ho Ling, Lê Duy Huynh, Nicolas Boutry, Roman Jakubicek, Jiri Chmelik, Supriti Mulay, Mohanasankar Sivaprakasam, Johannes C. Paetzold, Suprosanna Shit, Ivan Ezhov, Benedikt Wiestler, Ben Glocker, Alexander Valentinitsch, Markus Rempfler, Björn H. Menze, Jan S. Kirschke
Two datasets containing a total of 374 multi-detector CT scans from 355 patients were prepared and 4505 vertebrae have individually been annotated at voxel-level by a human-machine hybrid algorithm (https://osf. io/nqjyw/, https://osf. io/t98fz/).