no code implementations • 7 Feb 2025 • Zhuojie Wu, Heming Du, Shuyun Wang, Ming Lu, Haiyang Sun, Yandong Guo, Xin Yu
In this paper, we propose a hybrid Convolution and State Space Models (SSMs) based image compression framework, termed CMamba, to achieve superior rate-distortion performance with low computational complexity.
1 code implementation • 18 Dec 2024 • Zhuo Cao, Bingqing Zhang, Heming Du, Xin Yu, Xue Li, Sen Wang
For short-moment retrieval, FlashVTG increases mAP to 125% of previous SOTA performance.
Ranked #1 on Highlight Detection on TvSum
1 code implementation • 25 Oct 2024 • Xin Shen, Lei Shen, Shaozu Yuan, Heming Du, Haiyang Sun, Xin Yu
In this work, we introduce a Diverse Sign Language Translation (DivSLT) task, aiming to generate diverse yet accurate translations for sign language videos.
no code implementations • 25 Oct 2024 • Xin Shen, Heming Du, Hongwei Sheng, Shuyun Wang, Hui Chen, Huiqiang Chen, Zhuojie Wu, Xiaobiao Du, Jiaying Ying, Ruihan Lu, Qingzheng Xu, Xin Yu
Experiment results indicate that MM-WLAuslan is a challenging ISLR dataset, and we hope this dataset will contribute to the development of Auslan and the advancement of sign languages worldwide.
no code implementations • 30 Sep 2024 • Bingqing Zhang, Zhuo Cao, Heming Du, Xin Yu, Xue Li, Jiajun Liu, Sen Wang
Text-Video Retrieval (TVR) methods typically match query-candidate pairs by aligning text and video features in coarse-grained, fine-grained, or combined (coarse-to-fine) manners.
no code implementations • 8 Aug 2024 • Qingbin Zeng, Qinglong Yang, Shunan Dong, Heming Du, Liang Zheng, Fengli Xu, Yong Li
In the absence of navigation instructions, such abilities are vital for the agent to make high-quality decisions in long-range city navigation.
no code implementations • 16 Mar 2024 • Wei Zhang, Feng Qiu, Chen Liu, Lincheng Li, Heming Du, Tiancheng Guo, Xin Yu
Affective Behavior Analysis aims to make technology emotionally intelligent, creating a world where devices can understand and react to our emotions as humans do.
no code implementations • 9 Oct 2023 • Hu Zhang, Xin Shen, Heming Du, Huiqiang Chen, Chen Liu, Hongwei Sheng, Qingzheng Xu, MD Wahiduzzaman Khan, Qingtao Yu, Tianqing Zhu, Scott Chapman, Zi Huang, Xin Yu
In the wheat nutrient deficiencies classification challenge, we present the DividE and EnseMble (DEEM) method for progressive test data predictions.
no code implementations • 2 Sep 2023 • Qingtao Yu, Heming Du, Chen Liu, Xin Yu
CIP-WPIS leverages pretrained knowledge embedded in the 2D foundation model SAM and 3D geometric prior to achieve accurate point-wise instance labels from the bounding box annotations.
no code implementations • CVPR 2023 • Heming Du, Lincheng Li, Zi Huang, Xin Yu
In HiNL, we propose a History-aware State Estimation (HaSE) module to alleviate the impacts of dominant historical states on the current state estimation.
no code implementations • 15 Nov 2022 • Heming Du, Chen Liu, Ming Wang, Lincheng Li, Shunli Zhang, Xin Yu
We measure the uncertainty and predict the match status of the recognition results, and thus determine whether the probe is an out-of-gallery (OOG) query. To the best of our knowledge, our method is the first attempt to tackle OOG queries in gait recognition.
no code implementations • 5 Sep 2022 • Xiaoyu Feng, Heming Du, Yueqi Duan, Yongpan Liu, Hehe Fan
Effectively preserving and encoding structural features of objects in irregular and sparse LiDAR points is a key challenge for 3D object detection on point clouds.
no code implementations • ICLR 2021 • Heming Du, Xin Yu, Liang Zheng
In this paper, we introduce a Visual Transformer Network (VTNet) for learning informative visual representation in navigation.
1 code implementation • ECCV 2020 • Heming Du, Xin Yu, Liang Zheng
Aiming to improve these two components, this paper proposes three complementary techniques, object relation graph (ORG), trial-driven imitation learning (IL), and a memory-augmented tentative policy network (TPN).