Search Results for author: Heming Du

Found 14 papers, 3 papers with code

CMamba: Learned Image Compression with State Space Models

no code implementations 7 Feb 2025 Zhuojie Wu, Heming Du, Shuyun Wang, Ming Lu, Haiyang Sun, Yandong Guo, Xin Yu

In this paper, we propose a hybrid Convolution and State Space Models (SSMs) based image compression framework, termed CMamba, to achieve superior rate-distortion performance with low computational complexity.

Image Compression State Space Models

Diverse Sign Language Translation

1 code implementation 25 Oct 2024 Xin Shen, Lei Shen, Shaozu Yuan, Heming Du, Haiyang Sun, Xin Yu

In this work, we introduce a Diverse Sign Language Translation (DivSLT) task, aiming to generate diverse yet accurate translations for sign language videos.

Sign Language Translation Translation +1

MM-WLAuslan: Multi-View Multi-Modal Word-Level Australian Sign Language Recognition Dataset

no code implementations 25 Oct 2024 Xin Shen, Heming Du, Hongwei Sheng, Shuyun Wang, Hui Chen, Huiqiang Chen, Zhuojie Wu, Xiaobiao Du, Jiaying Ying, Ruihan Lu, Qingzheng Xu, Xin Yu

Experimental results indicate that MM-WLAuslan is a challenging ISLR dataset, and we hope this dataset will contribute to the development of Auslan and the advancement of sign languages worldwide.

Sign Language Recognition

TokenBinder: Text-Video Retrieval with One-to-Many Alignment Paradigm

no code implementations 30 Sep 2024 Bingqing Zhang, Zhuo Cao, Heming Du, Xin Yu, Xue Li, Jiajun Liu, Sen Wang

Text-Video Retrieval (TVR) methods typically match query-candidate pairs by aligning text and video features in coarse-grained, fine-grained, or combined (coarse-to-fine) manners.

Retrieval Video Retrieval

Perceive, Reflect, and Plan: Designing LLM Agent for Goal-Directed City Navigation without Instructions

no code implementations 8 Aug 2024 Qingbin Zeng, Qinglong Yang, Shunan Dong, Heming Du, Liang Zheng, Fengli Xu, Yong Li

In the absence of navigation instructions, such abilities are vital for the agent to make high-quality decisions in long-range city navigation.

AI Agent Navigate

Affective Behaviour Analysis via Integrating Multi-Modal Knowledge

no code implementations 16 Mar 2024 Wei Zhang, Feng Qiu, Chen Liu, Lincheng Li, Heming Du, Tiancheng Guo, Xin Yu

Affective Behavior Analysis aims to make technology emotionally smart, creating a world where devices can understand and react to our emotions as humans do.

Divide and Ensemble: Progressively Learning for the Unknown

no code implementations 9 Oct 2023 Hu Zhang, Xin Shen, Heming Du, Huiqiang Chen, Chen Liu, Hongwei Sheng, Qingzheng Xu, MD Wahiduzzaman Khan, Qingtao Yu, Tianqing Zhu, Scott Chapman, Zi Huang, Xin Yu

In the wheat nutrient deficiencies classification challenge, we present the DividE and EnseMble (DEEM) method for progressive test data predictions.

When 3D Bounding-Box Meets SAM: Point Cloud Instance Segmentation with Weak-and-Noisy Supervision

no code implementations 2 Sep 2023 Qingtao Yu, Heming Du, Chen Liu, Xin Yu

CIP-WPIS leverages pretrained knowledge embedded in the 2D foundation model SAM and 3D geometric prior to achieve accurate point-wise instance labels from the bounding box annotations.

Instance Segmentation Semantic Segmentation

Object-Goal Visual Navigation via Effective Exploration of Relations Among Historical Navigation States

no code implementations CVPR 2023 Heming Du, Lincheng Li, Zi Huang, Xin Yu

In HiNL, we propose a History-aware State Estimation (HaSE) module to alleviate the impacts of dominant historical states on the current state estimation.

valid Visual Navigation

Evidence-based Match-status-Aware Gait Recognition for Out-of-Gallery Gait Identification

no code implementations 15 Nov 2022 Heming Du, Chen Liu, Ming Wang, Lincheng Li, Shunli Zhang, Xin Yu

We measure the uncertainty and predict the match status of the recognition results, and thus determine whether the probe is an OOG query. To the best of our knowledge, our method is the first attempt to tackle OOG queries in gait recognition.

Gait Identification Gait Recognition +1

SEFormer: Structure Embedding Transformer for 3D Object Detection

no code implementations 5 Sep 2022 Xiaoyu Feng, Heming Du, Yueqi Duan, Yongpan Liu, Hehe Fan

Effectively preserving and encoding structure features from objects in irregular and sparse LiDAR points is a key challenge for 3D object detection on point clouds.

3D Object Detection Autonomous Driving +2

VTNet: Visual Transformer Network for Object Goal Navigation

no code implementations ICLR 2021 Heming Du, Xin Yu, Liang Zheng

In this paper, we introduce a Visual Transformer Network (VTNet) for learning informative visual representation in navigation.

Object

Learning Object Relation Graph and Tentative Policy for Visual Navigation

1 code implementation ECCV 2020 Heming Du, Xin Yu, Liang Zheng

Aiming to improve these two components, this paper proposes three complementary techniques, object relation graph (ORG), trial-driven imitation learning (IL), and a memory-augmented tentative policy network (TPN).

Imitation Learning Relation +2
