Search Results for author: Dapeng Tao

Found 25 papers, 12 papers with code

Dual-task Mutual Reinforcing Embedded Joint Video Paragraph Retrieval and Grounding

1 code implementation • 26 Nov 2024 • Mengzhao Wang, Huafeng Li, Yafei Zhang, Jinxing Li, Minghong Xie, Dapeng Tao

The retrieval branch uses inter-video contrastive learning to roughly align the global features of paragraphs and videos, reducing modality differences and constructing a coarse-grained feature space to break free from the need for correspondence between paragraphs and videos.

Contrastive Learning Retrieval

Phrase Decoupling Cross-Modal Hierarchical Matching and Progressive Position Correction for Visual Grounding

1 code implementation • 31 Oct 2024 • Minghong Xie, Mengzhao Wang, Huafeng Li, Yafei Zhang, Dapeng Tao, Zhengtao Yu

In addition, a corresponding target object position progressive correction strategy is defined based on the hierarchical matching mechanism to achieve accurate positioning for the target object described in the text.

Object Position +2

Hi-GMAE: Hierarchical Graph Masked Autoencoders

1 code implementation • 17 May 2024 • Chuang Liu, Zelin Yao, Yibing Zhan, Xueqi Ma, Dapeng Tao, Jia Wu, Wenbin Hu, Shirui Pan, Bo Du

To ensure masking uniformity of subgraphs across these scales, we propose a novel coarse-to-fine strategy that initiates masking at the coarsest scale and progressively back-projects the mask to the finer scales.

Graph Neural Network Self-Supervised Learning

Where to Mask: Structure-Guided Masking for Graph Masked Autoencoders

1 code implementation • 24 Apr 2024 • Chuang Liu, Yuyao Wang, Yibing Zhan, Xueqi Ma, Dapeng Tao, Jia Wu, Wenbin Hu

To this end, we introduce a novel structure-guided masking strategy (i.e., StructMAE), designed to refine the existing GMAE models.

Transfer Learning

Single-Image HDR Reconstruction Assisted Ghost Suppression and Detail Preservation Network for Multi-Exposure HDR Imaging

1 code implementation • 7 Mar 2024 • Huafeng Li, Zhenmei Yang, Yafei Zhang, Dapeng Tao, Zhengtao Yu

This network, comprising single-frame HDR reconstruction with enhanced stop image (SHDR-ESI) and SHDR-ESI-assisted multi-exposure HDR reconstruction (SHDRA-MHDR), effectively leverages the ghost-free characteristic of single-frame HDR reconstruction and the detail-enhancing capability of ESI in oversaturated areas.

HDR Reconstruction Image Reconstruction

Large Language Models as an Indirect Reasoner: Contrapositive and Contradiction for Automated Reasoning

no code implementations • 6 Feb 2024 • Yanfang Zhang, Yiliu Sun, Yibing Zhan, Dapeng Tao, DaCheng Tao, Chen Gong

The experimental results on popular LLMs, such as GPT-3.5-turbo and Gemini-pro, show that our IR method enhances the overall accuracy of factual reasoning by 27.33% and mathematical proof by 31.43%, when compared with traditional DR methods.

CLIP-Driven Semantic Discovery Network for Visible-Infrared Person Re-Identification

1 code implementation • 11 Jan 2024 • Xiaoyan Yu, Neng Dong, Liehuang Zhu, Hao Peng, Dapeng Tao

Additionally, acknowledging the complementary nature of semantic details across different modalities, we integrate text features from the bimodal language descriptions to achieve comprehensive semantics.

Person Re-Identification

Exploring Sparsity in Graph Transformers

no code implementations • 9 Dec 2023 • Chuang Liu, Yibing Zhan, Xueqi Ma, Liang Ding, Dapeng Tao, Jia Wu, Wenbin Hu, Bo Du

Graph Transformers (GTs) have achieved impressive results on various graph-related tasks.

Embodied Multi-Modal Agent trained by an LLM from a Parallel TextWorld

1 code implementation • CVPR 2024 • Yijun Yang, Tianyi Zhou, Kanxue Li, Dapeng Tao, Lusong Li, Li Shen, Xiaodong He, Jing Jiang, Yuhui Shi

While large language models (LLMs) excel in a simulated world of texts, they struggle to interact with the more realistic world without perceptions of other modalities such as visual or audio signals.

Imitation Learning

SpliceMix: A Cross-scale and Semantic Blending Augmentation Strategy for Multi-label Image Classification

1 code implementation • 26 Nov 2023 • Lei Wang, Yibing Zhan, Leilei Ma, Dapeng Tao, Liang Ding, Chen Gong

The "splice" in our method is two-fold: 1) Each mixed image is a splice of several downsampled images in the form of a grid, where the semantics of images attending to mixing are blended without object deficiencies for alleviating co-occurred bias; 2) We splice mixed images and the original mini-batch to form a new SpliceMixed mini-batch, which allows an image with different scales to contribute to training together.

Data Augmentation Multi-Label Image Classification
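
The two-fold splice described in the SpliceMix entry above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `splice_grid` and `splicemix_batch`, the block-average downsampling, the fixed 2x2 grid, and the label rule (union of the multi-hot labels of the spliced images) are all assumptions made for illustration.

```python
import numpy as np

def splice_grid(images, rows=2, cols=2):
    """Downsample rows*cols images and tile them into one image of the original size.

    images: array of shape (rows*cols, H, W, C), with H divisible by rows
    and W divisible by cols. Downsampling is plain block averaging
    (an assumption for this sketch).
    """
    n, h, w, c = images.shape
    assert n == rows * cols and h % rows == 0 and w % cols == 0
    hs, ws = h // rows, w // cols
    # Average each (rows x cols) pixel block to shrink every image to (hs, ws).
    small = images.reshape(n, hs, rows, ws, cols, c).mean(axis=(2, 4))
    # Lay the shrunken images out on a rows x cols grid, row-major.
    grid = small.reshape(rows, cols, hs, ws, c)
    return grid.transpose(0, 2, 1, 3, 4).reshape(h, w, c)

def splicemix_batch(images, labels, rows=2, cols=2, rng=None):
    """Append one spliced image to the original mini-batch.

    labels: multi-hot array of shape (N, num_classes). The mixed label is the
    union of the spliced images' labels (an assumption for this sketch).
    """
    rng = np.random.default_rng() if rng is None else rng
    idx = rng.choice(len(images), size=rows * cols, replace=False)
    mixed_image = splice_grid(images[idx], rows, cols)
    mixed_label = labels[idx].max(axis=0)
    new_images = np.concatenate([images, mixed_image[None]], axis=0)
    new_labels = np.concatenate([labels, mixed_label[None]], axis=0)
    return new_images, new_labels
```

Keeping the original mini-batch alongside the spliced image is what lets full-scale and downsampled views of the same content contribute to one training step.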

Chasing Consistency in Text-to-3D Generation from a Single Image

no code implementations • 7 Sep 2023 • Yichen Ouyang, Wenhao Chai, Jiayi Ye, Dapeng Tao, Yibing Zhan, Gaoang Wang

In light of the above issues, we present Consist3D, a three-stage framework Chasing for semantic-, geometric-, and saturation-Consistent Text-to-3D generation from a single image, in which the first two stages aim to learn parameterized consistency tokens, and the last stage is for optimization.

3D Generation Text to 3D

Progressive Feature Mining and External Knowledge-Assisted Text-Pedestrian Image Retrieval

no code implementations • 23 Aug 2023 • Huafeng Li, Shedan Yang, Yafei Zhang, Dapeng Tao, Zhengtao Yu

In addition, to further reduce the negative impact of modal discrepancy and text diversity on cross-modal matching, we propose to use knowledge from other samples of the same modality, i.e., external knowledge, to enhance identity-consistent features and weaken identity-inconsistent features.

Diversity Image Retrieval +1

Prototype-Driven and Multi-Expert Integrated Multi-Modal MR Brain Tumor Image Segmentation

1 code implementation • 22 Jul 2023 • Yafei Zhang, Zhiyuan Li, Huafeng Li, Dapeng Tao

To this end, a multi-modal MR brain tumor segmentation method that is driven by tumor prototypes and integrates multiple experts is proposed.

Brain Tumor Segmentation Image Segmentation +2

Free-Form Composition Networks for Egocentric Action Recognition

no code implementations • 13 Jul 2023 • Haoran Wang, Qinghua Cheng, Baosheng Yu, Yibing Zhan, Dapeng Tao, Liang Ding, Haibin Ling

We evaluated our method on three popular egocentric action recognition datasets, Something-Something V2, H2O, and EPIC-KITCHENS-100, and the experimental results demonstrate the effectiveness of the proposed method for handling data scarcity problems, including long-tailed and few-shot egocentric action recognition.

Action Recognition Temporal Action Localization

Adversarial Self-Attack Defense and Spatial-Temporal Relation Mining for Visible-Infrared Video Person Re-Identification

no code implementations • 8 Jul 2023 • Huafeng Li, Le Xu, Yafei Zhang, Dapeng Tao, Zhengtao Yu

In this work, changes in view, posture, and background, together with modal discrepancy, are considered the main factors that perturb person identity features.

Adversarial Attack Video-Based Person Re-Identification

Comprehensive Graph Gradual Pruning for Sparse Training in Graph Neural Networks

no code implementations • 18 Jul 2022 • Chuang Liu, Xueqi Ma, Yibing Zhan, Liang Ding, Dapeng Tao, Bo Du, Wenbin Hu, Danilo Mandic

However, the LTH-based methods suffer from two major drawbacks: 1) they require exhaustive and iterative training of dense models, resulting in an extremely large training computation cost, and 2) they only trim graph structures and model parameters but ignore the node feature dimension, where significant redundancy exists.

Node Classification

Learning Domain-invariant Graph for Adaptive Semi-supervised Domain Adaptation with Few Labeled Source Samples

no code implementations • 21 Aug 2020 • Jinfeng Li, Weifeng Liu, Yicong Zhou, Jun Yu, Dapeng Tao

Traditional domain adaptation algorithms assume that enough labeled data, which are treated as prior knowledge, are available in the source domain.

Domain Adaptation Graph Learning +1

Hetero-Center Loss for Cross-Modality Person Re-Identification

no code implementations • 22 Oct 2019 • Yuanxin Zhu, Zhao Yang, Li Wang, Sai Zhao, Xiao Hu, Dapeng Tao

With the joint supervision of Cross-Entropy (CE) loss and HC loss, the network is trained to achieve two vital objectives: maximizing both inter-class discrepancy and intra-class cross-modality similarity.

Cross-Modality Person Re-identification Person Re-Identification
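
The listing does not give the HC loss formula, so the following is only a sketch of one plausible reading of "intra-class cross-modality similarity": for each identity, penalize the squared Euclidean distance between the mean feature (center) of its samples in one modality and the center of its samples in the other. The function name `hetero_center_loss` and its argument layout are hypothetical.

```python
import numpy as np

def hetero_center_loss(feat_a, labels_a, feat_b, labels_b):
    """Sum of squared distances between per-class centers of two modalities.

    feat_a, feat_b: feature arrays of shape (N_a, D) and (N_b, D).
    labels_a, labels_b: integer identity labels of shape (N_a,) and (N_b,).
    Only identities present in both modalities contribute (sketch assumption).
    """
    loss = 0.0
    for c in np.intersect1d(labels_a, labels_b):
        center_a = feat_a[labels_a == c].mean(axis=0)  # class center, modality A
        center_b = feat_b[labels_b == c].mean(axis=0)  # class center, modality B
        loss += float(np.sum((center_a - center_b) ** 2))
    return loss
```

In a joint objective of the kind the entry describes, a term like this would be added to the cross-entropy loss with a weighting factor; the weight is not stated in the listing.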

Knowledge Amalgamation from Heterogeneous Networks by Common Feature Learning

2 code implementations • 24 Jun 2019 • Sihui Luo, Xinchao Wang, Gongfan Fang, Yao Hu, Dapeng Tao, Mingli Song

An increasing number of well-trained deep networks have been released online by researchers and developers, enabling the community to reuse them in a plug-and-play way without accessing the training annotations.

Ensemble p-Laplacian Regularization for Remote Sensing Image Recognition

no code implementations • 21 Jun 2018 • Xueqi Ma, Weifeng Liu, Dapeng Tao, Yicong Zhou

Therefore, we develop an ensemble p-Laplacian regularization (EpLapR) to fully approximate the intrinsic manifold of the data distribution.

Anchor-based Nearest Class Mean Loss for Convolutional Neural Networks

no code implementations • 22 Apr 2018 • Fusheng Hao, Jun Cheng, Lei Wang, Xinchao Wang, Jianzhong Cao, Xiping Hu, Dapeng Tao

Discriminative features are obtained by constraining the deep CNNs to map training samples as close as possible to their corresponding anchors.

Image Classification

Multiview Cauchy Estimator Feature Embedding for Depth and Inertial Sensor-Based Human Action Recognition

no code implementations • 7 Aug 2016 • Yanan Guo, Lei LI, Weifeng Liu, Jun Cheng, Dapeng Tao

Since human actions can be characterized by multiple feature representations extracted from Kinect and inertial sensors, multiview features must be encoded into a unified space optimal for human action recognition.

Action Recognition Temporal Action Localization
