Search Results for author: Chenggang Yan

Found 40 papers, 23 papers with code

R$^3$Net: Relation-embedded Representation Reconstruction Network for Change Captioning

1 code implementation • EMNLP 2021 • Yunbin Tu, Liang Li, Chenggang Yan, Shengxiang Gao, Zhengtao Yu

In this paper, we propose a Relation-embedded Representation Reconstruction Network (R$^3$Net) to explicitly distinguish the real change from the large amount of clutter and irrelevant changes.

Caption Generation Relation +1


Semantic Relation-aware Difference Representation Learning for Change Captioning

1 code implementation • Findings (ACL) 2021 • Yunbin Tu, Tingting Yao, Liang Li, Jiedong Lou, Shengxiang Gao, Zhengtao Yu, Chenggang Yan

Relation Representation Learning


Context-aware Difference Distilling for Multi-change Captioning

no code implementations • 31 May 2024 • Yunbin Tu, Liang Li, Li Su, Zheng-Jun Zha, Chenggang Yan, Qingming Huang

Given an image pair, CARD first decouples context features that aggregate all similar/dissimilar semantics, termed common/difference context features.

Decoder


Progressive Depth Decoupling and Modulating for Flexible Depth Completion

no code implementations • 15 May 2024 • Zhiwen Yang, Jiehua Zhang, Liang Li, Chenggang Yan, Yaoqi Sun, Haibing Yin

However, previous depth discretization methods are easily affected by depth distribution variations across different scenes, resulting in suboptimal scene depth distribution priors.

Depth Completion


Quality-aware Selective Fusion Network for V-D-T Salient Object Detection

1 code implementation • 13 May 2024 • Liuxin Bao, Xiaofei Zhou, Xiankai Lu, Yaoqi Sun, Haibing Yin, Zhenghui Hu, Jiyong Zhang, Chenggang Yan

Therefore, we propose a quality-aware selective fusion network (QSF-Net) to conduct VDT salient object detection, which contains three subnets including the initial feature extraction subnet, the quality-aware region selection subnet, and the region-guided selective fusion subnet.

object-detection Object Detection +2


Benchmarking the Cell Image Segmentation Models Robustness under the Microscope Optical Aberrations

no code implementations • 12 Apr 2024 • Boyuan Peng, Jiaju Chen, Qihui Ye, Minjiang Chen, Peiwu Qin, Chenggang Yan, Dongmei Yu, Zhenglin Chen

Overall, this research aims to guide researchers in effectively utilizing cell segmentation models in the presence of minor optical aberrations.

Benchmarking Cell Segmentation +4


SDPL: Shifting-Dense Partition Learning for UAV-View Geo-Localization

no code implementations • 7 Mar 2024 • Quan Chen, Tingyu Wang, Zihao Yang, Haoran Li, Rongfeng Lu, Yaoqi Sun, Bolun Zheng, Chenggang Yan

Cross-view geo-localization aims to match images of the same target from different platforms, e.g., drone and satellite.

Part-based Representation Learning


Harnessing Intra-group Variations Via a Population-Level Context for Pathology Detection

no code implementations • 4 Mar 2024 • P. Bilha Githinji, Xi Yuan, Zhenglin Chen, Ijaz Gul, Dingqi Shang, Wen Liang, Jianming Deng, Dan Zeng, Dongmei Yu, Chenggang Yan, Peiwu Qin

Realizing sufficient separability between the distributions of healthy and pathological samples is a critical obstacle for pathology detection convolutional models.


StyleDubber: Towards Multi-Scale Style Learning for Movie Dubbing

no code implementations • 20 Feb 2024 • Gaoxiang Cong, Yuankai Qi, Liang Li, Amin Beheshti, Zhedong Zhang, Anton Van Den Hengel, Ming-Hsuan Yang, Chenggang Yan, Qingming Huang

It contains three main components: (1) a multimodal style adaptor operating at the phoneme level to learn pronunciation style from the reference audio and generate intermediate representations informed by the facial emotion presented in the video; (2) an utterance-level style learning module, which guides both the mel-spectrogram decoding and the refining processes from the intermediate embeddings to improve the overall style expression; and (3) a phoneme-guided lip aligner to maintain lip sync.

Voice Cloning


Coupled Confusion Correction: Learning from Crowds with Sparse Annotations

2 code implementations • 12 Dec 2023 • Hansong Zhang, Shikun Li, Dan Zeng, Chenggang Yan, Shiming Ge

Moreover, we cluster the "annotator groups" who share similar expertise so that their confusion matrices could be corrected together.


DEFN: Dual-Encoder Fourier Group Harmonics Network for Three-Dimensional Macular Hole Reconstruction with Stochastic Retinal Defect Augmentation and Dynamic Weight Composition

1 code implementation • 1 Nov 2023 • Xingru Huang, Yihao Guo, Jian Huang, Zhi Li, Tianyun Zhang, Kunyan Cai, Gaopeng Huang, WenHao Chen, Zhaoyang Xu, Liangqiong Qu, Ji Hu, Tinyu Wang, Shaowei Jiang, Chenggang Yan, Yaoqi Sun, Xin Ye, Yaqi Wang

Macular hole diagnosis and treatment rely heavily on spatial and quantitative data, yet the scarcity of such data has impeded the progress of deep learning techniques for effective segmentation and real-time 3D reconstruction.

3D Reconstruction Data Augmentation +2


PC-bzip2: a phase-space continuity enhanced lossless compression algorithm for light field microscopy data

no code implementations • 14 Oct 2023 • Changqing Su, Zihan Lin, You Zhou, Shuai Wang, Yuhan Gao, Chenggang Yan, Bo Xiong

Moreover, by introducing the temporal continuity, our method shows the superior compression ratio on time series data of zebrafish blood vessels.


Self-supervised Cross-view Representation Reconstruction for Change Captioning

1 code implementation • ICCV 2023 • Yunbin Tu, Liang Li, Li Su, Zheng-Jun Zha, Chenggang Yan, Qingming Huang

Change captioning aims to describe the difference between a pair of similar images.

Caption Generation Hallucination


RawHDR: High Dynamic Range Image Reconstruction from a Single Raw Image

1 code implementation • ICCV 2023 • Yunhao Zou, Chenggang Yan, Ying Fu

Unlike existing methods, the core idea of this work is to incorporate more informative Raw sensor data to generate HDR images, aiming to recover scene information in hard regions (the darkest and brightest areas of an HDR scene).

HDR Reconstruction Image Reconstruction


Parsing is All You Need for Accurate Gait Recognition in the Wild

1 code implementation • 31 Aug 2023 • Jinkai Zheng, Xinchen Liu, Shuai Wang, Lihao Wang, Chenggang Yan, Wu Liu

Furthermore, due to the lack of suitable datasets, we build the first parsing-based dataset for gait recognition in the wild, named Gait3D-Parsing, by extending the large-scale and challenging Gait3D dataset.

Gait Recognition in the Wild Human Parsing


Rethinking Boundary Discontinuity Problem for Oriented Object Detection

1 code implementation • CVPR 2024 • Hang Xu, Xinyuan Liu, Haonan Xu, Yike Ma, Zunjie Zhu, Chenggang Yan, Feng Dai

We decouple reversibility and joint-optim from single smoothing function into two distinct entities, which for the first time achieves the objectives of both correcting angular boundary and blending angle with other parameters. Extensive experiments on multiple datasets show that boundary discontinuity problem is well-addressed.

Object object-detection +2


Hybrid Spectral Denoising Transformer with Guided Attention

1 code implementation • ICCV 2023 • Zeqiang Lai, Chenggang Yan, Ying Fu

Challenges in adapting transformer for HSI arise from the capabilities to tackle existing limitations of CNN-based methods in capturing the global and local spatial-spectral correlations while maintaining efficiency and flexibility.

Ranked #1 on Hyperspectral Image Denoising on ICVL-HSI-Gaussian-Blind

Hyperspectral Image Denoising Image Denoising


Iterative Denoiser and Noise Estimator for Self-Supervised Image Denoising

no code implementations • ICCV 2023 • Yunhao Zou, Chenggang Yan, Ying Fu

However, the unavailable noise prior and inefficient feature extraction take these methods away from high practicality and precision.

Image Denoising


Gaussian Label Distribution Learning for Spherical Image Object Detection

no code implementations • CVPR 2023 • Hang Xu, Xinyuan Liu, Qiang Zhao, Yike Ma, Chenggang Yan, Feng Dai

Therefore, we propose GLDL-ATSS as a better training sample selection strategy for objects of the spherical image, which can alleviate the drawback of IoU threshold-based strategy of scale-sample imbalance.

Object object-detection +2


ABINet++: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Spotting

1 code implementation • 19 Nov 2022 • Shancheng Fang, Zhendong Mao, Hongtao Xie, Yuxin Wang, Chenggang Yan, Yongdong Zhang

In this paper, we argue that the limited capacity of language models comes from 1) implicit language modeling; 2) unidirectional feature representation; and 3) language model with noise input.

Ranked #4 on Text Spotting on SCUT-CTW1500

Blocking Language Modelling +2


Learning Cross-view Geo-localization Embeddings via Dynamic Weighted Decorrelation Regularization

no code implementations • 10 Nov 2022 • Tingyu Wang, Zhedong Zheng, Zunjie Zhu, Yuhan Gao, Yi Yang, Chenggang Yan

Cross-view geo-localization aims to spot images of the same location shot from two platforms, e.g., the drone platform and the satellite platform.


Gait Recognition in the Wild with Multi-hop Temporal Switch

1 code implementation • 1 Sep 2022 • Jinkai Zheng, Xinchen Liu, Xiaoyan Gu, Yaoqi Sun, Chuang Gan, Jiyong Zhang, Wu Liu, Chenggang Yan

Current methods that obtain state-of-the-art performance on in-the-lab benchmarks achieve much worse accuracy on the recently proposed in-the-wild datasets because these methods can hardly model the varied temporal dynamics of gait sequences in unconstrained scenes.

Gait Recognition in the Wild


Multi-task Optimization Based Co-training for Electricity Consumption Prediction

no code implementations • 31 May 2022 • Hui Song, A. K. Qin, Chenggang Yan

The performance of MTO-CT is evaluated on solving each of these two sets of tasks in comparison to solving each task in the set independently without knowledge sharing under the same settings, which demonstrates the superiority of MTO-CT in terms of prediction accuracy.

Transfer Learning


Mixed-UNet: Refined Class Activation Mapping for Weakly-Supervised Semantic Segmentation with Multi-scale Inference

no code implementations • 6 May 2022 • Yang Liu, Ersi Zhang, Lulu Xu, Chufan Xiao, Xiaoyun Zhong, Lijin Lian, Fang Li, Bin Jiang, Yuhan Dong, Lan Ma, Qiming Huang, Ming Xu, Yongbing Zhang, Dongmei Yu, Chenggang Yan, Peiwu Qin

Deep learning techniques have shown great potential in medical image processing, particularly through accurate and reliable image segmentation on magnetic resonance imaging (MRI) scans or computed tomography (CT) scans, which allow the localization and diagnosis of lesions.

Computed Tomography (CT) Image Segmentation +3


Multiple-environment Self-adaptive Network for Aerial-view Geo-localization

2 code implementations • 18 Apr 2022 • Tingyu Wang, Zhedong Zheng, Yaoqi Sun, Chenggang Yan, Yi Yang, Tat-Seng Chua

This task is mostly regarded as an image retrieval problem.

Image Retrieval Retrieval


Gait Recognition in the Wild with Dense 3D Representations and A Benchmark

1 code implementation • CVPR 2022 • Jinkai Zheng, Xinchen Liu, Wu Liu, Lingxiao He, Chenggang Yan, Tao Mei

Based on Gait3D, we comprehensively compare our method with existing gait recognition approaches, which reflects the superior performance of our framework and the potential of 3D representations for gait recognition in the wild.

Ranked #1 on Gait Recognition on Gait3D

Gait Recognition in the Wild


R$^3$Net:Relation-embedded Representation Reconstruction Network for Change Captioning

1 code implementation • 20 Oct 2021 • Yunbin Tu, Liang Li, Chenggang Yan, Shengxiang Gao, Zhengtao Yu

In this paper, we propose a Relation-embedded Representation Reconstruction Network (R$^3$Net) to explicitly distinguish the real change from the large amount of clutter and irrelevant changes.

Caption Generation Relation +1


Unbiased IoU for Spherical Image Object Detection

no code implementations • 18 Aug 2021 • Qiang Zhao, Bin Chen, Hang Xu, Yike Ma, XiaoDong Li, Bailan Feng, Chenggang Yan, Feng Dai

In this paper, we first identify that spherical rectangles are unbiased bounding boxes for objects in spherical images, and then propose an analytical method for IoU calculation without any approximations.

Object object-detection +1


TraND: Transferable Neighborhood Discovery for Unsupervised Cross-domain Gait Recognition

1 code implementation • 9 Feb 2021 • Jinkai Zheng, Xinchen Liu, Chenggang Yan, Jiyong Zhang, Wu Liu, XiaoPing Zhang, Tao Mei

Despite significant improvement in gait recognition with deep learning, existing studies still neglect a more practical but challenging scenario -- unsupervised cross-domain gait recognition which aims to learn a model on a labeled dataset then adapts it to an unlabeled dataset.

Gait Recognition


Automated Model Design and Benchmarking of 3D Deep Learning Models for COVID-19 Detection with Chest CT Scans

2 code implementations • 14 Jan 2021 • Xin He, Shihao Wang, Xiaowen Chu, Shaohuai Shi, Jiangping Tang, Xin Liu, Chenggang Yan, Jiyong Zhang, Guiguang Ding

The experimental results show that our automatically searched models (CovidNet3D) outperform the baseline human-designed models on the three datasets with tens of times smaller model size and higher accuracy.

Benchmarking Medical Diagnosis +1


Each Part Matters: Local Patterns Facilitate Cross-view Geo-localization

1 code implementation • 26 Aug 2020 • Tingyu Wang, Zhedong Zheng, Chenggang Yan, Jiyong Zhang, Yaoqi Sun, Bolun Zheng, Yi Yang

Existing methods usually concentrate on mining the fine-grained feature of the geographic target in the image center, but underestimate the contextual information in neighbor areas.

Ranked #3 on Drone navigation on University-1652

Drone navigation Drone-view target localization +2


Depth image denoising using nuclear norm and learning graph model

no code implementations • 9 Aug 2020 • Chenggang Yan, Zhisheng Li, Yongbing Zhang, Yutao Liu, Xiangyang Ji, Yongdong Zhang

Depth image denoising is increasingly becoming a hot research topic because depth images reflect the three-dimensional (3D) scene and can be applied in various fields of computer vision.

Image Denoising Image Restoration


Unsupervised Person Re-identification via Softened Similarity Learning

1 code implementation • CVPR 2020 • Yutian Lin, Lingxi Xie, Yu Wu, Chenggang Yan, Qi Tian

Person re-identification (re-ID) is an important topic in computer vision.

Clustering General Classification +2


Deep Multi-View Enhancement Hashing for Image Retrieval

no code implementations • 1 Feb 2020 • Chenggang Yan, Biao Gong, Yuxuan Wei, Yue Gao

Therefore, we try to introduce the multi-view deep neural network into the hash learning field, and design an efficient and innovative retrieval model, which has achieved a significant improvement in retrieval performance.

Image Retrieval Retrieval


Cascaded Revision Network for Novel Object Captioning

1 code implementation • 6 Aug 2019 • Qianyu Feng, Yu Wu, Hehe Fan, Chenggang Yan, Yi Yang

By this novel cascaded captioning-revising mechanism, CRN can accurately describe images with unseen objects.

Image Captioning Object +3


Approximated Oracle Filter Pruning for Destructive CNN Width Optimization

1 code implementation • 12 May 2019 • Xiaohan Ding, Guiguang Ding, Yuchen Guo, Jungong Han, Chenggang Yan

It is not easy to design and run Convolutional Neural Networks (CNNs) due to: 1) finding the optimal number of filters (i.e., the width) at each layer is tricky, given an architecture; and 2) the computational intensity of CNNs impedes the deployment on computationally limited devices.


Image Classification base on PCA of Multi-view Deep Representation

no code implementations • 12 Mar 2019 • Yaoqi Sun, Liang Li, Liang Zheng, Ji Hu, Yatong Jiang, Chenggang Yan

In the age of information explosion, image classification is the key technology of dealing with and organizing a large number of image data.

Classification General Classification +1


Asymptotic Soft Filter Pruning for Deep Convolutional Neural Networks

2 code implementations • 22 Aug 2018 • Yang He, Xuanyi Dong, Guoliang Kang, Yanwei Fu, Chenggang Yan, Yi Yang

With asymptotic pruning, the information of the training set would be gradually concentrated in the remaining filters, so the subsequent training and pruning process would be stable.

Image Classification


Memory Matching Networks for One-Shot Image Recognition

no code implementations • CVPR 2018 • Qi Cai, Yingwei Pan, Ting Yao, Chenggang Yan, Tao Mei

In this paper, we introduce the new ideas of augmenting Convolutional Neural Networks (CNNs) with Memory and learning to learn the network parameters for the unlabelled images on the fly in one-shot learning.

One-Shot Learning Philosophy


Improving Person Re-identification by Attribute and Identity Learning

2 code implementations • 21 Mar 2017 • Yutian Lin, Liang Zheng, Zhedong Zheng, Yu Wu, Zhilan Hu, Chenggang Yan, Yi Yang

Person re-identification (re-ID) and attribute recognition share a common target at learning pedestrian descriptions.

Ranked #75 on Person Re-Identification on DukeMTMC-reID

Attribute Person Recognition +2

