Search Results for author: Zechao Li

Found 32 papers, 9 papers with code

VRP-SAM: SAM with Visual Reference Prompt

no code implementations 27 Feb 2024 Yanpeng Sun, Jiahui Chen, Shan Zhang, Xinyu Zhang, Qiang Chen, Gang Zhang, Errui Ding, Jingdong Wang, Zechao Li

In this paper, we propose a novel Visual Reference Prompt (VRP) encoder that empowers the Segment Anything Model (SAM) to utilize annotated reference images as prompts for segmentation, creating the VRP-SAM model.

Meta-Learning Segmentation

Spatial Structure Constraints for Weakly Supervised Semantic Segmentation

1 code implementation 20 Jan 2024 Tao Chen, Yazhou Yao, Xingguo Huang, Zechao Li, Liqiang Nie, Jinhui Tang

In this paper, we propose spatial structure constraints (SSC) for weakly supervised semantic segmentation to alleviate the unwanted object over-activation of attention expansion.

Object Object Localization +2

Context Disentangling and Prototype Inheriting for Robust Visual Grounding

1 code implementation 19 Dec 2023 Wei Tang, Liang Li, Xuejing Liu, Lu Jin, Jinhui Tang, Zechao Li

In this paper, we propose a novel framework with context disentangling and prototype inheriting for robust visual grounding to handle both scenes.

Visual Grounding

What Large Language Models Bring to Text-rich VQA?

no code implementations 13 Nov 2023 Xuejing Liu, Wei Tang, Xinzhe Ni, Jinghui Lu, Rui Zhao, Zechao Li, Fei Tan

This pipeline achieved superior performance compared to the majority of existing Multimodal Large Language Models (MLLM) on four text-rich VQA datasets.

Image Comprehension Optical Character Recognition (OCR) +2

Learning Contrastive Self-Distillation for Ultra-Fine-Grained Visual Categorization Targeting Limited Samples

no code implementations 10 Nov 2023 Ziye Fang, Xin Jiang, Hao Tang, Zechao Li

In the field of intelligent multimedia analysis, ultra-fine-grained visual categorization (Ultra-FGVC) plays a vital role in distinguishing intricate subcategories within broader categories.

Contrastive Learning Fine-Grained Visual Categorization

Delving into Multimodal Prompting for Fine-grained Visual Classification

no code implementations 16 Sep 2023 Xin Jiang, Hao Tang, Junyao Gao, Xiaoyu Du, Shengfeng He, Zechao Li

In this paper, we aim to fully exploit the capabilities of cross-modal description to tackle FGVC tasks and propose a novel multimodal prompting solution, denoted as MP-FGVC, based on the contrastive language-image pre-training (CLIP) model.

Classification Fine-Grained Image Classification

DiffusionVMR: Diffusion Model for Video Moment Retrieval

no code implementations 29 Aug 2023 Henghao Zhao, Kevin Qinghong Lin, Rui Yan, Zechao Li

Video moment retrieval is a fundamental visual-language task that aims to retrieve target moments from an untrimmed video based on a language query.

Denoising Moment Retrieval +4

M$^3$Net: Multi-view Encoding, Matching, and Fusion for Few-shot Fine-grained Action Recognition

no code implementations 6 Aug 2023 Hao Tang, Jun Liu, Shuanglin Yan, Rui Yan, Zechao Li, Jinhui Tang

Due to the scarcity of manually annotated data required for fine-grained video understanding, few-shot fine-grained (FS-FG) action recognition has gained significant attention, with the aim of classifying novel fine-grained action categories with only a few labeled instances.

Decision Making Fine-grained Action Recognition +1

Synthetic Instance Segmentation from Semantic Image Segmentation Masks

1 code implementation 2 Aug 2023 Yuchen Shen, Dong Zhang, Yuhui Zheng, Zechao Li, Liyong Fu, Qiaolin Ye

SISeg does not require training a semantic and/or instance segmentation model and avoids the need for instance-level image annotations.

Image Segmentation Instance Segmentation +3

Exploring Effective Factors for Improving Visual In-Context Learning

1 code implementation 10 Apr 2023 Yanpeng Sun, Qiang Chen, Jian Wang, Jingdong Wang, Zechao Li

By doing this, the model can leverage the diverse knowledge stored in different parts of the model to improve its performance on new tasks.

In-Context Learning Meta-Learning +1

ADPS: Asymmetric Distillation Post-Segmentation for Image Anomaly Detection

no code implementations 19 Oct 2022 Peng Xing, Hao Tang, Jinhui Tang, Zechao Li

However, existing KDAD methods suffer from two main limitations: 1) the student network can effortlessly replicate the teacher network's representations, and 2) the features of the teacher network serve solely as a "reference standard" and are not fully leveraged.

Anomaly Detection Knowledge Distillation

Visual Anomaly Detection Via Partition Memory Bank Module and Error Estimation

no code implementations 26 Sep 2022 Peng Xing, Zechao Li

The reconstruction method based on the memory module for visual anomaly detection attempts to narrow the reconstruction error for normal samples while enlarging it for anomalous samples.

Anomaly Detection

Self-Supervised Guided Segmentation Framework for Unsupervised Anomaly Detection

no code implementations 26 Sep 2022 Peng Xing, Yanpeng Sun, Zechao Li

In this paper, a novel Self-Supervised Guided Segmentation Framework (SGSF) is proposed by jointly exploring an effective generation method for forged anomalous samples and the normal sample features as guidance information for segmentation in anomaly detection.

Segmentation Unsupervised Anomaly Detection

Entity-enhanced Adaptive Reconstruction Network for Weakly Supervised Referring Expression Grounding

1 code implementation 18 Jul 2022 Xuejing Liu, Liang Li, Shuhui Wang, Zheng-Jun Zha, Zechao Li, Qi Tian, Qingming Huang

Second, most previous weakly supervised REG methods ignore the discriminative location and context of the referent, causing difficulties in distinguishing the target from other same-category objects.

Attribute Referring Expression +2

Sub-Region Localized Hashing for Fine-Grained Image Retrieval

no code implementations IEEE Transactions on Image Processing 2021 Xinguang Xiang, YaJie Zhang, Lu Jin, Zechao Li, Jinhui Tang

Specifically, to localize diverse local regions, a sub-region localization module is developed to learn discriminative local features by locating the peaks of non-overlap sub-regions in the feature map.

Image Retrieval Retrieval

SSA: Semantic Structure Aware Inference for Weakly Pixel-Wise Dense Predictions without Cost

no code implementations 5 Nov 2021 Yanpeng Sun, Zechao Li

Pixel-wise dense prediction tasks based on weak supervision currently use Class Attention Maps (CAM) to generate pseudo masks as ground truth.

Weakly-Supervised Object Localization Weakly supervised Semantic Segmentation +1

CTNet: Context-based Tandem Network for Semantic Segmentation

1 code implementation 20 Apr 2021 Zechao Li, Yanpeng Sun, Jinhui Tang

Specifically, the Spatial Contextual Module (SCM) is leveraged to uncover the spatial contextual dependency between pixels by exploring the correlation between pixels and categories.

Segmentation Semantic Segmentation

NuI-Go: Recursive Non-Local Encoder-Decoder Network for Retinal Image Non-Uniform Illumination Removal

no code implementations 7 Aug 2020 Chongyi Li, Huazhu Fu, Runmin Cong, Zechao Li, Qianqian Xu

We further demonstrate the advantages of the proposed method for improving the accuracy of retinal vessel segmentation.

Retinal Vessel Segmentation

Data-driven Meta-set Based Fine-Grained Visual Classification

1 code implementation 6 Aug 2020 Chuanyi Zhang, Yazhou Yao, Xiangbo Shu, Zechao Li, Zhenmin Tang, Qi Wu

To this end, we propose a data-driven meta-set based approach to deal with noisy web images for fine-grained recognition.

Classification Fine-Grained Image Classification +3

Face Super-Resolution Guided by 3D Facial Priors

1 code implementation ECCV 2020 Xiaobin Hu, Wenqi Ren, John LaMaster, Xiaochun Cao, Xiaoming Li, Zechao Li, Bjoern Menze, Wei Liu

State-of-the-art face super-resolution methods employ deep convolutional neural networks to learn a mapping between low- and high-resolution facial patterns by exploring local appearance knowledge.

Super-Resolution

Deep Semantic Multimodal Hashing Network for Scalable Image-Text and Video-Text Retrievals

no code implementations 9 Jan 2019 Lu Jin, Zechao Li, Jinhui Tang

In this article, we propose a novel deep semantic multimodal hashing network (DSMHN) for scalable image-text and video-text retrieval.

Cross-Modal Retrieval Deep Hashing +3

Single Image Dehazing via Conditional Generative Adversarial Network

no code implementations CVPR 2018 Runde Li, Jinshan Pan, Zechao Li, Jinhui Tang

In contrast, we solve this problem based on a conditional generative adversarial network (cGAN), where the clear image is estimated by an end-to-end trainable neural network.

Generative Adversarial Network Image Dehazing +1

Deep Ordinal Hashing with Spatial Attention

no code implementations 7 May 2018 Lu Jin, Xiangbo Shu, Kai Li, Zechao Li, Guo-Jun Qi, Jinhui Tang

However, most existing deep hashing methods directly learn the hash functions by encoding the global semantic information, while ignoring the local spatial information of images.

Deep Hashing Image Retrieval

Social Anchor-Unit Graph Regularized Tensor Completion for Large-Scale Image Retagging

no code implementations 12 Apr 2018 Jinhui Tang, Xiangbo Shu, Zechao Li, Yu-Gang Jiang, Qi Tian

Recent approaches simultaneously explore visual, user and tag information to improve the performance of image retagging by constructing and exploring an image-tag-user graph.

Graph Learning TAG

Hardware-Efficient Guided Image Filtering For Multi-Label Problem

no code implementations CVPR 2017 Longquan Dai, Mengke Yuan, Zechao Li, Xiaopeng Zhang, Jinhui Tang

In this paper we propose a hardware-efficient Guided Filter (HGF), which solves the efficiency problem of multichannel guided image filtering and yields competent results when applying it to multi-label problems with synthesized polynomial multichannel guidance.

Modeling Multimodal Clues in a Hybrid Deep Learning Framework for Video Classification

no code implementations 14 Jun 2017 Yu-Gang Jiang, Zuxuan Wu, Jinhui Tang, Zechao Li, Xiangyang Xue, Shih-Fu Chang

More specifically, we utilize three Convolutional Neural Networks (CNNs) operating on appearance, motion and audio signals to extract their corresponding features.

General Classification Video Classification

Personalized Age Progression with Bi-level Aging Dictionary Learning

no code implementations 4 Jun 2017 Xiangbo Shu, Jinhui Tang, Zechao Li, Hanjiang Lai, Liyan Zhang, Shuicheng Yan

Basically, for each age group, we learn an aging dictionary to reveal its aging characteristics (e.g., wrinkles), where the dictionary bases corresponding to the same index yet from two neighboring aging dictionaries form a particular aging pattern across these two age groups, and a linear combination of all these patterns expresses a particular personalized aging process.

Dictionary Learning Face Verification

Concurrence-Aware Long Short-Term Sub-Memories for Person-Person Action Recognition

no code implementations 3 Jun 2017 Xiangbo Shu, Jinhui Tang, Guo-Jun Qi, Yan Song, Zechao Li, Liyan Zhang

To this end, we propose a novel Concurrence-Aware Long Short-Term Sub-Memories (Co-LSTSM) to model the long-term inter-related dynamics between two interacting people on the bounding boxes covering people.

Action Recognition Temporal Action Localization

Weakly-Supervised Dual Clustering for Image Semantic Segmentation

no code implementations CVPR 2013 Yang Liu, Jing Liu, Zechao Li, Jinhui Tang, Hanqing Lu

In this paper, we propose a novel Weakly-Supervised Dual Clustering (WSDC) approach for image semantic segmentation with image-level labels, i.e., collaboratively performing image segmentation and tag alignment with those regions.

Clustering Image Segmentation +4
