Search Results for author: Yabiao Wang

Found 77 papers, 43 papers with code

ADer: A Comprehensive Benchmark for Multi-class Visual Anomaly Detection

1 code implementation • 5 Jun 2024 • Jiangning Zhang, Haoyang He, Zhenye Gan, Qingdong He, Yuxuan Cai, Zhucun Xue, Yabiao Wang, Chengjie Wang, Lei Xie, Yong Liu

This paper addresses this issue by proposing a comprehensive visual anomaly detection benchmark, ADer, which is a modular framework that is highly extensible for new methods.

Anomaly Detection Lesion Detection

AdapNet: Adaptive Noise-Based Network for Low-Quality Image Retrieval

no code implementations • 28 May 2024 • Sihe Zhang, Qingdong He, Jinlong Peng, Yuxi Li, Zhengkai Jiang, Jiafu Wu, Mingmin Chi, Yabiao Wang, Chengjie Wang

To mitigate this issue, we introduce a novel setting for low-quality image retrieval, and propose an Adaptive Noise-Based Network (AdapNet) to learn robust abstract representations.

Image Retrieval Re-Ranking +1

VividPose: Advancing Stable Video Diffusion for Realistic Human Image Animation

no code implementations • 28 May 2024 • Qilin Wang, Zhengkai Jiang, Chengming Xu, Jiangning Zhang, Yabiao Wang, Xinyi Zhang, Yun Cao, Weijian Cao, Chengjie Wang, Yanwei Fu

This enables accurate alignment of pose and shape in the generated videos, providing a robust framework capable of handling a wide range of body shapes and dynamic hand movements.

Image Animation

PointRWKV: Efficient RWKV-Like Model for Hierarchical Point Cloud Learning

no code implementations • 24 May 2024 • Qingdong He, Jiangning Zhang, Jinlong Peng, Haoyang He, Yabiao Wang, Chengjie Wang

Transformers have revolutionized the point cloud learning task, but their quadratic complexity hinders extension to long sequences and places a burden on limited computational resources.

Efficient Multimodal Large Language Models: A Survey

1 code implementation • 17 May 2024 • Yizhang Jin, Jian Li, Yexin Liu, Tianjun Gu, Kai Wu, Zhengkai Jiang, Muyang He, Bo Zhao, Xin Tan, Zhenye Gan, Yabiao Wang, Chengjie Wang, Lizhuang Ma

In the past year, Multimodal Large Language Models (MLLMs) have demonstrated remarkable performance in tasks such as visual question answering, visual understanding and reasoning.

Edge-computing Question Answering +1

MotionMaster: Training-free Camera Motion Transfer For Video Generation

no code implementations • 24 Apr 2024 • Teng Hu, Jiangning Zhang, Ran Yi, Yating Wang, Hongrui Huang, Jieyu Weng, Yabiao Wang, Lizhuang Ma

Furthermore, we propose a few-shot camera motion disentanglement method to extract the common camera motion from multiple videos with similar camera motions, which employs a window-based clustering technique to extract the common features in temporal attention maps of multiple videos.

Disentanglement Motion Disentanglement +2

Single-temporal Supervised Remote Change Detection for Domain Generalization

no code implementations • 17 Apr 2024 • Qiangang Du, Jinlong Peng, Xu Chen, Qingdong He, Liren He, Qiang Nie, Wenbing Zhu, Mingmin Chi, Yabiao Wang, Chengjie Wang

In this paper, we propose a multimodal contrastive learning method (ChangeCLIP) based on visual-language pre-training for change detection domain generalization.

Change Detection Contrastive Learning +1

DMAD: Dual Memory Bank for Real-World Anomaly Detection

no code implementations • 19 Mar 2024 • Jianlong Hu, Xu Chen, Zhenye Gan, Jinlong Peng, Shengchuan Zhang, Jiangning Zhang, Yabiao Wang, Chengjie Wang, Liujuan Cao, Rongrong Ji

To address the challenge of real-world anomaly detection, we propose a new framework named Dual Memory bank enhanced representation learning for Anomaly Detection (DMAD).

Anomaly Detection Representation Learning

Learning Unified Reference Representation for Unsupervised Multi-class Anomaly Detection

no code implementations • 18 Mar 2024 • Liren He, Zhengkai Jiang, Jinlong Peng, Liang Liu, Qiangang Du, Xiaobin Hu, Wenbing Zhu, Mingmin Chi, Yabiao Wang, Chengjie Wang

In the field of multi-class anomaly detection, reconstruction-based methods derived from single-class anomaly detection face the well-known challenge of "learning shortcuts", wherein the model fails to learn the patterns of normal samples as it should, opting instead for shortcuts such as identity mapping or artificial noise elimination.

Anomaly Detection

Dual-path Frequency Discriminators for Few-shot Anomaly Detection

no code implementations • 7 Mar 2024 • Yuhu Bai, Jiangning Zhang, Yuhang Dong, Guanzhong Tian, Liang Liu, Yunkang Cao, Yabiao Wang, Chengjie Wang

We treat anomaly detection as a discriminative classification problem, so a dual-path feature discrimination module is employed to detect and locate image-level and feature-level anomalies in the feature space.

Anomaly Detection

Self-supervised Feature Adaptation for 3D Industrial Anomaly Detection

no code implementations • 6 Jan 2024 • Yuanpeng Tu, Boshen Zhang, Liang Liu, Yuxi Li, Xuhai Chen, Jiangning Zhang, Yabiao Wang, Chengjie Wang, Cai Rong Zhao

Industrial anomaly detection is generally addressed as an unsupervised task that aims at locating defects with only normal training samples.

Anomaly Detection

Density Matters: Improved Core-set for Active Domain Adaptive Segmentation

no code implementations • 15 Dec 2023 • Shizhan Liu, Zhengkai Jiang, Yuxi Li, Jinlong Peng, Yabiao Wang, Weiyao Lin

Active domain adaptation has emerged as a solution to balance the expensive annotation cost and the performance of trained models in semantic segmentation.

Domain Adaptation Semantic Segmentation

AnomalyDiffusion: Few-Shot Anomaly Image Generation with Diffusion Model

1 code implementation • 10 Dec 2023 • Teng Hu, Jiangning Zhang, Ran Yi, Yuzhen Du, Xu Chen, Liang Liu, Yabiao Wang, Chengjie Wang

Existing anomaly inspection methods are limited in their performance due to insufficient anomaly data.

Image Generation

GPT-4V-AD: Exploring Grounding Potential of VQA-oriented GPT-4V for Zero-shot Anomaly Detection

1 code implementation • 5 Nov 2023 • Jiangning Zhang, Haoyang He, Xuhai Chen, Zhucun Xue, Yabiao Wang, Chengjie Wang, Lei Xie, Yong Liu

Large Multimodal Model (LMM) GPT-4V(ision) endows GPT-4 with visual grounding capabilities, making it possible to handle certain tasks through the Visual Question Answering (VQA) paradigm.

Anomaly Detection Question Answering +3

Stroke-based Neural Painting and Stylization with Dynamically Predicted Painting Region

2 code implementations • 7 Sep 2023 • Teng Hu, Ran Yi, Haokun Zhu, Liang Liu, Jinlong Peng, Yabiao Wang, Chengjie Wang, Lizhuang Ma

To solve the problem, we propose Compositional Neural Painter, a novel stroke-based rendering framework which dynamically predicts the next painting region based on the current canvas, instead of dividing the image plane uniformly into painting regions.

Style Transfer

Toward High Quality Facial Representation Learning

1 code implementation • 7 Sep 2023 • Yue Wang, Jinlong Peng, Jiangning Zhang, Ran Yi, Liang Liu, Yabiao Wang, Chengjie Wang

To improve the facial representation quality, we use feature map of a pre-trained visual backbone as a supervision item and use a partially pre-trained decoder for mask image modeling.

Contrastive Learning Decoder +3

Phasic Content Fusing Diffusion Model with Directional Distribution Consistency for Few-Shot Model Adaption

1 code implementation • ICCV 2023 • Teng Hu, Jiangning Zhang, Liang Liu, Ran Yi, Siqi Kou, Haokun Zhu, Xu Chen, Yabiao Wang, Chengjie Wang, Lizhuang Ma

To address these problems, we propose a novel phasic content fusing few-shot diffusion model with directional distribution consistency loss, which targets different learning objectives at distinct training stages of the diffusion model.

Domain Adaptation

PVG: Progressive Vision Graph for Vision Recognition

no code implementations • 1 Aug 2023 • Jiafu Wu, Jian Li, Jiangning Zhang, Boshen Zhang, Mingmin Chi, Yabiao Wang, Chengjie Wang

Convolution-based and Transformer-based vision backbone networks process images into the grid or sequence structures, respectively, which are inflexible for capturing irregular objects.

graph construction

RFENet: Towards Reciprocal Feature Evolution for Glass Segmentation

1 code implementation • 12 Jul 2023 • Ke Fan, Changan Wang, Yabiao Wang, Chengjie Wang, Ran Yi, Lizhuang Ma

Glass-like objects are widespread in daily life but remain difficult for most existing methods to segment.

Semantic Segmentation

Align, Perturb and Decouple: Toward Better Leverage of Difference Information for RSI Change Detection

1 code implementation • 30 May 2023 • Supeng Wang, Yuxi Li, Ming Xie, Mingmin Chi, Yabiao Wang, Chengjie Wang, Wenbing Zhu

In this paper, we revisit the importance of feature difference for change detection in RSI, and propose a series of operations to fully exploit the difference information: Alignment, Perturbation and Decoupling (APD).

Change Detection Decoder

Dual Path Transformer with Partition Attention

no code implementations • 24 May 2023 • Zhengkai Jiang, Liang Liu, Jiangning Zhang, Yabiao Wang, Mingang Chen, Chengjie Wang

This paper introduces a novel attention mechanism, called dual attention, which is both efficient and effective.

Image Classification object-detection +2

Learning Global-aware Kernel for Image Harmonization

no code implementations • ICCV 2023 • Xintian Shen, Jiangning Zhang, Jun Chen, Shipeng Bai, Yue Han, Yabiao Wang, Chengjie Wang, Yong Liu

To address this issue, we propose a novel Global-aware Kernel Network (GKNet) to harmonize local regions with comprehensive consideration of long-distance background references.

Image Harmonization

Transavs: End-To-End Audio-Visual Segmentation With Transformer

no code implementations • 12 May 2023 • Yuhang Ling, Yuxi Li, Zhenye Gan, Jiangning Zhang, Mingmin Chi, Yabiao Wang

Generally AVS faces two key challenges: (1) Audio signals inherently exhibit a high degree of information density, as sounds produced by multiple objects are entangled within the same audio stream; (2) Objects of the same category tend to produce similar audio signals, making it difficult to distinguish between them and thus leading to unclear segmentation results.

Scene Understanding Segmentation +1

Calibrated Teacher for Sparsely Annotated Object Detection

1 code implementation • 14 Mar 2023 • Haohan Wang, Liang Liu, Boshen Zhang, Jiangning Zhang, Wuhao Zhang, Zhenye Gan, Yabiao Wang, Chengjie Wang, Haoqian Wang

Recent works on sparsely annotated object detection alleviate this problem by generating pseudo labels for the missing annotations.

Object object-detection +2

Multimodal Industrial Anomaly Detection via Hybrid Fusion

1 code implementation • CVPR 2023 • Yue Wang, Jinlong Peng, Jiangning Zhang, Ran Yi, Yabiao Wang, Chengjie Wang

2D-based Industrial Anomaly Detection has been widely discussed; however, multimodal industrial anomaly detection based on 3D point clouds and RGB images still has many untouched fields.

Ranked #3 on RGB+3D Anomaly Detection and Segmentation on MVTEC 3D-AD (using extra training data)

Contrastive Learning RGB+3D Anomaly Detection and Segmentation

Self-Supervised Likelihood Estimation with Energy Guidance for Anomaly Segmentation in Urban Scenes

1 code implementation • 14 Feb 2023 • Yuanpeng Tu, Yuxi Li, Boshen Zhang, Liang Liu, Jiangning Zhang, Yabiao Wang, Cai Rong Zhao

Based on the proposed estimators, we devise an adaptive self-supervised training framework, which exploits the contextual reliance and estimated likelihood to refine mask annotations in anomaly areas.

Anomaly Detection Autonomous Driving

Learning with Noisy labels via Self-supervised Adversarial Noisy Masking

1 code implementation • CVPR 2023 • Yuanpeng Tu, Boshen Zhang, Yuxi Li, Liang Liu, Jian Li, Jiangning Zhang, Yabiao Wang, Chengjie Wang, Cai Rong Zhao

Collecting large-scale datasets is crucial for training deep models; annotating the data, however, inevitably yields noisy labels, which poses challenges to deep learning algorithms.

Ranked #2 on Image Classification on Clothing1M (using extra training data)

Learning with noisy labels

Exploring Efficient Few-shot Adaptation for Vision Transformers

1 code implementation • 6 Jan 2023 • Chengming Xu, Siqian Yang, Yabiao Wang, Zhanxiong Wang, Yanwei Fu, Xiangyang Xue

Essentially, although ViTs have been shown to enjoy comparable or even better performance on other vision tasks, it is still highly nontrivial to efficiently finetune ViTs in real-world FSL scenarios.

Few-Shot Learning

Rethinking Mobile Block for Efficient Attention-based Models

1 code implementation • ICCV 2023 • Jiangning Zhang, Xiangtai Li, Jian Li, Liang Liu, Zhucun Xue, Boshen Zhang, Zhengkai Jiang, Tianxin Huang, Yabiao Wang, Chengjie Wang

This paper focuses on developing modern, efficient, lightweight models for dense predictions while trading off parameters, FLOPs, and performance.

Unity

Reference Twice: A Simple and Unified Baseline for Few-Shot Instance Segmentation

1 code implementation • 3 Jan 2023 • Yue Han, Jiangning Zhang, Zhucun Xue, Chao Xu, Xintian Shen, Yabiao Wang, Chengjie Wang, Yong Liu, Xiangtai Li

In this work, we explore a simple yet unified solution for FSIS as well as its incremental variants, and introduce a new framework named Reference Twice (RefT) to fully explore the relationship between support/query features based on a Transformer-like framework.

Benchmarking Few-Shot Object Detection +3

Split-PU: Hardness-aware Training Strategy for Positive-Unlabeled Learning

1 code implementation • 30 Nov 2022 • Chengming Xu, Chen Liu, Siqian Yang, Yabiao Wang, Shijie Zhang, Lijie Jia, Yanwei Fu

Since only some of the most confident positive samples are available and the evidence is not enough to categorize the remaining samples, many of these unlabeled data may also be positive samples.

Binary Classification

PatchMix Augmentation to Identify Causal Features in Few-shot Learning

no code implementations • 29 Nov 2022 • Chengming Xu, Chen Liu, Xinwei Sun, Siqian Yang, Yabiao Wang, Chengjie Wang, Yanwei Fu

We theoretically show that such an augmentation mechanism, different from existing ones, is able to identify the causal features.

Data Augmentation Few-Shot Learning +1

Learning from Noisy Labels with Coarse-to-Fine Sample Credibility Modeling

no code implementations • 23 Aug 2022 • Boshen Zhang, Yuxi Li, Yuanpeng Tu, Jinlong Peng, Yabiao Wang, Cunlin Wu, Yang Xiao, Cairong Zhao

Specifically, for the clean set, we deliberately design a memory-based modulation scheme to dynamically adjust the contribution of each sample in terms of its historical credibility sequence during training, thus alleviating the effect from noisy samples incorrectly grouped into the clean set.

Denoising Image Classification

EATFormer: Improving Vision Transformer Inspired by Evolutionary Algorithm

1 code implementation • 19 Jun 2022 • Jiangning Zhang, Xiangtai Li, Yabiao Wang, Chengjie Wang, Yibo Yang, Yong Liu, DaCheng Tao

Motivated by biological evolution, this paper explains the rationality of Vision Transformer by analogy with the proven practical Evolutionary Algorithm (EA) and derives that both have consistent mathematical formulation.

Image Classification

FRIH: Fine-grained Region-aware Image Harmonization

no code implementations • 13 May 2022 • Jinlong Peng, Zekun Luo, Liang Liu, Boshen Zhang, Tao Wang, Yabiao Wang, Ying Tai, Chengjie Wang, Weiyao Lin

Image harmonization aims to generate a more realistic appearance of foreground and background for a composite image.

Decoder Image Harmonization

Learning Distinctive Margin toward Active Domain Adaptation

1 code implementation • CVPR 2022 • Ming Xie, Yuxi Li, Yabiao Wang, Zekun Luo, Zhenye Gan, Zhongyi Sun, Mingmin Chi, Chengjie Wang, Pei Wang

Despite many efforts to improve domain adaptation (DA) ability under unsupervised or few-shot semi-supervised settings, active learning has recently started to attract more attention because it suits transferring models in a more practical way with limited annotation resources on target data.

Active Learning Domain Adaptation

ASFD: Automatic and Scalable Face Detector

no code implementations • 26 Jan 2022 • Jian Li, Bin Zhang, Yabiao Wang, Ying Tai, Zhenyu Zhang, Chengjie Wang, Jilin Li, Xiaoming Huang, Yili Xia

Along with current multi-scale based detectors, Feature Aggregation and Enhancement (FAE) modules have shown superior performance gains for cutting-edge object detection.

Face Detection object-detection +1

SCSNet: An Efficient Paradigm for Learning Simultaneously Image Colorization and Super-Resolution

no code implementations • 12 Jan 2022 • Jiangning Zhang, Chao Xu, Jian Li, Yue Han, Yabiao Wang, Ying Tai, Yong Liu

In the practical application of restoring low-resolution gray-scale images, we generally need to run three separate processes of image colorization, super-resolution, and down-sampling operation for the target device.

Colorization Image Colorization +1

LCTR: On Awakening the Local Continuity of Transformer for Weakly Supervised Object Localization

no code implementations • 10 Dec 2021 • Zhiwei Chen, Changan Wang, Yabiao Wang, Guannan Jiang, Yunhang Shen, Ying Tai, Chengjie Wang, Wei Zhang, Liujuan Cao

In this paper, we propose a novel framework built upon the transformer, termed LCTR (Local Continuity TRansformer), which aims to enhance the local perception capability of global features among long-range feature dependencies.

Inductive Bias Object +1

Robust Learning with Adaptive Sample Credibility Modeling

no code implementations • 29 Sep 2021 • Boshen Zhang, Yuxi Li, Yuanpeng Tu, Yabiao Wang, Yang Xiao, Cai Rong Zhao, Chengjie Wang

For the clean set, we deliberately design a memory-based modulation scheme to dynamically adjust the contribution of each sample in terms of its historical credibility sequence during training, thus alleviating the effect of potential hard noisy samples in the clean set.

Denoising

Uniformity in Heterogeneity: Diving Deep into Count Interval Partition for Crowd Counting

3 code implementations • 27 Jul 2021 • Changan Wang, Qingyu Song, Boshen Zhang, Yabiao Wang, Ying Tai, Xuyi Hu, Chengjie Wang, Jilin Li, Jiayi Ma, Yang Wu

Therefore, we propose a novel count interval partition criterion called Uniform Error Partition (UEP), which always keeps the expected counting error contributions equal for all intervals to minimize the prediction risk.

Crowd Counting Quantization

NMS-Loss: Learning with Non-Maximum Suppression for Crowded Pedestrian Detection

1 code implementation • 4 Jun 2021 • Zekun Luo, Zheng Fang, Sixiao Zheng, Yabiao Wang, Yanwei Fu

Non-Maximum Suppression (NMS) is essential for object detection and affects the evaluation results by incorporating False Positives (FP) and False Negatives (FN), especially in crowd occlusion scenes.

object-detection Object Detection +1

Analogous to Evolutionary Algorithm: Designing a Unified Sequence Model

1 code implementation • NeurIPS 2021 • Jiangning Zhang, Chao Xu, Jian Li, Wenzhou Chen, Yabiao Wang, Ying Tai, Shuo Chen, Chengjie Wang, Feiyue Huang, Yong Liu

Inspired by biological evolution, we explain the rationality of Vision Transformer by analogy with the proven practical Evolutionary Algorithm (EA) and derive that both of them have consistent mathematical representation.

Image Retrieval Retrieval

SiamRCR: Reciprocal Classification and Regression for Visual Object Tracking

no code implementations • 24 May 2021 • Jinlong Peng, Zhengkai Jiang, Yueyang Gu, Yang Wu, Yabiao Wang, Ying Tai, Chengjie Wang, Weiyao Lin

In addition, we add a localization branch to predict the localization accuracy, so that it can serve as a replacement for the regression assistance link during inference.

Classification Object +2

Learning Comprehensive Motion Representation for Action Recognition

no code implementations • 23 Mar 2021 • Mingyu Wu, Boyuan Jiang, Donghao Luo, Junchi Yan, Yabiao Wang, Ying Tai, Chengjie Wang, Jilin Li, Feiyue Huang, Xiaokang Yang

For action recognition learning, 2D CNN-based methods are efficient but may yield redundant features due to applying the same 2D convolution kernel to each frame.

Action Recognition

Uniformity in Heterogeneity: Diving Deep Into Count Interval Partition for Crowd Counting

1 code implementation • ICCV 2021 • Changan Wang, Qingyu Song, Boshen Zhang, Yabiao Wang, Ying Tai, Xuyi Hu, Chengjie Wang, Jilin Li, Jiayi Ma, Yang Wu

Therefore, we propose a novel count interval partition criterion called Uniform Error Partition (UEP), which always keeps the expected counting error contributions equal for all intervals to minimize the prediction risk.

Crowd Counting Quantization

Chained-Tracker: Chaining Paired Attentive Regression Results for End-to-End Joint Multiple-Object Detection and Tracking

1 code implementation • ECCV 2020 • Jinlong Peng, Changan Wang, Fangbin Wan, Yang Wu, Yabiao Wang, Ying Tai, Chengjie Wang, Jilin Li, Feiyue Huang, Yanwei Fu

Existing Multiple-Object Tracking (MOT) methods either follow the tracking-by-detection paradigm to conduct object detection, feature extraction and data association separately, or have two of the three subtasks integrated to form a partially end-to-end solution.

Multiple Object Tracking Object +3

Temporal Distinct Representation Learning for Action Recognition

no code implementations • ECCV 2020 • Junwu Weng, Donghao Luo, Yabiao Wang, Ying Tai, Chengjie Wang, Jilin Li, Feiyue Huang, Xudong Jiang, Junsong Yuan

Motivated by the previous success of Two-Dimensional Convolutional Neural Network (2D CNN) on image recognition, researchers endeavor to leverage it to characterize videos.

Action Recognition Representation Learning

ACFD: Asymmetric Cartoon Face Detector

2 code implementations • 2 Jul 2020 • Bin Zhang, Jian Li, Yabiao Wang, Zhipeng Cui, Yili Xia, Chengjie Wang, Jilin Li, Feiyue Huang

Cartoon face detection is a more challenging task than human face detection because many difficult scenarios are involved.

Binary Classification Face Detection

ASFD: Automatic and Scalable Face Detector

no code implementations • 25 Mar 2020 • Bin Zhang, Jian Li, Yabiao Wang, Ying Tai, Chengjie Wang, Jilin Li, Feiyue Huang, Yili Xia, Wenjiang Pei, Rongrong Ji

In this paper, we propose a novel Automatic and Scalable Face Detector (ASFD), which is based on a combination of neural architecture search techniques as well as a new loss design.

Neural Architecture Search

TEINet: Towards an Efficient Architecture for Video Recognition

no code implementations • 21 Nov 2019 • Zhao-Yang Liu, Donghao Luo, Yabiao Wang, Li-Min Wang, Ying Tai, Chengjie Wang, Jilin Li, Feiyue Huang, Tong Lu

To relieve this problem, we propose an efficient temporal module, termed the Temporal Enhancement-and-Interaction (TEI) module, which can be plugged into existing 2D CNNs (denoted by TEINet).

Action Recognition Video Recognition

Fast Learning of Temporal Action Proposal via Dense Boundary Generator

3 code implementations • 11 Nov 2019 • Chuming Lin, Jian Li, Yabiao Wang, Ying Tai, Donghao Luo, Zhipeng Cui, Chengjie Wang, Jilin Li, Feiyue Huang, Rongrong Ji

In this paper, we propose an efficient and unified framework to generate temporal action proposals named Dense Boundary Generator (DBG), which draws inspiration from boundary-sensitive methods and implements boundary classification and action completeness regression for densely distributed proposals.

General Classification Optical Flow Estimation +2

DSFD: Dual Shot Face Detector

4 code implementations • CVPR 2019 • Jian Li, Yabiao Wang, Changan Wang, Ying Tai, Jianjun Qian, Jian Yang, Chengjie Wang, Jilin Li, Feiyue Huang

In this paper, we propose a novel face detection network with three novel contributions that address three key aspects of face detection, including better feature learning, progressive loss design and anchor assign based data augmentation, respectively.

Data Augmentation Occluded Face Detection
