Search Results for author: Jiangning Zhang

Found 59 papers, 32 papers with code

Learning Feature Inversion for Multi-class Anomaly Detection under General-purpose COCO-AD Benchmark

1 code implementation 16 Apr 2024 Jiangning Zhang, Chengjie Wang, Xiangtai Li, Guanzhong Tian, Zhucun Xue, Yong liu, Guansong Pang, DaCheng Tao

Moreover, current metrics such as AU-ROC have nearly reached saturation on simple datasets, which prevents a comprehensive evaluation of different methods.

Anomaly Detection object-detection +2

Deepfake Generation and Detection: A Benchmark and Survey

1 code implementation 26 Mar 2024 Gan Pei, Jiangning Zhang, Menghan Hu, Zhenyu Zhang, Chengjie Wang, Yunsheng Wu, Guangtao Zhai, Jian Yang, Chunhua Shen, DaCheng Tao

Deepfake is a technology dedicated to creating highly realistic facial images and videos under specific conditions, which has significant application potential in fields such as entertainment, movie production, and digital human creation.

Attribute Face Reenactment +2

DiffFAE: Advancing High-fidelity One-shot Facial Appearance Editing with Space-sensitive Customization and Semantic Preservation

no code implementations 26 Mar 2024 Qilin Wang, Jiangning Zhang, Chengming Xu, Weijian Cao, Ying Tai, Yue Han, Yanhao Ge, Hong Gu, Chengjie Wang, Yanwei Fu

Facial Appearance Editing (FAE) aims to modify physical attributes of human facial images, such as pose, expression and lighting, while preserving attributes like identity and background, and is of great importance in photography.

Attribute Semantic Composition

DMAD: Dual Memory Bank for Real-World Anomaly Detection

no code implementations 19 Mar 2024 Jianlong Hu, Xu Chen, Zhenye Gan, Jinlong Peng, Shengchuan Zhang, Jiangning Zhang, Yabiao Wang, Chengjie Wang, Liujuan Cao, Rongrong Ji

To address the challenge of real-world anomaly detection, we propose a new framework named Dual Memory bank enhanced representation learning for Anomaly Detection (DMAD).

Anomaly Detection Representation Learning

DiffuMatting: Synthesizing Arbitrary Objects with Matting-level Annotation

no code implementations 10 Mar 2024 Xiaobin Hu, Xu Peng, Donghao Luo, Xiaozhong Ji, Jinlong Peng, Zhengkai Jiang, Jiangning Zhang, Taisong Jin, Chengjie Wang, Rongrong Ji

Our DiffuMatting shows several potential applications (e.g., matting-data generator, community-friendly art design and controllable generation).

Image Matting Object

Dual-path Frequency Discriminators for Few-shot Anomaly Detection

no code implementations 7 Mar 2024 Yuhu Bai, Jiangning Zhang, Yuhang Dong, Guanzhong Tian, Liang Liu, Yunkang Cao, Yabiao Wang, Chengjie Wang

We consider anomaly detection as a discriminative classification problem, and therefore employ a dual-path feature discrimination module to detect and locate image-level and feature-level anomalies in the feature space.

Anomaly Detection

A Survey on Visual Anomaly Detection: Challenge, Approach, and Prospect

no code implementations 29 Jan 2024 Yunkang Cao, Xiaohao Xu, Jiangning Zhang, Yuqi Cheng, Xiaonan Huang, Guansong Pang, Weiming Shen

Visual Anomaly Detection (VAD) endeavors to pinpoint deviations from the concept of normality in visual data and is widely applied across diverse domains, e.g., industrial defect inspection and medical lesion detection.

Anomaly Detection Lesion Detection

Towards Language-Driven Video Inpainting via Multimodal Large Language Models

no code implementations 18 Jan 2024 Jianzong Wu, Xiangtai Li, Chenyang Si, Shangchen Zhou, Jingkang Yang, Jiangning Zhang, Yining Li, Kai Chen, Yunhai Tong, Ziwei Liu, Chen Change Loy

We introduce a new task -- language-driven video inpainting, which uses natural language instructions to guide the inpainting process.

Video Inpainting

Self-supervised Feature Adaptation for 3D Industrial Anomaly Detection

no code implementations 6 Jan 2024 Yuanpeng Tu, Boshen Zhang, Liang Liu, Yuxi Li, Xuhai Chen, Jiangning Zhang, Yabiao Wang, Chengjie Wang, Cai Rong Zhao

Industrial anomaly detection is generally addressed as an unsupervised task that aims at locating defects with only normal training samples.

Anomaly Detection

A Generalist FaceX via Learning Unified Facial Representation

1 code implementation 31 Dec 2023 Yue Han, Jiangning Zhang, Junwei Zhu, Xiangtai Li, Yanhao Ge, Wei Li, Chengjie Wang, Yong liu, Xiaoming Liu, Ying Tai

This work presents the FaceX framework, a novel facial generalist model capable of handling diverse facial tasks simultaneously.

Facial Editing

AnomalyDiffusion: Few-Shot Anomaly Image Generation with Diffusion Model

1 code implementation 10 Dec 2023 Teng Hu, Jiangning Zhang, Ran Yi, Yuzhen Du, Xu Chen, Liang Liu, Yabiao Wang, Chengjie Wang

Existing anomaly inspection methods are limited in their performance due to insufficient anomaly data.

Image Generation

GPT-4V-AD: Exploring Grounding Potential of VQA-oriented GPT-4V for Zero-shot Anomaly Detection

1 code implementation 5 Nov 2023 Jiangning Zhang, Haoyang He, Xuhai Chen, Zhucun Xue, Yabiao Wang, Chengjie Wang, Lei Xie, Yong liu

Large Multimodal Model (LMM) GPT-4V(ision) endows GPT-4 with visual grounding capabilities, making it possible to handle certain tasks through the Visual Question Answering (VQA) paradigm.

Anomaly Detection Question Answering +3

Toward High Quality Facial Representation Learning

1 code implementation 7 Sep 2023 Yue Wang, Jinlong Peng, Jiangning Zhang, Ran Yi, Liang Liu, Yabiao Wang, Chengjie Wang

To improve the facial representation quality, we use the feature map of a pre-trained visual backbone as a supervision item and use a partially pre-trained decoder for mask image modeling.

Contrastive Learning Face Alignment +2

Phasic Content Fusing Diffusion Model with Directional Distribution Consistency for Few-Shot Model Adaption

1 code implementation ICCV 2023 Teng Hu, Jiangning Zhang, Liang Liu, Ran Yi, Siqi Kou, Haokun Zhu, Xu Chen, Yabiao Wang, Chengjie Wang, Lizhuang Ma

To address these problems, we propose a novel phasic content fusing few-shot diffusion model with directional distribution consistency loss, which targets different learning objectives at distinct training stages of the diffusion model.

Domain Adaptation

PVG: Progressive Vision Graph for Vision Recognition

no code implementations 1 Aug 2023 Jiafu Wu, Jian Li, Jiangning Zhang, Boshen Zhang, Mingmin Chi, Yabiao Wang, Chengjie Wang

Convolution-based and Transformer-based vision backbone networks process images into the grid or sequence structures, respectively, which are inflexible for capturing irregular objects.

graph construction

APRIL-GAN: A Zero-/Few-Shot Anomaly Classification and Segmentation Method for CVPR 2023 VAND Workshop Challenge Tracks 1&2: 1st Place on Zero-shot AD and 4th Place on Few-shot AD

2 code implementations 27 May 2023 Xuhai Chen, Yue Han, Jiangning Zhang

In this challenge, our method achieved first place in the zero-shot track, especially excelling in segmentation with an impressive F1 score improvement of 0.0489 over the second-ranked participant.

Anomaly Classification Novelty Detection

Dual Path Transformer with Partition Attention

no code implementations 24 May 2023 Zhengkai Jiang, Liang Liu, Jiangning Zhang, Yabiao Wang, Mingang Chen, Chengjie Wang

This paper introduces a novel attention mechanism, called dual attention, which is both efficient and effective.

Image Classification object-detection +2

Learning Global-aware Kernel for Image Harmonization

no code implementations ICCV 2023 Xintian Shen, Jiangning Zhang, Jun Chen, Shipeng Bai, Yue Han, Yabiao Wang, Chengjie Wang, Yong liu

To address this issue, we propose a novel Global-aware Kernel Network (GKNet) to harmonize local regions with comprehensive consideration of long-distance background references.

Image Harmonization

TransAVS: End-To-End Audio-Visual Segmentation With Transformer

no code implementations 12 May 2023 Yuhang Ling, Yuxi Li, Zhenye Gan, Jiangning Zhang, Mingmin Chi, Yabiao Wang

Generally, AVS faces two key challenges: (1) Audio signals inherently exhibit a high degree of information density, as sounds produced by multiple objects are entangled within the same audio stream; (2) Objects of the same category tend to produce similar audio signals, making it difficult to distinguish between them and thus leading to unclear segmentation results.

Scene Understanding Segmentation +1

High-fidelity Generalized Emotional Talking Face Generation with Multi-modal Emotion Space Learning

no code implementations CVPR 2023 Chao Xu, Junwei Zhu, Jiangning Zhang, Yue Han, Wenqing Chu, Ying Tai, Chengjie Wang, Zhifeng Xie, Yong liu

Specifically, we supplement the emotion style in text prompts and use an Aligned Multi-modal Emotion encoder to embed the text, image, and audio emotion modality into a unified space, which inherits rich semantic prior from CLIP.

Talking Face Generation

Multimodal-driven Talking Face Generation via a Unified Diffusion-based Generator

no code implementations 4 May 2023 Chao Xu, Shaoting Zhu, Junwei Zhu, Tianxin Huang, Jiangning Zhang, Ying Tai, Yong liu

More specifically, given a textured face as the source and the rendered face projected from the desired 3DMM coefficients as the target, our proposed Texture-Geometry-aware Diffusion Model decomposes the complex transfer problem into a multi-conditional denoising process, where a Texture Attention-based module accurately models the correspondences between appearance and geometry cues contained in the source and target conditions, and incorporates extra implicit information for high-fidelity talking face generation.

Denoising Face Swapping +1

Calibrated Teacher for Sparsely Annotated Object Detection

1 code implementation 14 Mar 2023 Haohan Wang, Liang Liu, Boshen Zhang, Jiangning Zhang, Wuhao Zhang, Zhenye Gan, Yabiao Wang, Chengjie Wang, Haoqian Wang

Recent works on sparsely annotated object detection alleviate this problem by generating pseudo labels for the missing annotations.

Object object-detection +2

Multimodal Industrial Anomaly Detection via Hybrid Fusion

1 code implementation CVPR 2023 Yue Wang, Jinlong Peng, Jiangning Zhang, Ran Yi, Yabiao Wang, Chengjie Wang

2D-based Industrial Anomaly Detection has been widely discussed; however, multimodal industrial anomaly detection based on 3D point clouds and RGB images still has many untouched fields.

Ranked #3 on RGB+3D Anomaly Detection and Segmentation on MVTEC 3D-AD (using extra training data)

Contrastive Learning RGB+3D Anomaly Detection and Segmentation

Learning with Noisy labels via Self-supervised Adversarial Noisy Masking

1 code implementation CVPR 2023 Yuanpeng Tu, Boshen Zhang, Yuxi Li, Liang Liu, Jian Li, Jiangning Zhang, Yabiao Wang, Chengjie Wang, Cai Rong Zhao

Collecting large-scale datasets is crucial for training deep models; annotating the data, however, inevitably yields noisy labels, which poses challenges to deep learning algorithms.

Ranked #2 on Image Classification on Clothing1M (using extra training data)

Learning with noisy labels

Self-Supervised Likelihood Estimation with Energy Guidance for Anomaly Segmentation in Urban Scenes

1 code implementation 14 Feb 2023 Yuanpeng Tu, Yuxi Li, Boshen Zhang, Liang Liu, Jiangning Zhang, Yabiao Wang, Cai Rong Zhao

Based on the proposed estimators, we devise an adaptive self-supervised training framework, which exploits the contextual reliance and estimated likelihood to refine mask annotations in anomaly areas.

Anomaly Detection Autonomous Driving

Rethinking Mobile Block for Efficient Attention-based Models

1 code implementation ICCV 2023 Jiangning Zhang, Xiangtai Li, Jian Li, Liang Liu, Zhucun Xue, Boshen Zhang, Zhengkai Jiang, Tianxin Huang, Yabiao Wang, Chengjie Wang

This paper focuses on developing modern, efficient, lightweight models for dense predictions while trading off parameters, FLOPs, and performance.

Unity

Reference Twice: A Simple and Unified Baseline for Few-Shot Instance Segmentation

1 code implementation 3 Jan 2023 Yue Han, Jiangning Zhang, Zhucun Xue, Chao Xu, Xintian Shen, Yabiao Wang, Chengjie Wang, Yong liu, Xiangtai Li

In this work, we explore a simple yet unified solution for FSIS as well as its incremental variants, and introduce a new framework named Reference Twice (RefT) to fully explore the relationship between support/query features based on a Transformer-like framework.

Benchmarking Few-Shot Object Detection +3

Learning To Measure the Point Cloud Reconstruction Loss in a Representation Space

no code implementations CVPR 2023 Tianxin Huang, Zhonggan Ding, Jiangning Zhang, Ying Tai, Zhenyu Zhang, Mingang Chen, Chengjie Wang, Yong liu

Specifically, we use the contrastive constraint to help CALoss learn a representation space with shape similarity, while we introduce the adversarial strategy to help CALoss mine differences between reconstructed results and ground truths.

Point cloud reconstruction

EATFormer: Improving Vision Transformer Inspired by Evolutionary Algorithm

1 code implementation 19 Jun 2022 Jiangning Zhang, Xiangtai Li, Yabiao Wang, Chengjie Wang, Yibo Yang, Yong liu, DaCheng Tao

Motivated by biological evolution, this paper explains the rationality of Vision Transformer by analogy with the proven practical Evolutionary Algorithm (EA) and derives that both have consistent mathematical formulation.

Image Classification

Region-Aware Face Swapping

no code implementations CVPR 2022 Chao Xu, Jiangning Zhang, Miao Hua, Qian He, Zili Yi, Yong liu

This paper presents a novel Region-Aware Face Swapping (RAFSwap) network to achieve identity-consistent harmonious high-resolution face generation in a local-global manner: 1) Local Facial Region-Aware (FRA) branch augments local identity-relevant features by introducing the Transformer to effectively model misaligned cross-scale semantic interaction.

Face Generation Face Swapping +1

Omni-frequency Channel-selection Representations for Unsupervised Anomaly Detection

1 code implementation 1 Mar 2022 Yufei Liang, Jiangning Zhang, Shiwei Zhao, Runze Wu, Yong liu, Shuwen Pan

Density-based and classification-based methods have dominated unsupervised anomaly detection in recent years, while reconstruction-based methods are rarely mentioned because of their poor reconstruction ability and low performance.

Unsupervised Anomaly Detection

SCSNet: An Efficient Paradigm for Learning Simultaneously Image Colorization and Super-Resolution

no code implementations 12 Jan 2022 Jiangning Zhang, Chao Xu, Jian Li, Yue Han, Yabiao Wang, Ying Tai, Yong liu

In the practical application of restoring low-resolution gray-scale images, we generally need to run three separate processes of image colorization, super-resolution, and down-sampling operation for the target device.

Colorization Image Colorization +1

SelFSR: Self-Conditioned Face Super-Resolution in the Wild via Flow Field Degradation Network

no code implementations 20 Dec 2021 Xianfang Zeng, Jiangning Zhang, Liang Liu, Guangzhong Tian, Yong liu

To tackle this problem, we propose a novel domain-adaptive degradation network for face super-resolution in the wild.

Super-Resolution

Riemannian Manifold Embeddings for Straight-Through Estimator

no code implementations 29 Sep 2021 Jun Chen, Hanwen Chen, Jiangning Zhang, Yuang Liu, Tianxin Huang, Yong liu

Quantized Neural Networks (QNNs) aim at replacing full-precision weights $\boldsymbol{W}$ with quantized weights $\boldsymbol{\hat{W}}$, which makes it possible to deploy large models to mobile and miniaturized devices easily.

Quantization

Analogous to Evolutionary Algorithm: Designing a Unified Sequence Model

1 code implementation NeurIPS 2021 Jiangning Zhang, Chao Xu, Jian Li, Wenzhou Chen, Yabiao Wang, Ying Tai, Shuo Chen, Chengjie Wang, Feiyue Huang, Yong liu

Inspired by biological evolution, we explain the rationality of Vision Transformer by analogy with the proven practical Evolutionary Algorithm (EA) and derive that both of them have consistent mathematical representation.

Image Retrieval Retrieval

RFNet: Recurrent Forward Network for Dense Point Cloud Completion

no code implementations ICCV 2021 Tianxin Huang, Hao Zou, Jinhao Cui, Xuemeng Yang, Mengmeng Wang, Xiangrui Zhao, Jiangning Zhang, Yi Yuan, Yifan Xu, Yong liu

The RFE extracts multiple global features from the incomplete point clouds for different recurrent levels, and the FDC generates point clouds in a coarse-to-fine pipeline.

Point Cloud Completion

Optimizing Quantized Neural Networks with Natural Gradient

no code implementations 1 Jan 2021 Jun Chen, Hanwen Chen, Jiangning Zhang, Wenzhou Chen, Yong liu, Yunliang Jiang

Quantized Neural Networks (QNNs) have achieved an enormous step in improving computational efficiency, making it possible to deploy large models to mobile and miniaturized devices.

Computational Efficiency

APB2FaceV2: Real-Time Audio-Guided Multi-Face Reenactment

1 code implementation 25 Oct 2020 Jiangning Zhang, Xianfang Zeng, Chao Xu, Jun Chen, Yong liu, Yunliang Jiang

Audio-guided face reenactment aims to generate a photorealistic face that has matched facial expression with the input audio.

Face Reenactment

Hierarchical and Efficient Learning for Person Re-Identification

no code implementations 18 May 2020 Jiangning Zhang, Liang Liu, Chao Xu, Yong liu

Recent works in the person re-identification task mainly focus on model accuracy while ignoring factors related to efficiency, e.g., model size and latency, which are critical for practical application.

Person Re-Identification

APB2Face: Audio-guided face reenactment with auxiliary pose and blink signals

3 code implementations 30 Apr 2020 Jiangning Zhang, Liang Liu, Zhu-Cun Xue, Yong liu

Audio-guided face reenactment aims at generating photorealistic faces using audio information while maintaining the same facial movement as when speaking to a real person.

Face Reenactment

Realistic Face Reenactment via Self-Supervised Disentangling of Identity and Pose

no code implementations 29 Mar 2020 Xianfang Zeng, Yusu Pan, Mengmeng Wang, Jiangning Zhang, Yong liu

On the one hand, we adopt the deforming autoencoder to disentangle identity and pose representations.

Face Reenactment

FReeNet: Multi-Identity Face Reenactment

1 code implementation CVPR 2020 Jiangning Zhang, Xianfang Zeng, Mengmeng Wang, Yusu Pan, Liang Liu, Yong liu, Yu Ding, Changjie Fan

This paper presents a novel multi-identity face reenactment framework, named FReeNet, to transfer facial expressions from an arbitrary source face to a target face with a shared model.

Face Reenactment
