SSCGAN: Facial Attribute Editing via Style Skip Connections

no code implementations ECCV 2020 Wenqing Chu, Ying Tai, Chengjie Wang, Jilin Li, Feiyue Huang, Rongrong Ji

Each connection extracts the style feature of the latent feature maps in the encoder and then performs a residual learning based mapping function in the global information space guided by the target attributes.

Learning Multi-view Anomaly Detection

no code implementations 16 Jul 2024 Haoyang He, Jiangning Zhang, Guanzhong Tian, Chengjie Wang, Lei Xie

This study explores the recently proposed challenging multi-view Anomaly Detection (AD) task.

PSPU: Enhanced Positive and Unlabeled Learning by Leveraging Pseudo Supervision

no code implementations 9 Jul 2024 Chengjie Wang, Chengming Xu, Zhenye Gan, Jianlong Hu, Wenbing Zhu, Lizhuang Ma

Positive and Unlabeled (PU) learning, a binary classification model trained with only positive and unlabeled data, generally suffers from overfitted risk estimation due to inconsistent data distributions.

Oracle Bone Inscriptions Multi-modal Dataset

no code implementations 4 Jul 2024 Bang Li, Donghao Luo, Yujie Liang, Jing Yang, Zengmao Ding, Xu Peng, Boyuan Jiang, Shengwei Han, Dan Sui, Peichao Qin, Pian Wu, Chaoyang Wang, Yun Qi, Taisong Jin, Chengjie Wang, Xiaoming Huang, Zhan Shu, Rongrong Ji, Yongge Liu, Yunsheng Wu

Oracle bone inscriptions (OBI) is the earliest developed writing system in China, bearing invaluable written exemplifications of early Shang history and paleography.

Enhancing Multi-Class Anomaly Detection via Diffusion Refinement with Dual Conditioning

no code implementations 2 Jul 2024 Jiawei Zhan, Jinxiang Lai, Bin-Bin Gao, Jun Liu, Xiaochen Chen, Chengjie Wang

This approach leverages diffusion to obtain high-frequency information for refinement, greatly alleviating the blurry reconstruction problem while maintaining the sampling efficiency of the reverse diffusion process.

RealTalk: Real-time and Realistic Audio-driven Face Generation with 3D Facial Prior-guided Identity Alignment Network

no code implementations 26 Jun 2024 Xiaozhong Ji, Chuming Lin, Zhonggan Ding, Ying Tai, Jian Yang, Junwei Zhu, Xiaobin Hu, Jiangning Zhang, Donghao Luo, Chengjie Wang

In the second component, we design a lightweight facial identity alignment (FIA) module which includes a lip-shape control structure and a face texture reference structure.

DF40: Toward Next-Generation Deepfake Detection

no code implementations 19 Jun 2024 Zhiyuan Yan, Taiping Yao, Shen Chen, Yandan Zhao, Xinghe Fu, Junwei Zhu, Donghao Luo, Li Yuan, Chengjie Wang, Shouhong Ding, Yunsheng Wu

In this work, we found the dataset (both train and test) can be the "primary culprit" due to: (1) forgery diversity: Deepfake techniques are commonly referred to as both face forgery (face-swapping and face-reenactment) and entire image synthesis (AIGC).

AnyMaker: Zero-shot General Object Customization via Decoupled Dual-Level ID Injection

1 code implementation 17 Jun 2024 Lingjie Kong, Kai Wu, Xiaobin Hu, Wenhui Han, Jinlong Peng, Chengming Xu, Donghao Luo, Jiangning Zhang, Chengjie Wang, Yanwei Fu

In addition, we propose an ID-aware decoupling module to disentangle ID-related information from non-ID elements in the extracted representations for high-fidelity generation of both identity and text descriptions.

Decision Boundary-aware Knowledge Consolidation Generates Better Instance-Incremental Learner

no code implementations 5 Jun 2024 Qiang Nie, WeiFu Fu, Yuhuan Lin, Jialin Li, Yifeng Zhou, Yong Liu, Lei Zhu, Chengjie Wang

Two issues have to be tackled in the new IIL setting: 1) the notorious catastrophic forgetting because of no access to old data, and 2) broadening the existing decision boundary to new observations because of concept drift.

ADer: A Comprehensive Benchmark for Multi-class Visual Anomaly Detection

1 code implementation 5 Jun 2024 Jiangning Zhang, Haoyang He, Zhenye Gan, Qingdong He, Yuxuan Cai, Zhucun Xue, Yabiao Wang, Chengjie Wang, Lei Xie, Yong Liu

This paper addresses this issue by proposing a comprehensive visual anomaly detection benchmark, ADer, which is a modular framework that is highly extensible for new methods.

NoiseBoost: Alleviating Hallucination with Noise Perturbation for Multimodal Large Language Models

no code implementations 30 May 2024 Kai Wu, Boyuan Jiang, Zhengkai Jiang, Qingdong He, Donghao Luo, Shengzhi Wang, Qingwen Liu, Chengjie Wang

Multimodal large language models (MLLMs) contribute a powerful mechanism to understanding visual information building on large language models.


VividPose: Advancing Stable Video Diffusion for Realistic Human Image Animation

no code implementations 28 May 2024 Qilin Wang, Zhengkai Jiang, Chengming Xu, Jiangning Zhang, Yabiao Wang, Xinyi Zhang, Yun Cao, Weijian Cao, Chengjie Wang, Yanwei Fu

This enables accurate alignment of pose and shape in the generated videos, providing a robust framework capable of handling a wide range of body shapes and dynamic hand movements.

AdapNet: Adaptive Noise-Based Network for Low-Quality Image Retrieval

no code implementations 28 May 2024 Sihe Zhang, Qingdong He, Jinlong Peng, Yuxi Li, Zhengkai Jiang, Jiafu Wu, Mingmin Chi, Yabiao Wang, Chengjie Wang

To mitigate this issue, we introduce a novel setting for low-quality image retrieval, and propose an Adaptive Noise-Based Network (AdapNet) to learn robust abstract representations.

PointRWKV: Efficient RWKV-Like Model for Hierarchical Point Cloud Learning

no code implementations 24 May 2024 Qingdong He, Jiangning Zhang, Jinlong Peng, Haoyang He, Yabiao Wang, Chengjie Wang

Transformers have revolutionized the point cloud learning task, but the quadratic complexity hinders its extension to long sequence and makes a burden on limited computational resources.

Face Adapter for Pre-Trained Diffusion Models with Fine-Grained ID and Attribute Control

no code implementations 21 May 2024 Yue Han, Junwei Zhu, Keke He, Xu Chen, Yanhao Ge, Wei Li, Xiangtai Li, Jiangning Zhang, Chengjie Wang, Yong Liu

We observe that both face reenactment/swapping tasks essentially involve combinations of target structure, ID and attribute.

Efficient Multimodal Large Language Models: A Survey

1 code implementation 17 May 2024 Yizhang Jin, Jian Li, Yexin Liu, Tianjun Gu, Kai Wu, Zhengkai Jiang, Muyang He, Bo Zhao, Xin Tan, Zhenye Gan, Yabiao Wang, Chengjie Wang, Lizhuang Ma

In the past year, Multimodal Large Language Models (MLLMs) have demonstrated remarkable performance in tasks such as visual question answering, visual understanding and reasoning.

Single-temporal Supervised Remote Change Detection for Domain Generalization

no code implementations 17 Apr 2024 Qiangang Du, Jinlong Peng, Xu Chen, Qingdong He, Liren He, Qiang Nie, Wenbing Zhu, Mingmin Chi, Yabiao Wang, Chengjie Wang

In this paper, we propose a multimodal contrastive learning (ChangeCLIP) based on visual-language pre-training for change detection domain generalization.

Learning Feature Inversion for Multi-class Anomaly Detection under General-purpose COCO-AD Benchmark

1 code implementation 16 Apr 2024 Jiangning Zhang, Chengjie Wang, Xiangtai Li, Guanzhong Tian, Zhucun Xue, Yong Liu, Guansong Pang, DaCheng Tao

Moreover, current metrics such as AU-ROC have nearly reached saturation on simple datasets, which prevents a comprehensive evaluation of different methods.

Deepfake Generation and Detection: A Benchmark and Survey

1 code implementation 26 Mar 2024 Gan Pei, Jiangning Zhang, Menghan Hu, Zhenyu Zhang, Chengjie Wang, Yunsheng Wu, Guangtao Zhai, Jian Yang, Chunhua Shen, DaCheng Tao

Deepfake is a technology dedicated to creating highly realistic facial images and videos under specific conditions, which has significant application potential in fields such as entertainment, movie production, digital human creation, to name a few.

DiffFAE: Advancing High-fidelity One-shot Facial Appearance Editing with Space-sensitive Customization and Semantic Preservation

no code implementations 26 Mar 2024 Qilin Wang, Jiangning Zhang, Chengming Xu, Weijian Cao, Ying Tai, Yue Han, Yanhao Ge, Hong Gu, Chengjie Wang, Yanwei Fu

Facial Appearance Editing (FAE) aims to modify physical attributes, such as pose, expression and lighting, of human facial images while preserving attributes like identity and background, showing great importance in photograph.

Toward Multi-class Anomaly Detection: Exploring Class-aware Unified Model against Inter-class Interference

no code implementations 21 Mar 2024 Xi Jiang, Ying Chen, Qiang Nie, Jianlin Liu, Yong Liu, Chengjie Wang, Feng Zheng

To address this issue, we introduce a Multi-class Implicit Neural representation Transformer for unified Anomaly Detection (MINT-AD), which leverages the fine-grained category information in the training stage.

SoftPatch: Unsupervised Anomaly Detection with Noisy Data

1 code implementation NeurIPS 2022 Xi Jiang, Ying Chen, Qiang Nie, Yong Liu, Jianlin Liu, Bin-Bin Gao, Jun Liu, Chengjie Wang, Feng Zheng

Noise discriminators are utilized to generate outlier scores for patch-level noise elimination before coreset construction.

HCPM: Hierarchical Candidates Pruning for Efficient Detector-Free Matching

no code implementations 19 Mar 2024 Ying Chen, Yong Liu, Kai Wu, Qiang Nie, Shang Xu, Huifang Ma, Bing Wang, Chengjie Wang

Deep learning-based image matching methods play a crucial role in computer vision, yet they often suffer from substantial computational demands.

Tuning-Free Image Customization with Image and Text Guidance

1 code implementation 19 Mar 2024 Pengzhi Li, Qiang Nie, Ying Chen, Xi Jiang, Kai Wu, Yuhuan Lin, Yong Liu, Jinlong Peng, Chengjie Wang, Feng Zheng

To our knowledge, this is the first tuning-free method that concurrently utilizes text and image guidance for image customization in specific regions.

DMAD: Dual Memory Bank for Real-World Anomaly Detection

no code implementations 19 Mar 2024 Jianlong Hu, Xu Chen, Zhenye Gan, Jinlong Peng, Shengchuan Zhang, Jiangning Zhang, Yabiao Wang, Chengjie Wang, Liujuan Cao, Rongrong Ji

To address the challenge of real-world anomaly detection, we propose a new framework named Dual Memory bank enhanced representation learning for Anomaly Detection (DMAD).

Learning Unified Reference Representation for Unsupervised Multi-class Anomaly Detection

no code implementations 18 Mar 2024 Liren He, Zhengkai Jiang, Jinlong Peng, Liang Liu, Qiangang Du, Xiaobin Hu, Wenbing Zhu, Mingmin Chi, Yabiao Wang, Chengjie Wang

In the field of multi-class anomaly detection, reconstruction-based methods derived from single-class anomaly detection face the well-known challenge of "learning shortcuts", wherein the model fails to learn the patterns of normal samples as it should, opting instead for shortcuts such as identity mapping or artificial noise elimination.

PointSeg: A Training-Free Paradigm for 3D Scene Segmentation via Foundation Models

no code implementations 11 Mar 2024 Qingdong He, Jinlong Peng, Zhengkai Jiang, Xiaobin Hu, Jiangning Zhang, Qiang Nie, Yabiao Wang, Chengjie Wang

On top of that, PointSeg can incorporate with various foundation models and even surpasses the specialist training-based methods by 3.4%-5.4% mAP across various datasets, serving as an effective generalist model.

DiffuMatting: Synthesizing Arbitrary Objects with Matting-level Annotation

no code implementations 10 Mar 2024 Xiaobin Hu, Xu Peng, Donghao Luo, Xiaozhong Ji, Jinlong Peng, Zhengkai Jiang, Jiangning Zhang, Taisong Jin, Chengjie Wang, Rongrong Ji

Our DiffuMatting shows several potential applications (e.g., matting-data generator, community-friendly art design and controllable generation).

Dual-path Frequency Discriminators for Few-shot Anomaly Detection

no code implementations 7 Mar 2024 Yuhu Bai, Jiangning Zhang, Yuhang Dong, Guanzhong Tian, Liang Liu, Yunkang Cao, Yabiao Wang, Chengjie Wang

We consider anomaly detection as a discriminative classification problem, wherefore the dual-path feature discrimination module is employed to detect and locate the image-level and feature-level anomalies in the feature space.

LORS: Low-rank Residual Structure for Parameter-Efficient Network Stacking

1 code implementation CVPR 2024 Jialin Li, Qiang Nie, WeiFu Fu, Yuhuan Lin, Guangpin Tao, Yong Liu, Chengjie Wang

Deep learning models, particularly those based on transformers, often employ numerous stacked structures, which possess identical architectures and perform similar functions.


Pushing Auto-regressive Models for 3D Shape Generation at Capacity and Scalability

no code implementations 19 Feb 2024 Xuelin Qian, Yu Wang, Simian Luo, Yinda Zhang, Ying Tai, Zhenyu Zhang, Chengjie Wang, Xiangyang Xue, Bo Zhao, Tiejun Huang, Yunsheng Wu, Yanwei Fu

In this paper, we extend auto-regressive models to 3D domains, and seek a stronger ability of 3D shape generation by improving auto-regressive models at capacity and scalability simultaneously.

Self-supervised Feature Adaptation for 3D Industrial Anomaly Detection

no code implementations 6 Jan 2024 Yuanpeng Tu, Boshen Zhang, Liang Liu, Yuxi Li, Xuhai Chen, Jiangning Zhang, Yabiao Wang, Chengjie Wang, Cai Rong Zhao

Industrial anomaly detection is generally addressed as an unsupervised task that aims at locating defects with only normal training samples.

Unsupervised Continual Anomaly Detection with Contrastively-learned Prompt

1 code implementation 2 Jan 2024 Jiaqi Liu, Kai Wu, Qiang Nie, Ying Chen, Bin-Bin Gao, Yong Liu, Jinbao Wang, Chengjie Wang, Feng Zheng

Unsupervised Anomaly Detection (UAD) with incremental training is crucial in industrial manufacturing, as unpredictable defects make obtaining sufficient labeled data infeasible.

A Generalist FaceX via Learning Unified Facial Representation

1 code implementation 31 Dec 2023 Yue Han, Jiangning Zhang, Junwei Zhu, Xiangtai Li, Yanhao Ge, Wei Li, Chengjie Wang, Yong Liu, Xiaoming Liu, Ying Tai

This work presents FaceX framework, a novel facial generalist model capable of handling diverse facial tasks simultaneously.

Beyond Prototypes: Semantic Anchor Regularization for Better Representation Learning

1 code implementation 19 Dec 2023 Yanqi Ge, Qiang Nie, Ye Huang, Yong Liu, Chengjie Wang, Feng Zheng, Wen Li, Lixin Duan

By pulling the learned features to these semantic anchors, several advantages can be attained: 1) the intra-class compactness and naturally inter-class separability, 2) induced bias or errors from feature learning can be avoided, and 3) robustness to the long-tailed problem.


MatchDet: A Collaborative Framework for Image Matching and Object Detection

no code implementations 18 Dec 2023 Jinxiang Lai, Wenlong Wu, Bin-Bin Gao, Jun Liu, Jiawei Zhan, Congchong Nie, Yi Zeng, Chengjie Wang

Image matching and object detection are two fundamental and challenging tasks, while many related applications consider them two individual tasks (i.e. task-individual).

AnomalyDiffusion: Few-Shot Anomaly Image Generation with Diffusion Model

1 code implementation 10 Dec 2023 Teng Hu, Jiangning Zhang, Ran Yi, Yuzhen Du, Xu Chen, Liang Liu, Yabiao Wang, Chengjie Wang

Existing anomaly inspection methods are limited in their performance due to insufficient anomaly data.

GPT-4V-AD: Exploring Grounding Potential of VQA-oriented GPT-4V for Zero-shot Anomaly Detection

1 code implementation 5 Nov 2023 Jiangning Zhang, Haoyang He, Xuhai Chen, Zhucun Xue, Yabiao Wang, Chengjie Wang, Lei Xie, Yong Liu

Large Multimodal Model (LMM) GPT-4V(ision) endows GPT-4 with visual grounding capabilities, making it possible to handle certain tasks through the Visual Question Answering (VQA) paradigm.

Real3D-AD: A Dataset of Point Cloud Anomaly Detection

1 code implementation NeurIPS 2023 Jiaqi Liu, Guoyang Xie, Ruitao Chen, Xinpeng Li, Jinbao Wang, Yong Liu, Chengjie Wang, Feng Zheng

High-precision point cloud anomaly detection is the gold standard for identifying the defects of advancing machining and precision manufacturing.

Stroke-based Neural Painting and Stylization with Dynamically Predicted Painting Region

2 code implementations 7 Sep 2023 Teng Hu, Ran Yi, Haokun Zhu, Liang Liu, Jinlong Peng, Yabiao Wang, Chengjie Wang, Lizhuang Ma

To solve the problem, we propose Compositional Neural Painter, a novel stroke-based rendering framework which dynamically predicts the next painting region based on the current canvas, instead of dividing the image plane uniformly into painting regions.

Dynamic Frame Interpolation in Wavelet Domain

1 code implementation 7 Sep 2023 Lingtong Kong, Boyuan Jiang, Donghao Luo, Wenqing Chu, Ying Tai, Chengjie Wang, Jie Yang

Video frame interpolation is an important low-level vision task, which can increase frame rate for more fluent visual experience.

Toward High Quality Facial Representation Learning

1 code implementation 7 Sep 2023 Yue Wang, Jinlong Peng, Jiangning Zhang, Ran Yi, Liang Liu, Yabiao Wang, Chengjie Wang

To improve the facial representation quality, we use feature map of a pre-trained visual backbone as a supervision item and use a partially pre-trained decoder for mask image modeling.

Phasic Content Fusing Diffusion Model with Directional Distribution Consistency for Few-Shot Model Adaption

1 code implementation ICCV 2023 Teng Hu, Jiangning Zhang, Liang Liu, Ran Yi, Siqi Kou, Haokun Zhu, Xu Chen, Yabiao Wang, Chengjie Wang, Lizhuang Ma

To address these problems, we propose a novel phasic content fusing few-shot diffusion model with directional distribution consistency loss, which targets different learning objectives at distinct training stages of the diffusion model.

PVG: Progressive Vision Graph for Vision Recognition

no code implementations 1 Aug 2023 Jiafu Wu, Jian Li, Jiangning Zhang, Boshen Zhang, Mingmin Chi, Yabiao Wang, Chengjie Wang

Convolution-based and Transformer-based vision backbone networks process images into the grid or sequence structures, respectively, which are inflexible for capturing irregular objects.

Transferable Decoding with Visual Entities for Zero-Shot Image Captioning

1 code implementation ICCV 2023 Junjie Fei, Teng Wang, Jinrui Zhang, Zhenyu He, Chengjie Wang, Feng Zheng

In this paper, we propose ViECap, a transferable decoding model that leverages entity-aware decoding to generate descriptions in both seen and unseen scenarios.

RFENet: Towards Reciprocal Feature Evolution for Glass Segmentation

1 code implementation 12 Jul 2023 Ke Fan, Changan Wang, Yabiao Wang, Chengjie Wang, Ran Yi, Lizhuang Ma

Glass-like objects are widespread in daily life but remain intractable to be segmented for most existing methods.

Align, Perturb and Decouple: Toward Better Leverage of Difference Information for RSI Change Detection

1 code implementation 30 May 2023 Supeng Wang, Yuxi Li, Ming Xie, Mingmin Chi, Yabiao Wang, Chengjie Wang, Wenbing Zhu

In this paper, we revisit the importance of feature difference for change detection in RSI, and propose a series of operations to fully exploit the difference information: Alignment, Perturbation and Decoupling (APD).

Dual Path Transformer with Partition Attention

no code implementations 24 May 2023 Zhengkai Jiang, Liang Liu, Jiangning Zhang, Yabiao Wang, Mingang Chen, Chengjie Wang

This paper introduces a novel attention mechanism, called dual attention, which is both efficient and effective.

Learning Global-aware Kernel for Image Harmonization

no code implementations ICCV 2023 Xintian Shen, Jiangning Zhang, Jun Chen, Shipeng Bai, Yue Han, Yabiao Wang, Chengjie Wang, Yong Liu

To address this issue, we propose a novel Global-aware Kernel Network (GKNet) to harmonize local regions with comprehensive consideration of long-distance background references.

High-fidelity Generalized Emotional Talking Face Generation with Multi-modal Emotion Space Learning

no code implementations CVPR 2023 Chao Xu, Junwei Zhu, Jiangning Zhang, Yue Han, Wenqing Chu, Ying Tai, Chengjie Wang, Zhifeng Xie, Yong Liu

Specifically, we supplement the emotion style in text prompts and use an Aligned Multi-modal Emotion encoder to embed the text, image, and audio emotion modality into a unified space, which inherits rich semantic prior from CLIP.

NeRF-Loc: Visual Localization with Conditional Neural Radiance Field

1 code implementation 17 Apr 2023 Jianlin Liu, Qiang Nie, Yong Liu, Chengjie Wang

We propose a novel visual re-localization method based on direct matching between the implicit 3D descriptors and the 2D image with transformer.

Learning Versatile 3D Shape Generation with Improved AR Models

no code implementations 26 Mar 2023 Simian Luo, Xuelin Qian, Yanwei Fu, Yinda Zhang, Ying Tai, Zhenyu Zhang, Chengjie Wang, Xiangyang Xue

Auto-Regressive (AR) models have achieved impressive results in 2D image generation by modeling joint distributions in the grid space.

SpatialFormer: Semantic and Target Aware Attentions for Few-Shot Learning

1 code implementation 15 Mar 2023 Jinxiang Lai, Siqian Yang, Wenlong Wu, Tao Wu, Guannan Jiang, Xi Wang, Jun Liu, Bin-Bin Gao, Wei Zhang, Yuan Xie, Chengjie Wang

Then we derive two specific attention modules, named SpatialFormer Semantic Attention (SFSA) and SpatialFormer Target Attention (SFTA), to enhance the target object regions while reduce the background distraction.

Calibrated Teacher for Sparsely Annotated Object Detection

1 code implementation 14 Mar 2023 Haohan Wang, Liang Liu, Boshen Zhang, Jiangning Zhang, Wuhao Zhang, Zhenye Gan, Yabiao Wang, Chengjie Wang, Haoqian Wang

Recent works on sparsely annotated object detection alleviate this problem by generating pseudo labels for the missing annotations.

Multimodal Industrial Anomaly Detection via Hybrid Fusion

1 code implementation CVPR 2023 Yue Wang, Jinlong Peng, Jiangning Zhang, Ran Yi, Yabiao Wang, Chengjie Wang

2D-based Industrial Anomaly Detection has been widely discussed, however, multimodal industrial anomaly detection based on 3D point clouds and RGB images still has many untouched fields.

Ranked #3 on RGB+3D Anomaly Detection and Segmentation on MVTEC 3D-AD (using extra training data)

Learning with Noisy labels via Self-supervised Adversarial Noisy Masking

1 code implementation CVPR 2023 Yuanpeng Tu, Boshen Zhang, Yuxi Li, Liang Liu, Jian Li, Jiangning Zhang, Yabiao Wang, Chengjie Wang, Cai Rong Zhao

Collecting large-scale datasets is crucial for training deep models, annotating the data, however, inevitably yields noisy labels, which poses challenges to deep learning algorithms.

Ranked #2 on Image Classification on Clothing1M (using extra training data)

IM-IAD: Industrial Image Anomaly Detection Benchmark in Manufacturing

2 code implementations 31 Jan 2023 Guoyang Xie, Jinbao Wang, Jiaqi Liu, Jiayi Lyu, Yong Liu, Chengjie Wang, Feng Zheng, Yaochu Jin

We realize that the lack of a uniform IM benchmark is hindering the development and usage of IAD methods in real-world applications.

Deep Industrial Image Anomaly Detection: A Survey

1 code implementation 27 Jan 2023 Jiaqi Liu, Guoyang Xie, Jinbao Wang, Shangnian Li, Chengjie Wang, Feng Zheng, Yaochu Jin

In this paper, we provide a comprehensive review of deep learning-based image anomaly detection techniques, from the perspectives of neural network architectures, levels of supervision, loss functions, metrics and datasets.

Reference Twice: A Simple and Unified Baseline for Few-Shot Instance Segmentation

1 code implementation 3 Jan 2023 Yue Han, Jiangning Zhang, Zhucun Xue, Chao Xu, Xintian Shen, Yabiao Wang, Chengjie Wang, Yong Liu, Xiangtai Li

In this work, we explore a simple yet unified solution for FSIS as well as its incremental variants, and introduce a new framework named Reference Twice (RefT) to fully explore the relationship between support/query features based on a Transformer-like framework.

Rethinking Mobile Block for Efficient Attention-based Models

1 code implementation ICCV 2023 Jiangning Zhang, Xiangtai Li, Jian Li, Liang Liu, Zhucun Xue, Boshen Zhang, Zhengkai Jiang, Tianxin Huang, Yabiao Wang, Chengjie Wang

This paper focuses on developing modern, efficient, lightweight models for dense predictions while trading off parameters, FLOPs, and performance.


Learning Neural Proto-Face Field for Disentangled 3D Face Modeling in the Wild

no code implementations CVPR 2023 Zhenyu Zhang, Renwang Chen, Weijian Cao, Ying Tai, Chengjie Wang

To address this problem, this paper presents a novel Neural Proto-face Field (NPF) for unsupervised robust 3D face modeling.

Learning To Measure the Point Cloud Reconstruction Loss in a Representation Space

no code implementations CVPR 2023 Tianxin Huang, Zhonggan Ding, Jiangning Zhang, Ying Tai, Zhenyu Zhang, Mingang Chen, Chengjie Wang, Yong Liu

Specifically, we use the contrastive constraint to help CALoss learn a representation space with shape similarity, while we introduce the adversarial strategy to help CALoss mine differences between reconstructed results and ground truths.

Multi-Centroid Task Descriptor for Dynamic Class Incremental Inference

no code implementations CVPR 2023 Tenghao Cai, Zhizhong Zhang, Xin Tan, Yanyun Qu, Guannan Jiang, Chengjie Wang, Yuan Xie

As a result, our dynamic inference network is trained independently of baseline and provides a flexible, efficient solution to distinguish between tasks.

PatchMix Augmentation to Identify Causal Features in Few-shot Learning

no code implementations 29 Nov 2022 Chengming Xu, Chen Liu, Xinwei Sun, Siqian Yang, Yabiao Wang, Chengjie Wang, Yanwei Fu

We theoretically show that such an augmentation mechanism, different from existing ones, is able to identify the causal features.

Global Meets Local: Effective Multi-Label Image Classification via Category-Aware Weak Supervision

no code implementations 23 Nov 2022 Jiawei Zhan, Jun Liu, Wei Tang, Guannan Jiang, Xi Wang, Bin-Bin Gao, Tianliang Zhang, Wenlong Wu, Wei Zhang, Chengjie Wang, Yuan Xie

This paper builds a unified framework to perform effective noisy-proposal suppression and to interact between global and local features for robust feature learning.

Delving into Transformer for Incremental Semantic Segmentation

no code implementations 18 Nov 2022 Zekai Xu, Mingyi Zhang, Jiayue Hou, Xing Gong, Chuan Wen, Chengjie Wang, Junge Zhang

In contrast, a Transformer based method has a natural advantage in curbing catastrophic forgetting due to its ability to model both long-term and short-term tasks.

tSF: Transformer-based Semantic Filter for Few-Shot Learning

1 code implementation 2 Nov 2022 Jinxiang Lai, Siqian Yang, Wenlong Liu, Yi Zeng, Zhongyi Huang, Wenlong Wu, Jun Liu, Bin-Bin Gao, Chengjie Wang

Few-Shot Learning (FSL) alleviates the data shortage challenge via embedding discriminative target-aware features among plenty seen (base) and few unseen (novel) labeled samples.

Rethinking the Metric in Few-shot Learning: From an Adaptive Multi-Distance Perspective

no code implementations 2 Nov 2022 Jinxiang Lai, Siqian Yang, Guannan Jiang, Xi Wang, Yuxi Li, Zihui Jia, Xiaochen Chen, Jun Liu, Bin-Bin Gao, Wei Zhang, Yuan Xie, Chengjie Wang

In this paper, for the first time, we investigate the contributions of different distance metrics, and propose an adaptive fusion scheme, bringing significant improvements in few-shot classification.

Rethinking Dimensionality Reduction in Grid-based 3D Object Detection

no code implementations 20 Sep 2022 Dihe Huang, Ying Chen, Yikang Ding, Jinli Liao, Jianlin Liu, Kai Wu, Qiang Nie, Yong Liu, Chengjie Wang, Zhiheng Li

In MDRNet, the Spatial-aware Dimensionality Reduction (SDR) is designed to dynamically focus on the valuable parts of the object during voxel-to-BEV feature transformation.

Joint Learning Content and Degradation Aware Feature for Blind Super-Resolution

1 code implementation 29 Aug 2022 Yifeng Zhou, Chuming Lin, Donghao Luo, Yong Liu, Ying Tai, Chengjie Wang, Mingang Chen

Although some Unsupervised Degradation Prediction (UDP) methods are proposed to bypass this problem, the inconsistency between degradation embedding and SR feature is still challenging.

SeedFormer: Patch Seeds based Point Cloud Completion with Upsample Transformer

1 code implementation 21 Jul 2022 Haoran Zhou, Yun Cao, Wenqing Chu, Junwei Zhu, Tong Lu, Ying Tai, Chengjie Wang

Point cloud completion has become increasingly popular among generation tasks of 3D point clouds, as it is a challenging yet indispensable problem to recover the complete shape of a 3D object from its partial observation.

Adaptive Assignment for Geometry Aware Local Feature Matching

1 code implementation CVPR 2023 Dihe Huang, Ying Chen, Shang Xu, Yong Liu, Wenlong Wu, Yikang Ding, Chengjie Wang, Fan Tang

The detector-free feature matching approaches are currently attracting great attention thanks to their excellent performance.

EATFormer: Improving Vision Transformer Inspired by Evolutionary Algorithm

1 code implementation 19 Jun 2022 Jiangning Zhang, Xiangtai Li, Yabiao Wang, Chengjie Wang, Yibo Yang, Yong Liu, DaCheng Tao

Motivated by biological evolution, this paper explains the rationality of Vision Transformer by analogy with the proven practical Evolutionary Algorithm (EA) and derives that both have consistent mathematical formulation.

IFRNet: Intermediate Feature Refine Network for Efficient Frame Interpolation

2 code implementations CVPR 2022 Lingtong Kong, Boyuan Jiang, Donghao Luo, Wenqing Chu, Xiaoming Huang, Ying Tai, Chengjie Wang, Jie Yang

Prevailing video frame interpolation algorithms, that generate the intermediate frames from consecutive inputs, typically rely on complex model architectures with heavy parameters or large delay, hindering them from diverse real-time applications.

OpenCalib: A Multi-sensor Calibration Toolbox for Autonomous Driving

1 code implementation 27 May 2022 Guohang Yan, Liu Zhuochun, Chengjie Wang, Chunlei Shi, Pengjin Wei, Xinyu Cai, Tao Ma, Zhizheng Liu, Zebin Zhong, Yuqian Liu, Ming Zhao, Zheng Ma, Yikang Li

To this end, we present OpenCalib, a calibration toolbox that contains a rich set of various sensor calibration methods.

UniInst: Unique Representation for End-to-End Instance Segmentation

1 code implementation 25 May 2022 Yimin Ou, Rui Yang, Lufan Ma, Yong Liu, Jiangpeng Yan, Shang Xu, Chengjie Wang, Xiu Li

Existing instance segmentation methods have achieved impressive performance but still suffer from a common dilemma: redundant representations (e.g., multiple boxes, grids, and anchor points) are inferred for one instance, which leads to multiple duplicated predictions.

FRIH: Fine-grained Region-aware Image Harmonization

no code implementations 13 May 2022 Jinlong Peng, Zekun Luo, Liang Liu, Boshen Zhang, Tao Wang, Yabiao Wang, Ying Tai, Chengjie Wang, Weiyao Lin

Image harmonization aims to generate a more realistic appearance of foreground and background for a composite image.

NTIRE 2022 Challenge on Efficient Super-Resolution: Methods and Results

2 code implementations 11 May 2022 Yawei Li, Kai Zhang, Radu Timofte, Luc van Gool, Fangyuan Kong, Mingxi Li, Songwei Liu, Zongcai Du, Ding Liu, Chenhui Zhou, Jingyi Chen, Qingrui Han, Zheyuan Li, Yingqi Liu, Xiangyu Chen, Haoming Cai, Yu Qiao, Chao Dong, Long Sun, Jinshan Pan, Yi Zhu, Zhikai Zong, Xiaoxiao Liu, Zheng Hui, Tao Yang, Peiran Ren, Xuansong Xie, Xian-Sheng Hua, Yanbo Wang, Xiaozhong Ji, Chuming Lin, Donghao Luo, Ying Tai, Chengjie Wang, Zhizhong Zhang, Yuan Xie, Shen Cheng, Ziwei Luo, Lei Yu, Zhihong Wen, Qi Wu, Youwei Li, Haoqiang Fan, Jian Sun, Shuaicheng Liu, Yuanfei Huang, Meiguang Jin, Hua Huang, Jing Liu, Xinjian Zhang, Yan Wang, Lingshun Long, Gen Li, Yuanfan Zhang, Zuowei Cao, Lei Sun, Panaetov Alexander, Yucong Wang, Minjie Cai, Li Wang, Lu Tian, Zheyuan Wang, Hongbing Ma, Jie Liu, Chao Chen, Yidong Cai, Jie Tang, Gangshan Wu, Weiran Wang, Shirui Huang, Honglei Lu, Huan Liu, Keyan Wang, Jun Chen, Shi Chen, Yuchun Miao, Zimo Huang, Lefei Zhang, Mustafa Ayazoğlu, Wei Xiong, Chengyi Xiong, Fei Wang, Hao Li, Ruimian Wen, Zhijing Yang, Wenbin Zou, Weixin Zheng, Tian Ye, Yuncheng Zhang, Xiangzhen Kong, Aditya Arora, Syed Waqas Zamir, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, Dandan Gao, Dengwen Zhou, Qian Ning, Jingzhu Tang, Han Huang, YuFei Wang, Zhangheng Peng, Haobo Li, Wenxue Guan, Shenghua Gong, Xin Li, Jun Liu, Wanjun Wang, Dengwen Zhou, Kun Zeng, Hanjiang Lin, Xinyu Chen, Jinsheng Fang

The aim was to design a network for single image super-resolution that improved efficiency, measured by several metrics including runtime, parameters, FLOPs, activations, and memory consumption, while at least maintaining a PSNR of 29.00 dB on the DIV2K validation set.

Surface Representation for Point Clouds

1 code implementation CVPR 2022 Haoxi Ran, Jun Liu, Chengjie Wang

Based on a simple baseline of PointNet++ (SSG version), Umbrella RepSurf surpasses the previous state-of-the-art by a large margin for classification, segmentation and detection on various benchmarks in terms of performance and efficiency.

High-resolution Iterative Feedback Network for Camouflaged Object Detection

1 code implementation 22 Mar 2022 Xiaobin Hu, Shuo Wang, Xuebin Qin, Hang Dai, Wenqi Ren, Ying Tai, Chengjie Wang, Ling Shao

Spotting camouflaged objects that are visually assimilated into the background is tricky for both object detection algorithms and humans who are usually confused or cheated by the perfectly intrinsic similarities between the foreground objects and the background surroundings.

CtlGAN: Few-shot Artistic Portraits Generation with Contrastive Transfer Learning

no code implementations 16 Mar 2022 Yue Wang, Ran Yi, Luying Li, Ying Tai, Chengjie Wang, Lizhuang Ma

We propose a new encoder that embeds real faces into Z+ space, along with a dual-path training strategy to better cope with the adapted decoder and eliminate artifacts.

Learning Distinctive Margin toward Active Domain Adaptation

1 code implementation CVPR 2022 Ming Xie, Yuxi Li, Yabiao Wang, Zekun Luo, Zhenye Gan, Zhongyi Sun, Mingmin Chi, Chengjie Wang, Pei Wang

Despite considerable effort on improving domain adaptation (DA) under unsupervised or few-shot semi-supervised settings, active learning has recently started to attract more attention because it suits transferring a model in a more practical way with limited annotation resources on target data.

A Survey of Visual Sensory Anomaly Detection

1 code implementation 14 Feb 2022 Xi Jiang, Guoyang Xie, Jinbao Wang, Yong Liu, Chengjie Wang, Feng Zheng, Yaochu Jin

In this survey, we are the first to provide a comprehensive review of visual sensory AD and categorize it into three levels according to the form of anomalies.

ASFD: Automatic and Scalable Face Detector

no code implementations 26 Jan 2022 Jian Li, Bin Zhang, Yabiao Wang, Ying Tai, Zhenyu Zhang, Chengjie Wang, Jilin Li, Xiaoming Huang, Yili Xia

Along with current multi-scale based detectors, Feature Aggregation and Enhancement (FAE) modules have shown superior performance gains for cutting-edge object detection.

CFNet: Learning Correlation Functions for One-Stage Panoptic Segmentation

no code implementations 13 Jan 2022 Yifeng Chen, Wenqing Chu, Fangfang Wang, Ying Tai, Ran Yi, Zhenye Gan, Liang Yao, Chengjie Wang, Xi Li

Recently, there has been growing attention on one-stage panoptic segmentation methods, which aim to segment instances and stuff jointly and efficiently within a fully convolutional pipeline.

Learning To Restore 3D Face From In-the-Wild Degraded Images

no code implementations CVPR 2022 Zhenyu Zhang, Yanhao Ge, Ying Tai, Xiaoming Huang, Chengjie Wang, Hao Tang, Dongjin Huang, Zhifeng Xie

In-the-wild 3D face modelling is a challenging problem because the predicted facial geometry and texture suffer from a lack of reliable clues or priors when the input images are degraded.

En-Compactness: Self-Distillation Embedding & Contrastive Generation for Generalized Zero-Shot Learning

no code implementations CVPR 2022 Xia Kong, Zuodong Gao, Xiaofan Li, Ming Hong, Jun Liu, Chengjie Wang, Yuan Xie, Yanyun Qu

Our ICCE promotes intra-class compactness with inter-class separability on both seen and unseen classes in the embedding space and visual feature space.

Blind Face Restoration via Integrating Face Shape and Generative Priors

no code implementations CVPR 2022 Feida Zhu, Junwei Zhu, Wenqing Chu, Xinyi Zhang, Xiaozhong Ji, Chengjie Wang, Ying Tai

Moreover, we introduce hybrid-level losses to jointly train the shape and generative priors together with other network parts such that these two priors better adapt to our blind face restoration task.

Learning To Memorize Feature Hallucination for One-Shot Image Generation

no code implementations CVPR 2022 Yu Xie, Yanwei Fu, Ying Tai, Yun Cao, Junwei Zhu, Chengjie Wang

In this paper, we propose a novel model to explicitly learn and memorize reusable features that can help hallucinate novel category images.

LCTR: On Awakening the Local Continuity of Transformer for Weakly Supervised Object Localization

no code implementations 10 Dec 2021 Zhiwei Chen, Changan Wang, Yabiao Wang, Guannan Jiang, Yunhang Shen, Ying Tai, Chengjie Wang, Wei Zhang, Liujuan Cao

In this paper, we propose a novel framework built upon the transformer, termed LCTR (Local Continuity TRansformer), which aims to enhance the local perception capability of global features among long-range feature dependencies.

Robust Learning with Adaptive Sample Credibility Modeling

no code implementations 29 Sep 2021 Boshen Zhang, Yuxi Li, Yuanpeng Tu, Yabiao Wang, Yang Xiao, Cai Rong Zhao, Chengjie Wang

For the clean set, we deliberately design a memory-based modulation scheme to dynamically adjust the contribution of each sample according to its historical credibility sequence during training, thereby alleviating the effect of potential hard noisy samples in the clean set.


Uniformity in Heterogeneity: Diving Deep into Count Interval Partition for Crowd Counting

3 code implementations 27 Jul 2021 Changan Wang, Qingyu Song, Boshen Zhang, Yabiao Wang, Ying Tai, Xuyi Hu, Chengjie Wang, Jilin Li, Jiayi Ma, Yang Wu

Therefore, we propose a novel count interval partition criterion called Uniform Error Partition (UEP), which always keeps the expected counting error contributions equal for all intervals to minimize the prediction risk.

Learning To Restore Hazy Video: A New Real-World Dataset and a New Method

no code implementations CVPR 2021 Xinyi Zhang, Hang Dong, Jinshan Pan, Chao Zhu, Ying Tai, Chengjie Wang, Jilin Li, Feiyue Huang, Fei Wang

On the other hand, video dehazing algorithms, which can achieve more satisfying dehazing results by exploiting the temporal redundancy of neighboring hazy frames, receive less attention due to the absence of video dehazing datasets.

HifiFace: 3D Shape and Semantic Prior Guided High Fidelity Face Swapping

1 code implementation 18 Jun 2021 YuHan Wang, Xu Chen, Junwei Zhu, Wenqing Chu, Ying Tai, Chengjie Wang, Jilin Li, Yongjian Wu, Feiyue Huang, Rongrong Ji

In this work, we propose a high fidelity face swapping method, called HifiFace, which can well preserve the face shape of the source face and generate photo-realistic results.

Context-Aware Image Inpainting with Learned Semantic Priors

1 code implementation 14 Jun 2021 Wendong Zhang, Junwei Zhu, Ying Tai, Yunbo Wang, Wenqing Chu, Bingbing Ni, Chengjie Wang, Xiaokang Yang

Based on the semantic priors, we further propose a context-aware image inpainting model, which adaptively integrates global semantics and local features in a unified image generator.

Analogous to Evolutionary Algorithm: Designing a Unified Sequence Model

1 code implementation NeurIPS 2021 Jiangning Zhang, Chao Xu, Jian Li, Wenzhou Chen, Yabiao Wang, Ying Tai, Shuo Chen, Chengjie Wang, Feiyue Huang, Yong Liu

Inspired by biological evolution, we explain the rationality of the Vision Transformer by analogy with the proven, practical Evolutionary Algorithm (EA) and derive that the two share a consistent mathematical representation.

SiamRCR: Reciprocal Classification and Regression for Visual Object Tracking

no code implementations 24 May 2021 Jinlong Peng, Zhengkai Jiang, Yueyang Gu, Yang Wu, Yabiao Wang, Ying Tai, Chengjie Wang, Weiyao Lin

In addition, we add a localization branch to predict the localization accuracy, so that it can serve as a replacement for the regression assistance link during inference.

SCNet: Enhancing Few-Shot Semantic Segmentation by Self-Contrastive Background Prototypes

no code implementations 19 Apr 2021 Jiacheng Chen, Bin-Bin Gao, Zongqing Lu, Jing-Hao Xue, Chengjie Wang, Qingmin Liao

To this end, we generate self-contrastive background prototypes directly from the query image, with which we enable the construction of complete sample pairs and thus a complementary and auxiliary segmentation task to achieve the training of a better segmentation model.

Learning Dynamic Alignment via Meta-filter for Few-shot Learning

1 code implementation CVPR 2021 Chengming Xu, Chen Liu, Li Zhang, Chengjie Wang, Jilin Li, Feiyue Huang, Xiangyang Xue, Yanwei Fu

Our insight is that these methods would lead to poor adaptation with redundant matching, and leveraging channel-wise adjustment is the key to well adapting the learned knowledge to new classes.

