Search Results for author: Zhentao Tan

Found 18 papers, 10 papers with code

AnyPattern: Towards In-context Image Copy Detection

1 code implementation • 21 Apr 2024 • Wenhao Wang, Yifan Sun, Zhentao Tan, Yi Yang

This paper explores in-context learning for image copy detection (ICD), i.e., prompting an ICD model to identify replicated images with new tampering patterns without the need for additional training.

Copy Detection In-Context Learning

Transformer based Pluralistic Image Completion with Reduced Information Loss

1 code implementation • 31 Mar 2024 • Qiankun Liu, Yuqi Jiang, Zhentao Tan, Dongdong Chen, Ying Fu, Qi Chu, Gang Hua, Nenghai Yu

The indices of quantized pixels are used as tokens for the inputs and prediction targets of the transformer.

Image Inpainting Quantization

MiM-ISTD: Mamba-in-Mamba for Efficient Infrared Small Target Detection

1 code implementation • 4 Mar 2024 • Tianxiang Chen, Zhentao Tan, Tao Gong, Qi Chu, Yue Wu, Bin Liu, Jieping Ye, Nenghai Yu

Inspired by Mamba, a recent basic model with linear complexity for long-distance modeling, we explore the potential of this state space model for the ISTD task in terms of effectiveness and efficiency.

Sentence

Bootstrapping Audio-Visual Segmentation by Strengthening Audio Cues

no code implementations • 4 Feb 2024 • Tianxiang Chen, Zhentao Tan, Tao Gong, Qi Chu, Yue Wu, Bin Liu, Le Lu, Jieping Ye, Nenghai Yu

This bidirectional interaction narrows the modality imbalance, facilitating more effective learning of integrated audio-visual representations.

Representation Learning

TCI-Former: Thermal Conduction-Inspired Transformer for Infrared Small Target Detection

no code implementations • 3 Feb 2024 • Tianxiang Chen, Zhentao Tan, Qi Chu, Yue Wu, Bin Liu, Nenghai Yu

We abstract this process as the directional movement of feature map pixels to target areas through convolution, pooling and interactions with surrounding pixels, which is analogous to the movement of thermal particles constrained by surrounding variables and particles.

SimAC: A Simple Anti-Customization Method against Text-to-Image Synthesis of Diffusion Models

no code implementations • 13 Dec 2023 • Feifei Wang, Zhentao Tan, Tianyi Wei, Yue Wu, Qidong Huang

Despite the success of diffusion-based customization methods on visual content creation, increasing concerns have been raised about such techniques from both privacy and political perspectives.

Denoising Image Generation

Towards More Unified In-context Visual Understanding

no code implementations • 5 Dec 2023 • Dianmo Sheng, Dongdong Chen, Zhentao Tan, Qiankun Liu, Qi Chu, Jianmin Bao, Tao Gong, Bin Liu, Shengwei Xu, Nenghai Yu

Thanks to this design, the model is capable of handling in-context vision understanding tasks with multimodal output in a unified pipeline. Experimental results demonstrate that our model achieves competitive performance compared with specialized models and previous ICL baselines.

Image Captioning In-Context Learning +1

Exploring the Application of Large-scale Pre-trained Models on Adverse Weather Removal

no code implementations • 15 Jun 2023 • Zhentao Tan, Yue Wu, Qiankun Liu, Qi Chu, Le Lu, Jieping Ye, Nenghai Yu

Inspired by the various successful applications of large-scale pre-trained models (e.g., CLIP), in this paper we explore their potential benefits for this task through both spatial feature representation learning and semantic information embedding: 1) for spatial feature representation learning, we design a Spatially-Adaptive Residual (SAR) Encoder to extract degraded areas adaptively.

Image Restoration Representation Learning

HQ-50K: A Large-scale, High-quality Dataset for Image Restoration

1 code implementation • 8 Jun 2023 • Qinhong Yang, Dongdong Chen, Zhentao Tan, Qiankun Liu, Qi Chu, Jianmin Bao, Lu Yuan, Gang Hua, Nenghai Yu

This paper introduces a new large-scale image restoration dataset, called HQ-50K, which contains 50,000 high-quality images with rich texture details and semantic diversity.

Denoising Image Restoration +2

Multi-spectral Class Center Network for Face Manipulation Detection and Localization

1 code implementation • 18 May 2023 • Changtao Miao, Qi Chu, Zhentao Tan, Zhenchao Jin, Wanyi Zhuang, Yue Wu, Bin Liu, Honggang Hu, Nenghai Yu

Next, a novel Multi-Spectral Class Center Network (MSCCNet) is proposed for face manipulation detection and localization.

Face Swapping

Video Action Segmentation via Contextually Refined Temporal Keypoints

no code implementations • ICCV 2023 • Borui Jiang, Yang Jin, Zhentao Tan, Yadong Mu

Video action segmentation refers to the task of densely casting each video frame or short segment in an untrimmed video into some pre-specified action categories.

Action Segmentation Graph Matching +1

UIA-ViT: Unsupervised Inconsistency-Aware Method based on Vision Transformer for Face Forgery Detection

no code implementations • 23 Oct 2022 • Wanyi Zhuang, Qi Chu, Zhentao Tan, Qiankun Liu, Haojie Yuan, Changtao Miao, Zixiang Luo, Nenghai Yu

UPCL is designed for learning the consistency-related representation with progressively optimized pseudo annotations.

Representation Learning

Reduce Information Loss in Transformers for Pluralistic Image Inpainting

1 code implementation • CVPR 2022 • Qiankun Liu, Zhentao Tan, Dongdong Chen, Qi Chu, Xiyang Dai, Yinpeng Chen, Mengchen Liu, Lu Yuan, Nenghai Yu

The indices of quantized pixels are used as tokens for the inputs and prediction targets of the transformer.

Ranked #6 on Seeing Beyond the Visible on KITTI360-EX

Image Inpainting Quantization +1

HairCLIP: Design Your Hair by Text and Reference Image

1 code implementation • CVPR 2022 • Tianyi Wei, Dongdong Chen, Wenbo Zhou, Jing Liao, Zhentao Tan, Lu Yuan, Weiming Zhang, Nenghai Yu

Hair editing is an interesting and challenging problem in computer vision and graphics.

Attribute

Diverse Semantic Image Synthesis via Probability Distribution Modeling

1 code implementation • CVPR 2021 • Zhentao Tan, Menglei Chai, Dongdong Chen, Jing Liao, Qi Chu, Bin Liu, Gang Hua, Nenghai Yu

In this paper, we propose a novel diverse semantic image synthesis framework from the perspective of semantic class distributions, which naturally supports diverse generation at semantic or even instance level.

Ranked #1 on Image-to-Image Translation on ADE20K Labels-to-Photos (LPIPS metric)

Image-to-Image Translation

Efficient Semantic Image Synthesis via Class-Adaptive Normalization

1 code implementation • 8 Dec 2020 • Zhentao Tan, Dongdong Chen, Qi Chu, Menglei Chai, Jing Liao, Mingming He, Lu Yuan, Gang Hua, Nenghai Yu

Spatially-adaptive normalization (SPADE) has recently been remarkably successful in conditional semantic image synthesis; it modulates the normalized activation with spatially-varying transformations learned from semantic layouts to prevent the semantic information from being washed away.

Image Generation

MichiGAN: Multi-Input-Conditioned Hair Image Generation for Portrait Editing

1 code implementation • 30 Oct 2020 • Zhentao Tan, Menglei Chai, Dongdong Chen, Jing Liao, Qi Chu, Lu Yuan, Sergey Tulyakov, Nenghai Yu

In this paper, we present MichiGAN (Multi-Input-Conditioned Hair Image GAN), a novel conditional image generation method for interactive portrait hair manipulation.

Conditional Image Generation

Rethinking Spatially-Adaptive Normalization

no code implementations • 6 Apr 2020 • Zhentao Tan, Dongdong Chen, Qi Chu, Menglei Chai, Jing Liao, Mingming He, Lu Yuan, Nenghai Yu

Despite its impressive performance, a more thorough understanding of the true advantages inside the box is still highly demanded, to help reduce the significant computation and parameter overheads introduced by these new structures.

Image Generation
