Search Results for author: Chaoyou Fu

Found 23 papers, 12 papers with code

Aligning and Prompting Everything All at Once for Universal Visual Perception

1 code implementation 4 Dec 2023 Yunhang Shen, Chaoyou Fu, Peixian Chen, Mengdan Zhang, Ke Li, Xing Sun, Yunsheng Wu, Shaohui Lin, Rongrong Ji

However, predominant paradigms, driven by casting instance-level tasks as an object-word alignment, bring heavy cross-modality interaction, which is not effective in prompting object detection and visual grounding.

Object object-detection +6

Woodpecker: Hallucination Correction for Multimodal Large Language Models

1 code implementation 24 Oct 2023 Shukang Yin, Chaoyou Fu, Sirui Zhao, Tong Xu, Hao Wang, Dianbo Sui, Yunhang Shen, Ke Li, Xing Sun, Enhong Chen

Hallucination is a big shadow hanging over the rapidly evolving Multimodal Large Language Models (MLLMs), referring to the phenomenon that the generated text is inconsistent with the image content.

Hallucination

Audio-Driven Dubbing for User Generated Contents via Style-Aware Semi-Parametric Synthesis

no code implementations 31 Aug 2023 Linsen Song, Wayne Wu, Chaoyou Fu, Chen Change Loy, Ran He

Existing automated dubbing methods are usually designed for Professionally Generated Content (PGC) production, which requires massive training data and training time to learn a person-specific audio-video mapping.

MME: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models

3 code implementations 23 Jun 2023 Chaoyou Fu, Peixian Chen, Yunhang Shen, Yulei Qin, Mengdan Zhang, Xu Lin, Jinrui Yang, Xiawu Zheng, Ke Li, Xing Sun, Yunsheng Wu, Rongrong Ji

Multimodal Large Language Model (MLLM) relies on the powerful LLM to perform multimodal tasks, showing amazing emergent abilities in recent studies, such as writing poems based on an image.

Benchmarking Language Modelling +3

A Survey on Multimodal Large Language Models

1 code implementation 23 Jun 2023 Shukang Yin, Chaoyou Fu, Sirui Zhao, Ke Li, Xing Sun, Tong Xu, Enhong Chen

Multimodal Large Language Model (MLLM) recently has been a new rising research hotspot, which uses powerful Large Language Models (LLMs) as a brain to perform multimodal tasks.

In-Context Learning Language Modelling +4

Multi-modal Queried Object Detection in the Wild

1 code implementation NeurIPS 2023 Yifan Xu, Mengdan Zhang, Chaoyou Fu, Peixian Chen, Xiaoshan Yang, Ke Li, Changsheng Xu

To address the learning inertia problem brought by the frozen detector, a vision conditioned masked language prediction strategy is proposed.

Few-Shot Object Detection Object +2

Heterogeneous Face Recognition via Face Synthesis with Identity-Attribute Disentanglement

no code implementations 10 Jun 2022 Ziming Yang, Jian Liang, Chaoyou Fu, Mandi Luo, Xiao-Yu Zhang

Secondly, we devise a face synthesis module (FSM) to generate a large number of images with stochastic combinations of disentangled identities and attributes for enriching the attribute diversity of synthetic images.

Attribute Data Augmentation +4

Causal Representation Learning for Context-Aware Face Transfer

no code implementations 4 Oct 2021 Gege Gao, Huaibo Huang, Chaoyou Fu, Ran He

Human face synthesis involves transferring knowledge about the identity and identity-dependent face shape (IDFS) of a human face to target face images where the context (e.g., facial expressions, head poses, and other background factors) may change dramatically.

counterfactual Counterfactual Inference +4

Pareidolia Face Reenactment

no code implementations CVPR 2021 Linsen Song, Wayne Wu, Chaoyou Fu, Chen Qian, Chen Change Loy, Ran He

We present a new application direction named Pareidolia Face Reenactment, which is defined as animating a static illusory face to move in tandem with a human face in the video.

Face Reenactment Texture Synthesis

Information Bottleneck Disentanglement for Identity Swapping

1 code implementation CVPR 2021 Gege Gao, Huaibo Huang, Chaoyou Fu, Zhaoyang Li, Ran He

In this work, we propose a novel information disentangling and swapping network, called InfoSwap, to extract the most expressive information for identity representation from a pre-trained face recognition model.

Disentanglement Face Recognition +1

Everything's Talkin': Pareidolia Face Reenactment

1 code implementation 7 Apr 2021 Linsen Song, Wayne Wu, Chaoyou Fu, Chen Qian, Chen Change Loy, Ran He

We present a new application direction named Pareidolia Face Reenactment, which is defined as animating a static illusory face to move in tandem with a human face in the video.

Face Reenactment Texture Synthesis

CM-NAS: Cross-Modality Neural Architecture Search for Visible-Infrared Person Re-Identification

1 code implementation ICCV 2021 Chaoyou Fu, Yibo Hu, Xiang Wu, Hailin Shi, Tao Mei, Ran He

Visible-Infrared person re-identification (VI-ReID) aims to match cross-modality pedestrian images, breaking through the limitation of single-modality person ReID in dark environment.

Neural Architecture Search Person Re-Identification

AOT: Appearance Optimal Transport Based Identity Swapping for Forgery Detection

no code implementations NeurIPS 2020 Hao Zhu, Chaoyou Fu, Qianyi Wu, Wayne Wu, Chen Qian, Ran He

However, due to the lack of Deepfakes datasets with large variance in appearance, which can be hardly produced by recent identity swapping methods, the detection algorithm may fail in this situation.

DVG-Face: Dual Variational Generation for Heterogeneous Face Recognition

1 code implementation 20 Sep 2020 Chaoyou Fu, Xiang Wu, Yibo Hu, Huaibo Huang, Ran He

As a consequence, massive new diverse paired heterogeneous images with the same identity can be generated from noises.

Contrastive Learning Face Recognition +1

Deep Momentum Uncertainty Hashing

no code implementations 17 Sep 2020 Chaoyou Fu, Guoli Wang, Xiang Wu, Qian Zhang, Ran He

It embodies the uncertainty of the hashing network to the corresponding input image.

Combinatorial Optimization Deep Hashing

Dual Variational Generation for Low Shot Heterogeneous Face Recognition

no code implementations NeurIPS 2019 Chaoyou Fu, Xiang Wu, Yibo Hu, Huaibo Huang, Ran He

Specifically, we first introduce a dual variational autoencoder to represent a joint distribution of paired heterogeneous images.

Face Recognition Heterogeneous Face Recognition

Cross-Spectral Face Hallucination via Disentangling Independent Factors

no code implementations CVPR 2020 Boyan Duan, Chaoyou Fu, Yi Li, Xingguang Song, Ran He

The cross-sensor gap is one of the challenges that have attracted much research interest in Heterogeneous Face Recognition (HFR).

Face Alignment Face Hallucination +3

High Fidelity Face Manipulation with Extreme Poses and Expressions

no code implementations 28 Mar 2019 Chaoyou Fu, Yibo Hu, Xiang Wu, Guoli Wang, Qian Zhang, Ran He

Furthermore, due to the lack of high-resolution face manipulation databases to verify the effectiveness of our method, we collect a new high-quality Multi-View Face (MVF-HQ) database.

Face Generation Face Recognition +1

Dual Variational Generation for Low-Shot Heterogeneous Face Recognition

1 code implementation 25 Mar 2019 Chaoyou Fu, Xiang Wu, Yibo Hu, Huaibo Huang, Ran He

Then, in order to ensure the identity consistency of the generated paired heterogeneous images, we impose a distribution alignment in the latent space and a pairwise identity preserving in the image space.

Face Recognition Heterogeneous Face Recognition
