X2Face: A network for controlling face generation using images, audio, and pose codes

The objective of this paper is a neural network model that controls the pose and expression of a given face, using another face or modality (e.g. audio). This model can then be used for lightweight, sophisticated video and image editing...

Task                     Dataset                        Model   Metric name  Metric value  Global rank
Talking Head Generation  VoxCeleb1 - 1-shot learning    X2Face  FID          45.8          #2
Talking Head Generation  VoxCeleb1 - 32-shot learning   X2Face  FID          56.5          #2
Talking Head Generation  VoxCeleb1 - 8-shot learning    X2Face  FID          51.5          #2
