Search Results for author: Aliaksandr Siarohin

Found 38 papers, 20 papers with code

Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers

no code implementations 29 Feb 2024 Tsai-Shien Chen, Aliaksandr Siarohin, Willi Menapace, Ekaterina Deyneka, Hsiang-wei Chao, Byung Eun Jeon, Yuwei Fang, Hsin-Ying Lee, Jian Ren, Ming-Hsuan Yang, Sergey Tulyakov

Next, we finetune a retrieval model on a small subset where the best caption of each video is manually selected and then employ the model in the whole dataset to select the best caption as the annotation.

Retrieval Text Retrieval +3

Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis

no code implementations 22 Feb 2024 Willi Menapace, Aliaksandr Siarohin, Ivan Skorokhodov, Ekaterina Deyneka, Tsai-Shien Chen, Anil Kag, Yuwei Fang, Aleksei Stoliar, Elisa Ricci, Jian Ren, Sergey Tulyakov

Since video content is highly redundant, we argue that naively bringing advances of image models to the video generation domain reduces motion fidelity, visual quality and impairs scalability.

Image Generation Text-to-Video Generation +1

HyperHuman: Hyper-Realistic Human Generation with Latent Structural Diffusion

no code implementations 12 Oct 2023 Xian Liu, Jian Ren, Aliaksandr Siarohin, Ivan Skorokhodov, Yanyu Li, Dahua Lin, Xihui Liu, Ziwei Liu, Sergey Tulyakov

Our model enforces the joint learning of image appearance, spatial relationship, and geometry in a unified network, where each branch in the model complements the others with both structural awareness and textural richness.

Image Generation

AutoDecoding Latent 3D Diffusion Models

1 code implementation NeurIPS 2023 Evangelos Ntavelis, Aliaksandr Siarohin, Kyle Olszewski, Chaoyang Wang, Luc van Gool, Sergey Tulyakov

We present a novel approach to the generation of static and articulated 3D assets that has a 3D autodecoder at its core.

Text-Guided Synthesis of Eulerian Cinemagraphs

1 code implementation 6 Jul 2023 Aniruddha Mahapatra, Aliaksandr Siarohin, Hsin-Ying Lee, Sergey Tulyakov, Jun-Yan Zhu

We introduce Text2Cinemagraph, a fully automated method for creating cinemagraphs from text descriptions - an especially challenging task when prompts feature imaginary elements and artistic styles, given the complexity of interpreting the semantics and motions of these images.

Image Animation

Magic123: One Image to High-Quality 3D Object Generation Using Both 2D and 3D Diffusion Priors

1 code implementation 30 Jun 2023 Guocheng Qian, Jinjie Mai, Abdullah Hamdi, Jian Ren, Aliaksandr Siarohin, Bing Li, Hsin-Ying Lee, Ivan Skorokhodov, Peter Wonka, Sergey Tulyakov, Bernard Ghanem

We present Magic123, a two-stage coarse-to-fine approach for high-quality, textured 3D mesh generation from a single unposed image in the wild using both 2D and 3D priors.

Image to 3D

Promptable Game Models: Text-Guided Game Simulation via Masked Diffusion Models

no code implementations 23 Mar 2023 Willi Menapace, Aliaksandr Siarohin, Stéphane Lathuilière, Panos Achlioptas, Vladislav Golyanik, Sergey Tulyakov, Elisa Ricci

Most captivatingly, our PGM unlocks the director's mode, where the game is played by specifying goals for the agents in the form of a prompt.


3D generation on ImageNet

no code implementations 2 Mar 2023 Ivan Skorokhodov, Aliaksandr Siarohin, Yinghao Xu, Jian Ren, Hsin-Ying Lee, Peter Wonka, Sergey Tulyakov

Existing 3D-from-2D generators are typically designed for well-curated single-category datasets, where all the objects have (approximately) the same scale, 3D location, and orientation, and the camera always points to the center of the scene.

3D Generation

Invertible Neural Skinning

no code implementations CVPR 2023 Yash Kant, Aliaksandr Siarohin, Riza Alp Guler, Menglei Chai, Jian Ren, Sergey Tulyakov, Igor Gilitschenski

Next, we combine PIN with a differentiable LBS module to build an expressive and end-to-end Invertible Neural Skinning (INS) pipeline.

InfiniCity: Infinite-Scale City Synthesis

no code implementations ICCV 2023 Chieh Hubert Lin, Hsin-Ying Lee, Willi Menapace, Menglei Chai, Aliaksandr Siarohin, Ming-Hsuan Yang, Sergey Tulyakov

Toward infinite-scale 3D city synthesis, we propose a novel framework, InfiniCity, which constructs and renders an unconstrainedly large and 3D-grounded environment from random noises.

Image Generation Neural Rendering

3DAvatarGAN: Bridging Domains for Personalized Editable Avatars

no code implementations CVPR 2023 Rameen Abdal, Hsin-Ying Lee, Peihao Zhu, Menglei Chai, Aliaksandr Siarohin, Peter Wonka, Sergey Tulyakov

Finally, we propose a novel inversion method for 3D-GANs linking the latent spaces of the source and the target domains.

Training and Tuning Generative Neural Radiance Fields for Attribute-Conditional 3D-Aware Face Generation

1 code implementation 26 Aug 2022 Jichao Zhang, Aliaksandr Siarohin, Yahui Liu, Hao Tang, Nicu Sebe, Wei Wang

Generative Neural Radiance Fields (GNeRF) based 3D-aware GANs have demonstrated remarkable capabilities in generating high-quality images while maintaining strong 3D consistency.

Attribute Disentanglement +2

3D-Aware Semantic-Guided Generative Model for Human Synthesis

1 code implementation 2 Dec 2021 Jichao Zhang, Enver Sangineto, Hao Tang, Aliaksandr Siarohin, Zhun Zhong, Nicu Sebe, Wei Wang

However, they usually struggle to generate high-quality images representing non-rigid objects, such as the human body, which is of great interest for many computer graphics applications.

3D-Aware Image Synthesis

Controllable Person Image Synthesis with Spatially-Adaptive Warped Normalization

1 code implementation 31 May 2021 Jichao Zhang, Aliaksandr Siarohin, Hao Tang, Enver Sangineto, Wei Wang, Humphrey Sh, Nicu Sebe

Moreover, we propose a novel Self-Training Part Replacement (STPR) strategy to refine the model for the texture-transfer task, which improves the quality of the generated clothes and the preservation ability of non-target regions.

Image-to-Image Translation Pose Transfer +1

Motion Representations for Articulated Animation

2 code implementations CVPR 2021 Aliaksandr Siarohin, Oliver J. Woodford, Jian Ren, Menglei Chai, Sergey Tulyakov

To facilitate animation and prevent the leakage of the shape of the driving object, we disentangle shape and pose of objects in the region space.

Object Video Reconstruction

Whitening for Self-Supervised Representation Learning

8 code implementations 13 Jul 2020 Aleksandr Ermolov, Aliaksandr Siarohin, Enver Sangineto, Nicu Sebe

Most of the current self-supervised representation learning (SSL) methods are based on the contrastive loss and the instance-discrimination task, where augmented versions of the same image instance ("positives") are contrasted with instances extracted from other images ("negatives").

Representation Learning Self-Supervised Learning

DwNet: Dense warp-based network for pose-guided human video generation

2 code implementations 21 Oct 2019 Polina Zablotskaia, Aliaksandr Siarohin, Bo Zhao, Leonid Sigal

In this paper, we focus on human motion transfer - generation of a video depicting a particular subject, observed in a single image, performing a series of motions exemplified by an auxiliary (driving) video.

Video Generation

Attention-based Fusion for Multi-source Human Image Generation

no code implementations 7 May 2019 Stéphane Lathuilière, Enver Sangineto, Aliaksandr Siarohin, Nicu Sebe

We present a generalization of the person-image generation task, in which a human image is generated conditioned on a target pose and a set X of source appearance images.

Image Generation

Whitening and Coloring transform for GANs

no code implementations ICLR 2019 Aliaksandr Siarohin, Enver Sangineto, Nicu Sebe

In this paper we propose to generalize both BN and cBN using a Whitening and Coloring based batch normalization.

Appearance and Pose-Conditioned Human Image Generation using Deformable GANs

1 code implementation 30 Apr 2019 Aliaksandr Siarohin, Stéphane Lathuilière, Enver Sangineto, Nicu Sebe

Specifically, given an image xa of a person and a target pose P(xb), extracted from a different image xb, we synthesize a new image of that person in pose P(xb), while preserving the visual details in xa.

Data Augmentation Generative Adversarial Network +2

Enhancing Perceptual Attributes with Bayesian Style Generation

1 code implementation 3 Dec 2018 Aliaksandr Siarohin, Gloria Zen, Nicu Sebe, Elisa Ricci

Our approach takes as input a natural image and exploits recent models for deep style transfer and generative adversarial networks to change its style in order to modify a specific high-level attribute.

Attribute Style Transfer

Whitening and Coloring batch transform for GANs

1 code implementation ICLR 2019 Aliaksandr Siarohin, Enver Sangineto, Nicu Sebe

In this paper we propose to generalize both BN and cBN using a Whitening and Coloring based batch normalization.

Image Generation

How to Make an Image More Memorable? A Deep Style Transfer Approach

1 code implementation 6 Apr 2017 Aliaksandr Siarohin, Gloria Zen, Cveta Majtanovic, Xavier Alameda-Pineda, Elisa Ricci, Nicu Sebe

In this work, we show that it is possible to automatically retrieve the best style seeds for a given image, thus remarkably reducing the number of human attempts needed to find a good match.

Image Generation Style Transfer
