Search Results for author: Sergey Tulyakov

Found 47 papers, 25 papers with code

InfiniCity: Infinite-Scale City Synthesis

no code implementations 23 Jan 2023 Chieh Hubert Lin, Hsin-Ying Lee, Willi Menapace, Menglei Chai, Aliaksandr Siarohin, Ming-Hsuan Yang, Sergey Tulyakov

Toward infinite-scale 3D city synthesis, we propose a novel framework, InfiniCity, which constructs and renders an unconstrainedly large and 3D-grounded environment from random noises.

Image Generation Neural Rendering

3DAvatarGAN: Bridging Domains for Personalized Editable Avatars

no code implementations 6 Jan 2023 Rameen Abdal, Hsin-Ying Lee, Peihao Zhu, Menglei Chai, Aliaksandr Siarohin, Peter Wonka, Sergey Tulyakov

Finally, we propose a novel inversion method for 3D-GANs linking the latent spaces of the source and the target domains.

DisCoScene: Spatially Disentangled Generative Radiance Fields for Controllable 3D-aware Scene Synthesis

no code implementations 22 Dec 2022 Yinghao Xu, Menglei Chai, Zifan Shi, Sida Peng, Ivan Skorokhodov, Aliaksandr Siarohin, Ceyuan Yang, Yujun Shen, Hsin-Ying Lee, Bolei Zhou, Sergey Tulyakov

Existing 3D-aware image synthesis approaches mainly focus on generating a single canonical object and show limited capacity in composing a complex scene containing a variety of objects.

3D-Aware Image Synthesis

Real-Time Neural Light Field on Mobile Devices

no code implementations 15 Dec 2022 Junli Cao, Huan Wang, Pavlo Chemerys, Vladislav Shakhrai, Ju Hu, Yun Fu, Denys Makoviichuk, Sergey Tulyakov, Jian Ren

Nevertheless, to reach a similar rendering quality as NeRF, the network in NeLF is designed with intensive computation, which is not mobile-friendly.

Neural Rendering Novel View Synthesis

Rethinking Vision Transformers for MobileNet Size and Speed

2 code implementations 15 Dec 2022 Yanyu Li, Ju Hu, Yang Wen, Georgios Evangelidis, Kamyar Salahi, Yanzhi Wang, Sergey Tulyakov, Jian Ren

With the success of Vision Transformers (ViTs) in computer vision tasks, recent arts try to optimize the performance and complexity of ViTs to enable efficient deployment on mobile devices.

LADIS: Language Disentanglement for 3D Shape Editing

no code implementations 9 Dec 2022 Ian Huang, Panos Achlioptas, Tianyi Zhang, Sergey Tulyakov, Minhyuk Sung, Leonidas Guibas

Additionally, to measure edit locality, we define a new metric that we call part-wise edit precision.

Disentanglement

SDFusion: Multimodal 3D Shape Completion, Reconstruction, and Generation

1 code implementation 8 Dec 2022 Yen-Chi Cheng, Hsin-Ying Lee, Sergey Tulyakov, Alexander Schwing, LiangYan Gui

To enable interactive generation, our method supports a variety of input modalities that can be easily provided by a human, including images, text, partially observed shapes and combinations of these, further allowing to adjust the strength of each input.

3D Reconstruction 3D Shape Generation +2

Make-A-Story: Visual Memory Conditioned Consistent Story Generation

no code implementations 23 Nov 2022 Tanzila Rahman, Hsin-Ying Lee, Jian Ren, Sergey Tulyakov, Shweta Mahajan, Leonid Sigal

Our experiments for story generation on the MUGEN, the PororoSV and the FlintstonesSV dataset show that our method not only outperforms prior state-of-the-art in generating frames with high visual quality, which are consistent with the story, but also models appropriate correspondences between the characters and the background.

Story Generation Story Visualization

Affection: Learning Affective Explanations for Real-World Visual Data

no code implementations 4 Oct 2022 Panos Achlioptas, Maks Ovsjanikov, Leonidas Guibas, Sergey Tulyakov

To embark on this journey, we introduce and share with the research community a large-scale dataset that contains emotional reactions and free-form textual explanations for 85,007 publicly available images, analyzed by 6,283 annotators who were asked to indicate and explain how and why they felt in a particular way when observing a specific image, producing a total of 526,749 responses.

Layer Freezing & Data Sieving: Missing Pieces of a Generic Framework for Sparse Training

1 code implementation 22 Sep 2022 Geng Yuan, Yanyu Li, Sheng Li, Zhenglun Kong, Sergey Tulyakov, Xulong Tang, Yanzhi Wang, Jian Ren

Therefore, we analyze the feasibility and potentiality of using the layer freezing technique in sparse training and find it has the potential to save considerable training costs.

Cross-Modal 3D Shape Generation and Manipulation

no code implementations 24 Jul 2022 Zezhou Cheng, Menglei Chai, Jian Ren, Hsin-Ying Lee, Kyle Olszewski, Zeng Huang, Subhransu Maji, Sergey Tulyakov

In this paper, we propose a generic multi-modal generative model that couples the 2D modalities and implicit 3D representations through shared latent spaces.

3D Shape Generation

EpiGRAF: Rethinking training of 3D GANs

1 code implementation 21 Jun 2022 Ivan Skorokhodov, Sergey Tulyakov, Yiqun Wang, Peter Wonka

In this work, we show that it is possible to obtain a high-resolution 3D generator with SotA image quality by following a completely different route of simply training the model patch-wise.

3D-Aware Image Synthesis

Discrete Contrastive Diffusion for Cross-Modal and Conditional Generation

1 code implementation 15 Jun 2022 Ye Zhu, Yu Wu, Kyle Olszewski, Jian Ren, Sergey Tulyakov, Yan Yan

To this end, we introduce a Conditional Discrete Contrastive Diffusion (CDCD) loss and design two contrastive diffusion mechanisms to effectively incorporate it into the denoising process.

Contrastive Learning Denoising +2

EfficientFormer: Vision Transformers at MobileNet Speed

4 code implementations 2 Jun 2022 Yanyu Li, Geng Yuan, Yang Wen, Ju Hu, Georgios Evangelidis, Sergey Tulyakov, Yanzhi Wang, Jian Ren

Our work proves that properly designed transformers can reach extremely low latency on mobile devices while maintaining high performance.

Control-NeRF: Editable Feature Volumes for Scene Rendering and Manipulation

no code implementations 22 Apr 2022 Verica Lazova, Vladimir Guzov, Kyle Olszewski, Sergey Tulyakov, Gerard Pons-Moll

With the aim of obtaining interpretable and controllable scene representations, our model couples learnt scene-specific feature volumes with a scene agnostic neural rendering network.

Neural Rendering Novel View Synthesis

Quantized GAN for Complex Music Generation from Dance Videos

1 code implementation 1 Apr 2022 Ye Zhu, Kyle Olszewski, Yu Wu, Panos Achlioptas, Menglei Chai, Yan Yan, Sergey Tulyakov

We present Dance2Music-GAN (D2M-GAN), a novel adversarial multi-modal framework that generates complex musical samples conditioned on dance videos.

Music Generation

R2L: Distilling Neural Radiance Field to Neural Light Field for Efficient Novel View Synthesis

1 code implementation 31 Mar 2022 Huan Wang, Jian Ren, Zeng Huang, Kyle Olszewski, Menglei Chai, Yun Fu, Sergey Tulyakov

On the other hand, Neural Light Field (NeLF) presents a more straightforward representation over NeRF in novel view synthesis -- the rendering of a pixel amounts to one single forward pass without ray-marching.

Novel View Synthesis

Show Me What and Tell Me How: Video Synthesis via Multimodal Conditioning

1 code implementation CVPR 2022 Ligong Han, Jian Ren, Hsin-Ying Lee, Francesco Barbieri, Kyle Olszewski, Shervin Minaee, Dimitris Metaxas, Sergey Tulyakov

In addition, our model can extract visual information as suggested by the text prompt, e.g., "an object in image one is moving northeast", and generate corresponding videos.

Self-Learning Text Augmentation +1

F8Net: Fixed-Point 8-bit Only Multiplication for Network Quantization

1 code implementation ICLR 2022 Qing Jin, Jian Ren, Richard Zhuang, Sumant Hanumante, Zhengang Li, Zhiyu Chen, Yanzhi Wang, Kaiyuan Yang, Sergey Tulyakov

Our approach achieves comparable and better performance, when compared not only to existing quantization techniques with INT32 multiplication or floating-point arithmetic, but also to the full-precision counterparts, achieving state-of-the-art performance.

Quantization

NeROIC: Neural Rendering of Objects from Online Image Collections

1 code implementation 7 Jan 2022 Zhengfei Kuang, Kyle Olszewski, Menglei Chai, Zeng Huang, Panos Achlioptas, Sergey Tulyakov

We present a novel method to acquire object representations from online image collections, capturing high-quality geometry and material properties of arbitrary objects from photographs with varying cameras, illumination, and backgrounds.

Neural Rendering Novel View Synthesis

InOut: Diverse Image Outpainting via GAN Inversion

no code implementations CVPR 2022 Yen-Chi Cheng, Chieh Hubert Lin, Hsin-Ying Lee, Jian Ren, Sergey Tulyakov, Ming-Hsuan Yang

Existing image outpainting methods pose the problem as a conditional image-to-image translation task, often generating repetitive structures and textures by replicating the content available in the input image.

Image Outpainting Image-to-Image Translation

StyleGAN-V: A Continuous Video Generator with the Price, Image Quality and Perks of StyleGAN2

1 code implementation CVPR 2022 Ivan Skorokhodov, Sergey Tulyakov, Mohamed Elhoseiny

We build our model on top of StyleGAN2 and it is just ${\approx}5\%$ more expensive to train at the same resolution while achieving almost the same image quality.

Motion Representations for Articulated Animation

2 code implementations CVPR 2021 Aliaksandr Siarohin, Oliver J. Woodford, Jian Ren, Menglei Chai, Sergey Tulyakov

To facilitate animation and prevent the leakage of the shape of the driving object, we disentangle shape and pose of objects in the region space.

Video Reconstruction

In&Out : Diverse Image Outpainting via GAN Inversion

no code implementations 1 Apr 2021 Yen-Chi Cheng, Chieh Hubert Lin, Hsin-Ying Lee, Jian Ren, Sergey Tulyakov, Ming-Hsuan Yang

Existing image outpainting methods pose the problem as a conditional image-to-image translation task, often generating repetitive structures and textures by replicating the content available in the input image.

Image Outpainting Image-to-Image Translation +1

SMIL: Multimodal Learning with Severely Missing Modality

1 code implementation 9 Mar 2021 Mengmeng Ma, Jian Ren, Long Zhao, Sergey Tulyakov, Cathy Wu, Xi Peng

A common assumption in multimodal learning is the completeness of training data, i.e., full modalities are available in all training examples.

Meta-Learning

Teachers Do More Than Teach: Compressing Image-to-Image Models

1 code implementation CVPR 2021 Qing Jin, Jian Ren, Oliver J. Woodford, Jiazhuo Wang, Geng Yuan, Yanzhi Wang, Sergey Tulyakov

In this work, we aim to address these issues by introducing a teacher network that provides a search space in which efficient network architectures can be found, in addition to performing knowledge distillation.

Knowledge Distillation

MichiGAN: Multi-Input-Conditioned Hair Image Generation for Portrait Editing

1 code implementation 30 Oct 2020 Zhentao Tan, Menglei Chai, Dongdong Chen, Jing Liao, Qi Chu, Lu Yuan, Sergey Tulyakov, Nenghai Yu

In this paper, we present MichiGAN (Multi-Input-Conditioned Hair Image GAN), a novel conditional image generation method for interactive portrait hair manipulation.

Conditional Image Generation

Interactive Video Stylization Using Few-Shot Patch-Based Training

2 code implementations 29 Apr 2020 Ondřej Texler, David Futschik, Michal Kučera, Ondřej Jamriška, Šárka Sochorová, Menglei Chai, Sergey Tulyakov, Daniel Sýkora

In this paper, we present a learning-based method to the keyframe-based video stylization that allows an artist to propagate the style from a few selected keyframes to the rest of the sequence.

Style Transfer Translation +1

Neural Hair Rendering

no code implementations ECCV 2020 Menglei Chai, Jian Ren, Sergey Tulyakov

Unlike existing supervised translation methods that require model-level similarity to preserve consistent structure representation for both real images and fake renderings, our method adopts an unsupervised solution to work on arbitrary hair models.

Translation

Human Motion Transfer from Poses in the Wild

no code implementations 7 Apr 2020 Jian Ren, Menglei Chai, Sergey Tulyakov, Chen Fang, Xiaohui Shen, Jianchao Yang

In this paper, we tackle the problem of human motion transfer, where we synthesize novel motion video for a target person that imitates the movement from a reference video.

Translation

Motion-supervised Co-Part Segmentation

2 code implementations 7 Apr 2020 Aliaksandr Siarohin, Subhankar Roy, Stéphane Lathuilière, Sergey Tulyakov, Elisa Ricci, Nicu Sebe

To overcome this limitation, we propose a self-supervised deep learning method for co-part segmentation.

Task-Assisted Domain Adaptation with Anchor Tasks

no code implementations 16 Aug 2019 Zhizhong Li, Linjie Luo, Sergey Tulyakov, Qieyun Dai, Derek Hoiem

Our key idea to improve domain adaptation is to introduce a separate anchor task (such as facial landmarks) whose annotations can be obtained at no cost or are already available on both synthetic and real datasets.

Depth Estimation Domain Adaptation +2

Transformable Bottleneck Networks

1 code implementation ICCV 2019 Kyle Olszewski, Sergey Tulyakov, Oliver Woodford, Hao Li, Linjie Luo

We propose a novel approach to performing fine-grained 3D manipulation of image content via a convolutional neural network, which we call the Transformable Bottleneck Network (TBN).

3D Reconstruction Novel View Synthesis

Train One Get One Free: Partially Supervised Neural Network for Bug Report Duplicate Detection and Clustering

no code implementations NAACL 2019 Lahari Poddar, Leonardo Neves, William Brendel, Luis Marujo, Sergey Tulyakov, Pradeep Karuturi

Leveraging the assumption that learning the topic of a bug is a sub-task for detecting duplicates, we design a loss function that can jointly perform both tasks but needs supervision for only duplicate classification, achieving topic clustering in an unsupervised fashion.

General Classification

3D Guided Fine-Grained Face Manipulation

no code implementations CVPR 2019 Zhenglin Geng, Chen Cao, Sergey Tulyakov

This is achieved by first fitting a 3D face model and then disentangling the face into a texture and a shape.

Face Model

Hybrid VAE: Improving Deep Generative Models using Partial Observations

no code implementations 30 Nov 2017 Sergey Tulyakov, Andrew Fitzgibbon, Sebastian Nowozin

We show that such a combination is beneficial because the unlabeled data acts as a data-driven form of regularization, allowing generative models trained on few labeled samples to reach the performance of fully-supervised generative models trained on much larger datasets.

Self-Adaptive Matrix Completion for Heart Rate Estimation From Face Videos Under Realistic Conditions

no code implementations CVPR 2016 Sergey Tulyakov, Xavier Alameda-Pineda, Elisa Ricci, Lijun Yin, Jeffrey F. Cohn, Nicu Sebe

Recent studies in computer vision have shown that, while practically invisible to a human observer, skin color changes due to blood flow can be captured on face videos and, surprisingly, be used to estimate the heart rate (HR).

Heart rate estimation Matrix Completion

Regressing a 3D Face Shape From a Single Image

no code implementations ICCV 2015 Sergey Tulyakov, Nicu Sebe

To support the ability of our method to reliably reconstruct 3D shapes, we introduce a simple method for head pose estimation using a single image that reaches higher accuracy than the state of the art.

Head Pose Estimation
