Search Results for author: Yaser Sheikh

Found 49 papers, 20 papers with code

RelightableHands: Efficient Neural Relighting of Articulated Hand Models

no code implementations CVPR 2023 Shun Iwase, Shunsuke Saito, Tomas Simon, Stephen Lombardi, Timur Bagautdinov, Rohan Joshi, Fabian Prada, Takaaki Shiratori, Yaser Sheikh, Jason Saragih

To achieve generalization, we condition the student model with physics-inspired illumination features such as visibility, diffuse shading, and specular reflections computed on a coarse proxy geometry, maintaining a small computational overhead.
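The physics-inspired illumination features the abstract names lend themselves to a small sketch. Below is a minimal, hedged illustration of two such features, Lambertian diffuse shading and Blinn-Phong specular response, computed per vertex on a coarse proxy; the function name, array shapes, and the shininess parameter are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def shading_features(normals, light_dir, view_dir, shininess=32.0):
    """Per-vertex diffuse (Lambert) and specular (Blinn-Phong) terms.

    normals: (N, 3) unit normals on the coarse proxy geometry.
    light_dir, view_dir: (3,) unit vectors toward the light / camera.
    Returns an (N, 2) feature array [diffuse, specular].
    """
    diffuse = np.clip(normals @ light_dir, 0.0, None)           # n . l
    half = light_dir + view_dir
    half = half / np.linalg.norm(half)                          # Blinn half vector
    specular = np.clip(normals @ half, 0.0, None) ** shininess  # (n . h)^s
    return np.stack([diffuse, specular], axis=-1)

# A normal facing both the light and the camera gets a full response.
n = np.array([[0.0, 0.0, 1.0]])
feats = shading_features(n, np.array([0.0, 0.0, 1.0]), np.array([0.0, 0.0, 1.0]))
```

Features like these are cheap to evaluate on a coarse mesh, which is consistent with the "small computational overhead" the abstract claims.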

Drivable Volumetric Avatars using Texel-Aligned Features

no code implementations 20 Jul 2022 Edoardo Remelli, Timur Bagautdinov, Shunsuke Saito, Tomas Simon, Chenglei Wu, Shih-En Wei, Kaiwen Guo, Zhe Cao, Fabian Prada, Jason Saragih, Yaser Sheikh

To circumvent this, we propose a novel volumetric avatar representation by extending mixtures of volumetric primitives to articulated objects.

Dressing Avatars: Deep Photorealistic Appearance for Physically Simulated Clothing

no code implementations 30 Jun 2022 Donglai Xiang, Timur Bagautdinov, Tuur Stuyck, Fabian Prada, Javier Romero, Weipeng Xu, Shunsuke Saito, Jingfan Guo, Breannan Smith, Takaaki Shiratori, Yaser Sheikh, Jessica Hodgins, Chenglei Wu

The key idea is to introduce a neural clothing appearance model that operates on top of explicit geometry: at training time we use high-fidelity tracking, whereas at animation time we rely on physically simulated geometry.

Driving-Signal Aware Full-Body Avatars

no code implementations 21 May 2021 Timur Bagautdinov, Chenglei Wu, Tomas Simon, Fabian Prada, Takaaki Shiratori, Shih-En Wei, Weipeng Xu, Yaser Sheikh, Jason Saragih

The core intuition behind our method is that better drivability and generalization can be achieved by disentangling the driving signals and remaining generative factors, which are not available during animation.


MeshTalk: 3D Face Animation from Speech using Cross-Modality Disentanglement

2 code implementations ICCV 2021 Alexander Richard, Michael Zollhoefer, Yandong Wen, Fernando de la Torre, Yaser Sheikh

To improve upon existing models, we propose a generic audio-driven facial animation approach that achieves highly realistic motion synthesis results for the entire face.

3D Face Animation, Disentanglement +1

Pixel Codec Avatars

1 code implementation CVPR 2021 Shugao Ma, Tomas Simon, Jason Saragih, Dawei Wang, Yuecheng Li, Fernando de la Torre, Yaser Sheikh

Telecommunication with photorealistic avatars in virtual or augmented reality is a promising path for achieving authentic face-to-face communication in 3D over remote physical distances.

High-fidelity Face Tracking for AR/VR via Deep Lighting Adaptation

no code implementations CVPR 2021 Lele Chen, Chen Cao, Fernando de la Torre, Jason Saragih, Chenliang Xu, Yaser Sheikh

This paper addresses these limitations by learning a deep lighting model that, in combination with a high-quality 3D face tracking algorithm, enables subtle and robust facial motion transfer from a regular video to a photo-realistic 3D avatar.

Vocal Bursts Intensity Prediction

Mixture of Volumetric Primitives for Efficient Neural Rendering

1 code implementation 2 Mar 2021 Stephen Lombardi, Tomas Simon, Gabriel Schwartz, Michael Zollhoefer, Yaser Sheikh, Jason Saragih

Real-time rendering and animation of humans is a core function in games, movies, and telepresence applications.

Neural Rendering

Supervision by Registration and Triangulation for Landmark Detection

1 code implementation 25 Jan 2021 Xuanyi Dong, Yi Yang, Shih-En Wei, Xinshuo Weng, Yaser Sheikh, Shoou-I Yu

End-to-end training is made possible by differentiable registration and 3D triangulation modules.

Optical Flow Estimation
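The differentiable 3D triangulation module mentioned in the abstract can be illustrated with classical DLT (direct linear transform) triangulation, whose SVD-based steps are differentiable when run in an autodiff framework. A minimal NumPy sketch; the function name and the two-camera setup are illustrative assumptions, not the paper's code:

```python
import numpy as np

def triangulate_dlt(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one point from two views.

    P1, P2: (3, 4) camera projection matrices.
    x1, x2: (2,) pixel observations of the same point.
    Returns the 3D point as a (3,) array.
    """
    A = np.stack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)        # null vector of A = homogeneous 3D point
    X = Vt[-1]
    return X[:3] / X[3]

# Two identity-intrinsics cameras observing the point (0, 0, 5).
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])                  # at the origin
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])  # shifted 1 unit
X = triangulate_dlt(P1, P2, np.array([0.0, 0.0]), np.array([-0.2, 0.0]))
```

Because every step is matrix algebra, replacing NumPy with an autodiff library lets landmark-detection gradients flow through the triangulation, which is the property the abstract highlights.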

Neural Synthesis of Binaural Audio

no code implementations ICLR 2021 Alexander Richard, Dejan Markovic, Israel D. Gebru, Steven Krenn, Gladstone Alexander Butler, Fernando de la Torre, Yaser Sheikh

We present a neural rendering approach for binaural sound synthesis that can produce realistic and spatially accurate binaural sound in realtime.

Neural Rendering

Expressive Telepresence via Modular Codec Avatars

no code implementations ECCV 2020 Hang Chu, Shugao Ma, Fernando de la Torre, Sanja Fidler, Yaser Sheikh

It is important to note that traditional person-specific CAs are learned from few training samples, and typically lack robustness and have limited expressiveness when transferring facial expressions.

Audio- and Gaze-driven Facial Animation of Codec Avatars

no code implementations 11 Aug 2020 Alexander Richard, Colin Lea, Shugao Ma, Juergen Gall, Fernando de la Torre, Yaser Sheikh

Codec Avatars are a recent class of learned, photorealistic face models that accurately represent the geometry and texture of a person in 3D (i.e., for virtual reality), and are almost indistinguishable from video.

Spatiotemporal Bundle Adjustment for Dynamic 3D Human Reconstruction in the Wild

no code implementations 24 Jul 2020 Minh Vo, Yaser Sheikh, Srinivasa G. Narasimhan

The triangulation constraint, however, is invalid for moving points captured in multiple unsynchronized videos, and bundle adjustment is not designed to estimate the temporal alignment between cameras.

3D Human Reconstruction
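As a toy illustration of the temporal-alignment problem the abstract raises, the sketch below estimates an integer frame offset between two unsynchronized 1-D tracks of the same moving point by brute-force search. This is a hedged stand-in only: the actual method estimates subframe alignment jointly with camera parameters and 3D trajectories, which this sketch does not attempt.

```python
import numpy as np

def estimate_frame_offset(track_a, track_b, max_shift=10):
    """Estimate the integer frame offset between two unsynchronized
    tracks of the same moving point (one value per frame).

    Tries every candidate shift in [-max_shift, max_shift] and keeps
    the one minimizing mean squared difference over the overlap.
    """
    best_shift, best_err = 0, np.inf
    for s in range(-max_shift, max_shift + 1):
        if s >= 0:
            a, b = track_a[s:], track_b[:len(track_b) - s]
        else:
            a, b = track_a[:len(track_a) + s], track_b[-s:]
        n = min(len(a), len(b))
        err = np.mean((a[:n] - b[:n]) ** 2)
        if err < best_err:
            best_shift, best_err = s, err
    return best_shift

t = np.arange(50, dtype=float)
sig = np.sin(0.3 * t)
offset = estimate_frame_offset(sig, sig[3:])  # camera B starts 3 frames late
```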

4D Visualization of Dynamic Events from Unconstrained Multi-View Videos

no code implementations CVPR 2020 Aayush Bansal, Minh Vo, Yaser Sheikh, Deva Ramanan, Srinivasa Narasimhan

We present a data-driven approach for 4D space-time visualization of dynamic events from videos captured by hand-held multiple cameras.

To React or not to React: End-to-End Visual Pose Forecasting for Personalized Avatar during Dyadic Conversations

3 code implementations 5 Oct 2019 Chaitanya Ahuja, Shugao Ma, Louis-Philippe Morency, Yaser Sheikh

In this paper, we introduce a neural architecture named Dyadic Residual-Attention Model (DRAM), which integrates intrapersonal (monadic) and interpersonal (dyadic) dynamics using selective attention to generate sequences of body pose conditioned on the audio and body pose of the interlocutor and the audio of the human operating the avatar.

Single-Network Whole-Body Pose Estimation

2 code implementations ICCV 2019 Gines Hidalgo, Yaadhav Raaj, Haroon Idrees, Donglai Xiang, Hanbyul Joo, Tomas Simon, Yaser Sheikh

We present the first single-network approach for 2D whole-body pose estimation, which entails simultaneous localization of body, face, hands, and feet keypoints.

Multi-Task Learning, Pose Estimation

Neural Volumes: Learning Dynamic Renderable Volumes from Images

1 code implementation 18 Jun 2019 Stephen Lombardi, Tomas Simon, Jason Saragih, Gabriel Schwartz, Andreas Lehrmann, Yaser Sheikh

Modeling and rendering of dynamic scenes is challenging, as natural scenes often contain complex phenomena such as thin structures, evolving topology, translucency, scattering, occlusion, and biological motion.

Shapes and Context: In-the-Wild Image Synthesis & Manipulation

no code implementations CVPR 2019 Aayush Bansal, Yaser Sheikh, Deva Ramanan

We introduce a data-driven approach for interactively synthesizing in-the-wild images from semantic label maps.

Image Generation

Towards Social Artificial Intelligence: Nonverbal Social Signal Prediction in A Triadic Interaction

1 code implementation CVPR 2019 Hanbyul Joo, Tomas Simon, Mina Cikara, Yaser Sheikh

We present a new research task and a dataset to understand human social interactions via computational methods, to ultimately endow machines with the ability to encode and decode a broad channel of social signals humans use.

LBS Autoencoder: Self-supervised Fitting of Articulated Meshes to Point Clouds

no code implementations CVPR 2019 Chun-Liang Li, Tomas Simon, Jason Saragih, Barnabás Póczos, Yaser Sheikh

As input, we take a sequence of point clouds to be registered as well as an artist-rigged mesh, i.e., a template mesh equipped with a linear-blend skinning (LBS) deformation space parameterized by a skeleton hierarchy.
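The linear-blend skinning (LBS) deformation the abstract refers to is standard and can be sketched directly: each posed vertex is a weighted blend of per-joint rigid transforms applied to the rest-pose vertex. A minimal NumPy version; shapes and names are illustrative assumptions, not the paper's code:

```python
import numpy as np

def linear_blend_skinning(verts, weights, transforms):
    """Deform template vertices with linear-blend skinning (LBS).

    verts: (V, 3) rest-pose template vertices.
    weights: (V, J) skinning weights, rows summing to one.
    transforms: (J, 4, 4) per-joint rigid transforms.
    Returns the (V, 3) posed vertices.
    """
    Vh = np.hstack([verts, np.ones((len(verts), 1))])   # homogeneous coords
    posed = np.einsum('vj,jab,vb->va', weights, transforms, Vh)
    return posed[:, :3]

# One vertex influenced equally by an identity joint and a joint
# translated by (2, 0, 0): LBS averages the two transforms.
T = np.stack([np.eye(4), np.eye(4)])
T[1, 0, 3] = 2.0
v = linear_blend_skinning(np.array([[1.0, 0.0, 0.0]]),
                          np.array([[0.5, 0.5]]), T)
```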

Capture Dense: Markerless Motion Capture Meets Dense Pose Estimation

no code implementations 5 Dec 2018 Xiu Li, Yebin Liu, Hanbyul Joo, Qionghai Dai, Yaser Sheikh

Specifically, we first introduce a novel markerless motion capture method that can take advantage of dense parsing capability provided by the dense pose detector.

Human Parsing, Markerless Motion Capture +1

Efficient Online Multi-Person 2D Pose Tracking with Recurrent Spatio-Temporal Affinity Fields

no code implementations CVPR 2019 Yaadhav Raaj, Haroon Idrees, Gines Hidalgo, Yaser Sheikh

We present an online approach to efficiently and simultaneously detect and track the 2D pose of multiple people in a video sequence.

Ranked #7 on Pose Tracking on PoseTrack2017 (using extra training data)

Pose Tracking

Recycle-GAN: Unsupervised Video Retargeting

1 code implementation ECCV 2018 Aayush Bansal, Shugao Ma, Deva Ramanan, Yaser Sheikh

We introduce a data-driven approach for unsupervised video retargeting that translates content from one domain to another while preserving the style native to a domain, i.e., if the content of John Oliver's speech were transferred to Stephen Colbert, the generated content/speech should be in Stephen Colbert's style.

Face to Face Translation, Translation +1

Deep Appearance Models for Face Rendering

1 code implementation 1 Aug 2018 Stephen Lombardi, Jason Saragih, Tomas Simon, Yaser Sheikh

At inference time, we condition the decoding network on the viewpoint of the camera in order to generate the appropriate texture for rendering.

Modeling Facial Geometry Using Compositional VAEs

no code implementations CVPR 2018 Timur Bagautdinov, Chenglei Wu, Jason Saragih, Pascal Fua, Yaser Sheikh

We propose a method for learning non-linear face geometry representations using deep generative models.

Learning Patch Reconstructability for Accelerating Multi-View Stereo

no code implementations CVPR 2018 Alex Poms, Chenglei Wu, Shoou-I Yu, Yaser Sheikh

By prioritizing stereo matching on a subset of patches that are highly reconstructable and also cover the 3D surface, we are able to accelerate MVS with minimal reduction in accuracy and completeness.

Stereo Matching, Hand +1
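The prioritization idea in the abstract can be illustrated with a toy greedy selector: take patches in order of predicted reconstructability, skipping any too close to an already selected patch so the chosen subset still spreads over the surface. This is a hedged sketch under assumed names and inputs, not the paper's selection procedure:

```python
import numpy as np

def select_patches(scores, positions, budget, min_dist=1.0):
    """Greedy patch selection for accelerated MVS (toy version).

    scores: (N,) predicted reconstructability per patch.
    positions: (N, 3) patch centers.
    budget: maximum number of patches to keep.
    Returns indices of the selected patches.
    """
    chosen = []
    for i in np.argsort(-scores):                # best score first
        p = positions[i]
        if all(np.linalg.norm(p - positions[j]) >= min_dist for j in chosen):
            chosen.append(i)
        if len(chosen) == budget:
            break
    return chosen

scores = np.array([0.9, 0.8, 0.1])
pos = np.array([[0.0, 0.0, 0.0], [0.2, 0.0, 0.0], [5.0, 0.0, 0.0]])
picked = select_patches(scores, pos, budget=2)   # skips the near-duplicate
```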

Self-supervised Multi-view Person Association and Its Applications

no code implementations 22 May 2018 Minh Vo, Ersin Yumer, Kalyan Sunkavalli, Sunil Hadap, Yaser Sheikh, Srinivasa Narasimhan

Reliable markerless motion tracking of people participating in a complex group activity from multiple moving cameras is challenging due to frequent occlusions, strong viewpoint and appearance variations, and asynchronous video streams.


Structure from Recurrent Motion: From Rigidity to Recurrency

no code implementations CVPR 2018 Xiu Li, Hongdong Li, Hanbyul Joo, Yebin Liu, Yaser Sheikh

This paper proposes a new method for Non-Rigid Structure-from-Motion (NRSfM) from a long monocular video sequence observing a non-rigid object performing recurrent and possibly repetitive dynamic action.


Total Capture: A 3D Deformation Model for Tracking Faces, Hands, and Bodies

no code implementations CVPR 2018 Hanbyul Joo, Tomas Simon, Yaser Sheikh

We present a unified deformation model for the markerless capture of multiple scales of human movement, including facial expressions, body motion, and hand gestures.

Deltille Grids for Geometric Camera Calibration

no code implementations ICCV 2017 Hyowon Ha, Michal Perdoch, Hatem Alismail, In So Kweon, Yaser Sheikh

The recent proliferation of high resolution cameras presents an opportunity to achieve unprecedented levels of precision in visual 3D reconstruction.

3D Reconstruction, Camera Calibration

PixelNN: Example-based Image Synthesis

1 code implementation ICLR 2018 Aayush Bansal, Yaser Sheikh, Deva Ramanan

We present a simple nearest-neighbor (NN) approach that synthesizes high-frequency photorealistic images from an "incomplete" signal such as a low-resolution image, a surface normal map, or edges.

Image Generation
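The nearest-neighbor idea is simple enough to sketch: for each patch of the incomplete input, look up the closest training patch in feature space and copy its associated pixel. A minimal, hedged NumPy illustration; feature dimensions and names are assumptions, not the paper's pipeline:

```python
import numpy as np

def nn_synthesize(target_feats, source_feats, source_pixels):
    """For each target patch feature, copy the pixel of the nearest
    source patch (squared Euclidean distance).

    target_feats: (T, D) features of the incomplete input's patches.
    source_feats: (S, D) features of training patches.
    source_pixels: (S, 3) RGB values associated with source patches.
    """
    d2 = ((target_feats[:, None, :] - source_feats[None, :, :]) ** 2).sum(-1)
    idx = d2.argmin(axis=1)                 # index of the nearest source patch
    return source_pixels[idx]

src_f = np.array([[0.0], [1.0], [2.0]])
src_px = np.array([[255, 0, 0], [0, 255, 0], [0, 0, 255]])
out = nn_synthesize(np.array([[0.9], [0.1]]), src_f, src_px)
```

An explicit lookup like this trades compute for interpretability: every output pixel points back to a concrete training example.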

Panoptic Studio: A Massively Multiview System for Social Interaction Capture

1 code implementation 9 Dec 2016 Hanbyul Joo, Tomas Simon, Xulong Li, Hao Liu, Lei Tan, Lin Gui, Sean Banerjee, Timothy Godisart, Bart Nabbe, Iain Matthews, Takeo Kanade, Shohei Nobuhara, Yaser Sheikh

The core challenges in capturing social interactions are: (1) occlusion is functional and frequent; (2) subtle motion needs to be measured over a space large enough to host a social group; (3) human appearance and configuration variation is immense; and (4) attaching markers to the body may prime the nature of interactions.

Spatiotemporal Bundle Adjustment for Dynamic 3D Reconstruction

no code implementations CVPR 2016 Minh Vo, Srinivasa G. Narasimhan, Yaser Sheikh

In this paper, we present a spatiotemporal bundle adjustment approach that jointly optimizes four coupled sub-problems: estimating camera intrinsics and extrinsics, triangulating 3D static points, as well as subframe temporal alignment between cameras and estimating 3D trajectories of dynamic points.

3D Reconstruction, Dynamic Reconstruction

How useful is photo-realistic rendering for visual learning?

no code implementations 26 Mar 2016 Yair Movshovitz-Attias, Takeo Kanade, Yaser Sheikh

Data seems cheap to get, and in many ways it is, but the process of creating a high quality labeled dataset from a mass of data is time-consuming and expensive.

Domain Adaptation, Viewpoint Estimation

MAP Visibility Estimation for Large-Scale Dynamic 3D Reconstruction

no code implementations CVPR 2014 Hanbyul Joo, Hyun Soo Park, Yaser Sheikh

Many traditional challenges in reconstructing 3D motion, such as matching across wide baselines and handling occlusion, reduce in significance as the number of unique viewpoints increases.

3D Reconstruction

Tracking Human Pose by Tracking Symmetric Parts

no code implementations CVPR 2013 Varun Ramakrishna, Takeo Kanade, Yaser Sheikh

In this work, we present an occlusion-aware algorithm for tracking human pose in an image sequence that addresses the problem of double counting.

Action Recognition, Pose Estimation +1

Nonrigid Structure from Motion in Trajectory Space

no code implementations NeurIPS 2008 Ijaz Akhter, Yaser Sheikh, Sohaib Khan, Takeo Kanade

Existing approaches to nonrigid structure from motion assume that the instantaneous 3D shape of a deforming object is a linear combination of basis shapes, which have to be estimated anew for each video sequence.
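The trajectory-space idea replaces per-sequence shape bases with a generic temporal basis; the published approach uses the DCT. A minimal sketch below builds an orthonormal DCT-II basis and shows that any trajectory lying in its K-dimensional span is reconstructed exactly from K coefficients. This toy projection is illustrative only, not the full NRSfM factorization:

```python
import numpy as np

def dct_basis(F, K):
    """First K vectors of an orthonormal DCT-II basis over F frames."""
    t = np.arange(F)
    B = np.cos(np.pi * (2 * t[None, :] + 1) * np.arange(K)[:, None] / (2 * F))
    B[0] /= np.sqrt(2.0)
    return B * np.sqrt(2.0 / F)   # rows are orthonormal

F, K = 100, 10
B = dct_basis(F, K)
rng = np.random.default_rng(0)
coeffs_true = rng.standard_normal(K)
traj = B.T @ coeffs_true          # 1-D trajectory in the K-dim subspace
coeffs = B @ traj                 # project onto the basis
recon = B.T @ coeffs              # exact for trajectories in the span
```

Because the basis is generic, it never has to be re-estimated per sequence, which is the advantage the abstract contrasts against shape-basis methods.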
