1 code implementation • 24 Mar 2025 • Rong Wang, Fabian Prada, Ziyan Wang, Zhongshi Jiang, Chengxiang Yin, Junxuan Li, Shunsuke Saito, Igor Santesteban, Javier Romero, Rohan Joshi, Hongdong Li, Jason Saragih, Yaser Sheikh
We present a novel method for reconstructing personalized 3D human avatars with realistic animation from only a few images.
no code implementations • 3 Mar 2025 • Chen Guo, Junxuan Li, Yash Kant, Yaser Sheikh, Shunsuke Saito, Chen Cao
Once the UPM is learned to accurately reproduce the large-scale multi-view human images, we fine-tune the model with an in-the-wild video via inverse rendering to obtain a personalized photorealistic human avatar that can be faithfully animated to novel human motions and rendered from novel views.
no code implementations • 31 Oct 2024 • Junxuan Li, Chen Cao, Gabriel Schwartz, Rawal Khirodkar, Christian Richardt, Tomas Simon, Yaser Sheikh, Shunsuke Saito
Unlike existing approaches that estimate parametric reflectance parameters via inverse rendering, our approach directly models learnable radiance transfer that incorporates global light transport in an efficient manner for real-time rendering.
no code implementations • 17 Jul 2024 • Shaojie Bai, Te-Li Wang, Chenghui Li, Akshay Venkatesh, Tomas Simon, Chen Cao, Gabriel Schwartz, Ryan Wrench, Jason Saragih, Yaser Sheikh, Shih-En Wei
The oblique and incomplete views of the face, variability in the donning of headsets, and illumination variation due to the environment are some of the unique challenges in generalization to unseen faces.
no code implementations • 3 May 2024 • Stanislav Pidhorskyi, Tomas Simon, Gabriel Schwartz, He Wen, Yaser Sheikh, Jason Saragih
Computing the gradients of a rendering process is paramount for diverse applications in computer vision and graphics.
no code implementations • CVPR 2024 • Zhaoxi Chen, Gyeongsik Moon, Kaiwen Guo, Chen Cao, Stanislav Pidhorskyi, Tomas Simon, Rohan Joshi, Yuan Dong, Yichen Xu, Bernardo Pires, He Wen, Lucas Evans, Bo Peng, Julia Buffalini, Autumn Trimble, Kevyn McPhail, Melissa Schoeller, Shoou-I Yu, Javier Romero, Michael Zollhöfer, Yaser Sheikh, Ziwei Liu, Shunsuke Saito
To simplify the personalization process while retaining photorealism, we build a powerful universal relightable prior based on neural relighting from multi-view images of hands captured in a light stage with hundreds of identities.
no code implementations • CVPR 2023 • Shun Iwase, Shunsuke Saito, Tomas Simon, Stephen Lombardi, Timur Bagautdinov, Rohan Joshi, Fabian Prada, Takaaki Shiratori, Yaser Sheikh, Jason Saragih
To achieve generalization, we condition the student model with physics-inspired illumination features such as visibility, diffuse shading, and specular reflections computed on a coarse proxy geometry, maintaining a small computational overhead.
1 code implementation • 22 Jul 2022 • Cheng-hsin Wuu, Ningyuan Zheng, Scott Ardisson, Rohan Bali, Danielle Belko, Eric Brockmeyer, Lucas Evans, Timothy Godisart, Hyowon Ha, Xuhua Huang, Alexander Hypes, Taylor Koska, Steven Krenn, Stephen Lombardi, Xiaomin Luo, Kevyn McPhail, Laura Millerschoen, Michal Perdoch, Mark Pitts, Alexander Richard, Jason Saragih, Junko Saragih, Takaaki Shiratori, Tomas Simon, Matt Stewart, Autumn Trimble, Xinshuo Weng, David Whitewolf, Chenglei Wu, Shoou-I Yu, Yaser Sheikh
Along with the release of the dataset, we conduct ablation studies on how different model architectures affect the model's capacity to interpolate to novel viewpoints and expressions.
no code implementations • 20 Jul 2022 • Edoardo Remelli, Timur Bagautdinov, Shunsuke Saito, Tomas Simon, Chenglei Wu, Shih-En Wei, Kaiwen Guo, Zhe Cao, Fabian Prada, Jason Saragih, Yaser Sheikh
To circumvent this, we propose a novel volumetric avatar representation by extending mixtures of volumetric primitives to articulated objects.
no code implementations • 30 Jun 2022 • Donglai Xiang, Timur Bagautdinov, Tuur Stuyck, Fabian Prada, Javier Romero, Weipeng Xu, Shunsuke Saito, Jingfan Guo, Breannan Smith, Takaaki Shiratori, Yaser Sheikh, Jessica Hodgins, Chenglei Wu
The key idea is to introduce a neural clothing appearance model that operates on top of explicit geometry: at training time we use high-fidelity tracking, whereas at animation time we rely on physically simulated geometry.
no code implementations • 7 Jun 2022 • Oshri Halimi, Fabian Prada, Tuur Stuyck, Donglai Xiang, Timur Bagautdinov, He Wen, Ron Kimmel, Takaaki Shiratori, Chenglei Wu, Yaser Sheikh
Here, we propose an end-to-end pipeline for building drivable representations for clothing.
no code implementations • 21 May 2021 • Timur Bagautdinov, Chenglei Wu, Tomas Simon, Fabian Prada, Takaaki Shiratori, Shih-En Wei, Weipeng Xu, Yaser Sheikh, Jason Saragih
The core intuition behind our method is that better drivability and generalization can be achieved by disentangling the driving signals and remaining generative factors, which are not available during animation.
2 code implementations • ICCV 2021 • Alexander Richard, Michael Zollhoefer, Yandong Wen, Fernando de la Torre, Yaser Sheikh
To improve upon existing models, we propose a generic audio-driven facial animation approach that achieves highly realistic motion synthesis results for the entire face.
Ranked #2 on 3D Face Animation on VOCASET
1 code implementation • CVPR 2021 • Shugao Ma, Tomas Simon, Jason Saragih, Dawei Wang, Yuecheng Li, Fernando de la Torre, Yaser Sheikh
Telecommunication with photorealistic avatars in virtual or augmented reality is a promising path for achieving authentic face-to-face communication in 3D over remote physical distances.
no code implementations • CVPR 2021 • Lele Chen, Chen Cao, Fernando de la Torre, Jason Saragih, Chenliang Xu, Yaser Sheikh
This paper addresses previous limitations by learning a deep learning lighting model that, in combination with a high-quality 3D face tracking algorithm, provides a method for subtle and robust facial motion transfer from a regular video to a 3D photo-realistic avatar.
1 code implementation • 2 Mar 2021 • Stephen Lombardi, Tomas Simon, Gabriel Schwartz, Michael Zollhoefer, Yaser Sheikh, Jason Saragih
Real-time rendering and animation of humans is a core function in games, movies, and telepresence applications.
1 code implementation • 25 Jan 2021 • Xuanyi Dong, Yi Yang, Shih-En Wei, Xinshuo Weng, Yaser Sheikh, Shoou-I Yu
End-to-end training is made possible by differentiable registration and 3D triangulation modules.
no code implementations • ICLR 2021 • Alexander Richard, Dejan Markovic, Israel D. Gebru, Steven Krenn, Gladstone Alexander Butler, Fernando Torre, Yaser Sheikh
We present a neural rendering approach for binaural sound synthesis that can produce realistic and spatially accurate binaural sound in real time.
no code implementations • ECCV 2020 • Hang Chu, Shugao Ma, Fernando de la Torre, Sanja Fidler, Yaser Sheikh
It is important to note that traditional person-specific CAs are learned from few training samples, and typically lack robustness and have limited expressiveness when transferring facial expressions.
no code implementations • 11 Aug 2020 • Alexander Richard, Colin Lea, Shugao Ma, Juergen Gall, Fernando de la Torre, Yaser Sheikh
Codec Avatars are a recent class of learned, photorealistic face models that accurately represent the geometry and texture of a person in 3D (i.e., for virtual reality), and are almost indistinguishable from video.
no code implementations • 24 Jul 2020 • Minh Vo, Yaser Sheikh, Srinivasa G. Narasimhan
The triangulation constraint, however, is invalid for moving points captured in multiple unsynchronized videos and bundle adjustment is not designed to estimate the temporal alignment between cameras.
1 code implementation • NeurIPS 2020 • Yi Zhou, Chenglei Wu, Zimo Li, Chen Cao, Yuting Ye, Jason Saragih, Hao Li, Yaser Sheikh
Learning latent representations of registered meshes is useful for many 3D tasks.
no code implementations • CVPR 2020 • Aayush Bansal, Minh Vo, Yaser Sheikh, Deva Ramanan, Srinivasa Narasimhan
We present a data-driven approach for 4D space-time visualization of dynamic events from videos captured by multiple hand-held cameras.
3 code implementations • 5 Oct 2019 • Chaitanya Ahuja, Shugao Ma, Louis-Philippe Morency, Yaser Sheikh
In this paper, we introduce a neural architecture named Dyadic Residual-Attention Model (DRAM), which integrates intrapersonal (monadic) and interpersonal (dyadic) dynamics using selective attention to generate sequences of body pose conditioned on audio and body pose of the interlocutor and audio of the human operating the avatar.
2 code implementations • ICCV 2019 • Gines Hidalgo, Yaadhav Raaj, Haroon Idrees, Donglai Xiang, Hanbyul Joo, Tomas Simon, Yaser Sheikh
We present the first single-network approach for 2D whole-body pose estimation, which entails simultaneous localization of body, face, hands, and feet keypoints.
1 code implementation • 18 Jun 2019 • Stephen Lombardi, Tomas Simon, Jason Saragih, Gabriel Schwartz, Andreas Lehrmann, Yaser Sheikh
Modeling and rendering of dynamic scenes is challenging, as natural scenes often contain complex phenomena such as thin structures, evolving topology, translucency, scattering, occlusion, and biological motion.
no code implementations • CVPR 2019 • Aayush Bansal, Yaser Sheikh, Deva Ramanan
We introduce a data-driven approach for interactively synthesizing in-the-wild images from semantic label maps.
1 code implementation • CVPR 2019 • Hanbyul Joo, Tomas Simon, Mina Cikara, Yaser Sheikh
We present a new research task and a dataset to understand human social interactions via computational methods, to ultimately endow machines with the ability to encode and decode a broad channel of social signals humans use.
no code implementations • CVPR 2019 • Chun-Liang Li, Tomas Simon, Jason Saragih, Barnabás Póczos, Yaser Sheikh
As input, we take a sequence of point clouds to be registered as well as an artist-rigged mesh, i.e., a template mesh equipped with a linear-blend skinning (LBS) deformation space parameterized by a skeleton hierarchy.
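For readers unfamiliar with linear-blend skinning, the following is a minimal illustrative sketch, not the paper's implementation. It assumes each bone transform is already composed along the skeleton hierarchy and expressed relative to the rest pose as a 3x4 matrix [R | t]; the function names and data layout are hypothetical.

```python
def apply_transform(transform, point):
    """Apply a 3x4 rigid transform [R | t] (three rows of four numbers) to a 3D point."""
    return [
        sum(row[i] * point[i] for i in range(3)) + row[3]
        for row in transform
    ]


def linear_blend_skinning(rest_vertices, bone_transforms, skin_weights):
    """Pose each rest vertex as the weighted sum of its per-bone transformed positions.

    rest_vertices:   list of (x, y, z) points in the rest pose
    bone_transforms: list of 3x4 transforms, one per bone (assumed pre-composed)
    skin_weights:    per-vertex list of weights, one weight per bone, summing to 1
    """
    posed = []
    for vertex, weights in zip(rest_vertices, skin_weights):
        blended = [0.0, 0.0, 0.0]
        for transform, weight in zip(bone_transforms, weights):
            moved = apply_transform(transform, vertex)
            for axis in range(3):
                blended[axis] += weight * moved[axis]
        posed.append(blended)
    return posed


# Two hypothetical bones: one stays in place, one translates by 2 along x.
identity = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]
shift_x = [[1, 0, 0, 2], [0, 1, 0, 0], [0, 0, 1, 0]]
posed = linear_blend_skinning([(1.0, 1.0, 1.0)], [identity, shift_x], [(0.5, 0.5)])
```

A vertex weighted equally between the two bones ends up halfway between its two transformed positions, which is the blending behavior the LBS deformation space encodes.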
51 code implementations • 18 Dec 2018 • Zhe Cao, Gines Hidalgo, Tomas Simon, Shih-En Wei, Yaser Sheikh
OpenPose: Real-time multi-person keypoint detection library for body, face, hands, and foot estimation
Ranked #4 on Pose Estimation on MPII Single Person
no code implementations • 5 Dec 2018 • Xiu Li, Yebin Liu, Hanbyul Joo, Qionghai Dai, Yaser Sheikh
Specifically, we first introduce a novel markerless motion capture method that can take advantage of dense parsing capability provided by the dense pose detector.
1 code implementation • CVPR 2019 • Donglai Xiang, Hanbyul Joo, Yaser Sheikh
We present the first method to capture the 3D total motion of a target person from a monocular view input.
Ranked #29 on Monocular 3D Human Pose Estimation on Human3.6M
no code implementations • CVPR 2019 • Yaadhav Raaj, Haroon Idrees, Gines Hidalgo, Yaser Sheikh
We present an online approach to efficiently and simultaneously detect and track the 2D pose of multiple people in a video sequence.
Ranked #7 on Pose Tracking on PoseTrack2017 (using extra training data)
1 code implementation • ECCV 2018 • Aayush Bansal, Shugao Ma, Deva Ramanan, Yaser Sheikh
We introduce a data-driven approach for unsupervised video retargeting that translates content from one domain to another while preserving the style native to a domain, i.e., if contents of John Oliver's speech were to be transferred to Stephen Colbert, then the generated content/speech should be in Stephen Colbert's style.
1 code implementation • 1 Aug 2018 • Stephen Lombardi, Jason Saragih, Tomas Simon, Yaser Sheikh
At inference time, we condition the decoding network on the viewpoint of the camera in order to generate the appropriate texture for rendering.
1 code implementation • CVPR 2018 • Xuanyi Dong, Shoou-I Yu, Xinshuo Weng, Shih-En Wei, Yi Yang, Yaser Sheikh
In this paper, we present supervision-by-registration, an unsupervised approach to improve the precision of facial landmark detectors on both images and video.
Ranked #1 on Facial Landmark Detection on 300-VW (C)
no code implementations • CVPR 2018 • Timur Bagautdinov, Chenglei Wu, Jason Saragih, Pascal Fua, Yaser Sheikh
We propose a method for learning non-linear face geometry representations using deep generative models.
no code implementations • CVPR 2018 • Alex Poms, Chenglei Wu, Shoou-I Yu, Yaser Sheikh
By prioritizing stereo matching on a subset of patches that are highly reconstructable and also cover the 3D surface, we are able to accelerate MVS with minimal reduction in accuracy and completeness.
no code implementations • 22 May 2018 • Minh Vo, Ersin Yumer, Kalyan Sunkavalli, Sunil Hadap, Yaser Sheikh, Srinivasa Narasimhan
Reliable markerless motion tracking of people participating in a complex group activity from multiple moving cameras is challenging due to frequent occlusions, strong viewpoint and appearance variations, and asynchronous video streams.
no code implementations • CVPR 2018 • Xiu Li, Hongdong Li, Hanbyul Joo, Yebin Liu, Yaser Sheikh
This paper proposes a new method for Non-Rigid Structure-from-Motion (NRSfM) from a long monocular video sequence observing a non-rigid object performing recurrent and possibly repetitive dynamic action.
no code implementations • CVPR 2018 • Hanbyul Joo, Tomas Simon, Yaser Sheikh
We present a unified deformation model for the markerless capture of multiple scales of human movement, including facial expressions, body motion, and hand gestures.
no code implementations • ICCV 2017 • Hyowon Ha, Michal Perdoch, Hatem Alismail, In So Kweon, Yaser Sheikh
The recent proliferation of high resolution cameras presents an opportunity to achieve unprecedented levels of precision in visual 3D reconstruction.
1 code implementation • ICLR 2018 • Aayush Bansal, Yaser Sheikh, Deva Ramanan
We present a simple nearest-neighbor (NN) approach that synthesizes high-frequency photorealistic images from an "incomplete" signal such as a low-resolution image, a surface normal map, or edges.
39 code implementations • CVPR 2017 • Tomas Simon, Hanbyul Joo, Iain Matthews, Yaser Sheikh
The method is used to train a hand keypoint detector for single images.
2 code implementations • 9 Dec 2016 • Hanbyul Joo, Tomas Simon, Xulong Li, Hao Liu, Lei Tan, Lin Gui, Sean Banerjee, Timothy Godisart, Bart Nabbe, Iain Matthews, Takeo Kanade, Shohei Nobuhara, Yaser Sheikh
The core challenges in capturing social interactions are: (1) occlusion is functional and frequent; (2) subtle motion needs to be measured over a space large enough to host a social group; (3) human appearance and configuration variation is immense; and (4) attaching markers to the body may prime the nature of interactions.
61 code implementations • CVPR 2017 • Zhe Cao, Tomas Simon, Shih-En Wei, Yaser Sheikh
We present an approach to efficiently detect the 2D pose of multiple people in an image.
Ranked #1 on Multi-Person Pose Estimation on COCO
no code implementations • CVPR 2016 • Minh Vo, Srinivasa G. Narasimhan, Yaser Sheikh
In this paper, we present a spatiotemporal bundle adjustment approach that jointly optimizes four coupled sub-problems: estimating camera intrinsics and extrinsics, triangulating static 3D points, estimating sub-frame temporal alignment between cameras, and estimating 3D trajectories of dynamic points.
no code implementations • 26 Mar 2016 • Yair Movshovitz-Attias, Takeo Kanade, Yaser Sheikh
Data seems cheap to get, and in many ways it is, but the process of creating a high quality labeled dataset from a mass of data is time-consuming and expensive.
50 code implementations • CVPR 2016 • Shih-En Wei, Varun Ramakrishna, Takeo Kanade, Yaser Sheikh
Pose Machines provide a sequential prediction framework for learning rich implicit spatial models.
Ranked #2 on Classification on RSSCN7
no code implementations • ICCV 2015 • Paulo F. U. Gotardo, Tomas Simon, Yaser Sheikh, Iain Matthews
This paper proposes photogeometric scene flow (PGSF) for high-quality dynamic 3D reconstruction.
no code implementations • ICCV 2015 • Hanbyul Joo, Hao Liu, Lei Tan, Lin Gui, Bart Nabbe, Iain Matthews, Takeo Kanade, Shohei Nobuhara, Yaser Sheikh
We present an approach to capture the 3D structure and motion of a group of people engaged in a social interaction.
no code implementations • CVPR 2014 • Hanbyul Joo, Hyun Soo Park, Yaser Sheikh
Many traditional challenges in reconstructing 3D motion, such as matching across wide baselines and handling occlusion, reduce in significance as the number of unique viewpoints increases.
no code implementations • CVPR 2013 • Varun Ramakrishna, Takeo Kanade, Yaser Sheikh
In this work, we present an occlusion-aware algorithm for tracking human pose in an image sequence that addresses the problem of double counting.
no code implementations • CVPR 2013 • Patrick Lucey, Alina Bialkowski, Peter Carr, Stuart Morgan, Iain Matthews, Yaser Sheikh
In this paper, we describe a method to represent and discover adversarial group behavior in a continuous domain.
no code implementations • NeurIPS 2008 • Ijaz Akhter, Yaser Sheikh, Sohaib Khan, Takeo Kanade
Existing approaches to nonrigid structure from motion assume that the instantaneous 3D shape of a deforming object is a linear combination of basis shapes, which have to be estimated anew for each video sequence.
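The shape-basis assumption described above can be illustrated with a small sketch. This shows only the linear-combination model attributed to existing approaches, not the method this paper proposes; the function name and the toy basis shapes are hypothetical.

```python
def combine_basis_shapes(coefficients, basis_shapes):
    """Return the instantaneous shape sum_k coefficients[k] * basis_shapes[k].

    Each basis shape is a list of (x, y, z) points; all shapes share the same
    number of points, and there is one coefficient per basis shape.
    """
    if len(coefficients) != len(basis_shapes):
        raise ValueError("need exactly one coefficient per basis shape")
    num_points = len(basis_shapes[0])
    shape = [[0.0, 0.0, 0.0] for _ in range(num_points)]
    for coefficient, basis in zip(coefficients, basis_shapes):
        for index, point in enumerate(basis):
            for axis in range(3):
                shape[index][axis] += coefficient * point[axis]
    return shape


# Two toy basis shapes with two points each (illustrative values only).
basis_a = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
basis_b = [(0.0, 0.0, 2.0), (2.0, 0.0, 0.0)]
frame_shape = combine_basis_shapes([0.5, 0.25], [basis_a, basis_b])
```

Under this model, each video frame has its own coefficient vector, while the basis shapes are shared across frames and, as the abstract notes, must be re-estimated for every sequence.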