no code implementations • 17 Oct 2024 • Patrick Kwon, Hanbyul Joo
In this paper, we propose GraspDiffusion, a novel generative method that creates realistic scenes of human-object interaction.
no code implementations • CVPR 2024 • Inhee Lee, Byungjun Kim, Hanbyul Joo
In this paper, we present a method to reconstruct the world and multiple dynamic humans in 3D from a monocular video input.
no code implementations • CVPR 2024 • Hyunsoo Cha, Byungjun Kim, Hanbyul Joo
We present PEGASUS, a method for constructing a personalized generative 3D face avatar from monocular video sources.
no code implementations • CVPR 2024 • Taeksoo Kim, Byungjun Kim, Shunsuke Saito, Hanbyul Joo
Through a series of decomposition steps, we obtain multiple layers of 3D assets in a shared canonical space normalized in terms of poses and human shapes, hence supporting effortless composition to novel identities and reanimation with novel poses.
1 code implementation • 23 Jan 2024 • Hyeonwoo Kim, Sookwan Han, Patrick Kwon, Hanbyul Joo
To construct the distribution, we present a novel pipeline that synthesizes diverse and realistic 3D HOI samples given any 3D object mesh.
no code implementations • 18 Jan 2024 • Jeonghwan Kim, Jisoo Kim, Jeonghyeon Na, Hanbyul Joo
To address this challenge, we introduce the ParaHome system, designed to capture and parameterize dynamic 3D movements of humans and objects within a common home environment.
no code implementations • CVPR 2024 • Jiye Lee, Hanbyul Joo
We present a lightweight and affordable motion capture method based on two smartwatches and a head-mounted camera.
no code implementations • 23 Aug 2023 • Sookwan Han, Hanbyul Joo
We present multiple strategies to leverage the synthesized images, including (1) the first method to leverage a generative image model for 3D human-object spatial relation learning; (2) a framework to reason about the 3D spatial relations from inconsistent 2D cues in a self-supervised manner via 3D occupancy reasoning with pose canonicalization; (3) semantic clustering to disambiguate different types of interactions with the same object types; and (4) a novel metric to assess the quality of 3D spatial learning of interaction.
1 code implementation • ICCV 2023 • Taeksoo Kim, Shunsuke Saito, Hanbyul Joo
Our compositional model is interaction-aware, meaning the spatial relationship between humans and objects, and the mutual shape change by physical contact is fully incorporated.
1 code implementation • ICCV 2023 • Byungjun Kim, Patrick Kwon, Kwangho Lee, Myunggi Lee, Sookwan Han, Daesik Kim, Hanbyul Joo
We propose a 3D generation pipeline that uses diffusion models to generate realistic human digital avatars.
no code implementations • ICCV 2023 • Jiye Lee, Hanbyul Joo
The key motivation of LAMA is to build a unified framework to encompass a series of everyday motions including locomotion, scene interaction, and object manipulation.
no code implementations • ICCV 2023 • Sookwan Han, Hanbyul Joo
We present multiple strategies to leverage the synthesized images, including (1) the first method to leverage a generative image model for 3D human-object spatial relation learning; (2) a framework to reason about the 3D spatial relations from inconsistent 2D cues in a self-supervised manner via 3D occupancy reasoning with pose canonicalization; (3) semantic clustering to disambiguate different types of interactions with the same object types; and (4) a novel metric to assess the quality of 3D spatial learning of interaction.
no code implementations • CVPR 2022 • Evonne Ng, Hanbyul Joo, Liwen Hu, Hao Li, Trevor Darrell, Angjoo Kanazawa, Shiry Ginosar
We present a framework for modeling interactional communication in dyadic conversations: given multimodal inputs of a speaker, we autoregressively output multiple possibilities of corresponding listener motion.
1 code implementation • CVPR 2022 • Gengshan Yang, Minh Vo, Natalia Neverova, Deva Ramanan, Andrea Vedaldi, Hanbyul Joo
Our key insight is to merge three schools of thought; (1) classic deformable shape models that make use of articulated bones and blend skinning, (2) volumetric neural radiance fields (NeRFs) that are amenable to gradient-based optimization, and (3) canonical embeddings that generate correspondences between pixels and an articulated model.
no code implementations • 2 Dec 2021 • Yingdong Qian, Marta Kryven, Tao Gao, Hanbyul Joo, Josh Tenenbaum
We describe Generative Body Kinematics model, which predicts human intention inference in this domain using Bayesian inverse planning and inverse body kinematics.
8 code implementations • CVPR 2022 • Kristen Grauman, Andrew Westbury, Eugene Byrne, Zachary Chavis, Antonino Furnari, Rohit Girdhar, Jackson Hamburger, Hao Jiang, Miao Liu, Xingyu Liu, Miguel Martin, Tushar Nagarajan, Ilija Radosavovic, Santhosh Kumar Ramakrishnan, Fiona Ryan, Jayant Sharma, Michael Wray, Mengmeng Xu, Eric Zhongcong Xu, Chen Zhao, Siddhant Bansal, Dhruv Batra, Vincent Cartillier, Sean Crane, Tien Do, Morrie Doulaty, Akshay Erapalli, Christoph Feichtenhofer, Adriano Fragomeni, Qichen Fu, Abrham Gebreselasie, Cristina Gonzalez, James Hillis, Xuhua Huang, Yifei HUANG, Wenqi Jia, Weslie Khoo, Jachym Kolar, Satwik Kottur, Anurag Kumar, Federico Landini, Chao Li, Yanghao Li, Zhenqiang Li, Karttikeya Mangalam, Raghava Modhugu, Jonathan Munro, Tullie Murrell, Takumi Nishiyasu, Will Price, Paola Ruiz Puentes, Merey Ramazanova, Leda Sari, Kiran Somasundaram, Audrey Southerland, Yusuke Sugano, Ruijie Tao, Minh Vo, Yuchen Wang, Xindi Wu, Takuma Yagi, Ziwei Zhao, Yunyi Zhu, Pablo Arbelaez, David Crandall, Dima Damen, Giovanni Maria Farinella, Christian Fuegen, Bernard Ghanem, Vamsi Krishna Ithapu, C. V. Jawahar, Hanbyul Joo, Kris Kitani, Haizhou Li, Richard Newcombe, Aude Oliva, Hyun Soo Park, James M. Rehg, Yoichi Sato, Jianbo Shi, Mike Zheng Shou, Antonio Torralba, Lorenzo Torresani, Mingfei Yan, Jitendra Malik
We introduce Ego4D, a massive-scale egocentric video dataset and benchmark suite.
2 code implementations • 19 Aug 2021 • Xiang Xu, Hanbyul Joo, Greg Mori, Manolis Savva
We evaluate this approach on our dataset, demonstrating that human-object relations can significantly reduce the ambiguity of articulated object reconstructions from challenging real-world videos.
1 code implementation • 13 Aug 2021 • Yu Rong, Takaaki Shiratori, Hanbyul Joo
Most existing monocular 3D pose estimation approaches only focus on a single body part, neglecting the fact that the essential nuance of human motion is conveyed through a concert of subtle movements of face, hands, and body.
no code implementations • NeurIPS 2020 • Benjamin Biggs, Sébastien Ehrhadt, Hanbyul Joo, Benjamin Graham, Andrea Vedaldi, David Novotny
We consider the problem of obtaining dense 3D reconstructions of humans from single and partially occluded views.
Ranked #3 on Multi-Hypotheses 3D Human Pose Estimation on AH36M
1 code implementation • 19 Aug 2020 • Yu Rong, Takaaki Shiratori, Hanbyul Joo
To construct FrankMocap, we build the state-of-the-art monocular 3D "hand" motion capture method by taking the hand part of the whole body parametric model (SMPL-X).
2 code implementations • ECCV 2020 • Jason Y. Zhang, Sam Pepose, Hanbyul Joo, Deva Ramanan, Jitendra Malik, Angjoo Kanazawa
We present a method that infers spatial arrangements and shapes of humans and objects in a globally consistent 3D scene, all from a single image in-the-wild captured in an uncontrolled environment.
Ranked #3 on 3D Object Reconstruction on BEHAVE
1 code implementation • CVPR 2021 • Evonne Ng, Shiry Ginosar, Trevor Darrell, Hanbyul Joo
We demonstrate the efficacy of our method on hand gesture synthesis from body motion input, and as a strong body prior for single-view image-based 3D hand pose estimation.
1 code implementation • 7 Apr 2020 • Hanbyul Joo, Natalia Neverova, Andrea Vedaldi
Remarkably, the resulting annotations are sufficient to train from scratch 3D pose regressor networks that outperform the current state-of-the-art on in-the-wild benchmarks such as 3DPW.
Ranked #24 on 3D Human Pose Estimation on MPI-INF-3DHP (PA-MPJPE metric)
3 code implementations • CVPR 2020 • Shunsuke Saito, Tomas Simon, Jason Saragih, Hanbyul Joo
Although current approaches have demonstrated the potential in real world settings, they still fail to produce reconstructions with the level of detail often present in the input images.
Ranked #1 on 3D Object Reconstruction From A Single Image on BUFF
2 code implementations • ICCV 2019 • Gines Hidalgo, Yaadhav Raaj, Haroon Idrees, Donglai Xiang, Hanbyul Joo, Tomas Simon, Yaser Sheikh
We present the first single-network approach for 2D~whole-body pose estimation, which entails simultaneous localization of body, face, hands, and feet keypoints.
1 code implementation • CVPR 2019 • Hanbyul Joo, Tomas Simon, Mina Cikara, Yaser Sheikh
We present a new research task and a dataset to understand human social interactions via computational methods, to ultimately endow machines with the ability to encode and decode a broad channel of social signals humans use.
1 code implementation • CVPR 2020 • Evonne Ng, Donglai Xiang, Hanbyul Joo, Kristen Grauman
The body pose of a person wearing a camera is of great interest for applications in augmented reality, healthcare, and robotics, yet much of the person's body is out of view for a typical wearable camera.
no code implementations • 5 Dec 2018 • Xiu Li, Yebin Liu, Hanbyul Joo, Qionghai Dai, Yaser Sheikh
Specifically, we first introduce a novel markerless motion capture method that can take advantage of dense parsing capability provided by the dense pose detector.
1 code implementation • CVPR 2019 • Donglai Xiang, Hanbyul Joo, Yaser Sheikh
We present the first method to capture the 3D total motion of a target person from a monocular view input.
Ranked #28 on Monocular 3D Human Pose Estimation on Human3.6M
no code implementations • CVPR 2018 • Xiu Li, Hongdong Li, Hanbyul Joo, Yebin Liu, Yaser Sheikh
This paper proposes a new method for Non-Rigid Structure-from-Motion (NRSfM) from a long monocular video sequence observing a non-rigid object performing recurrent and possibly repetitive dynamic action.
no code implementations • CVPR 2018 • Hanbyul Joo, Tomas Simon, Yaser Sheikh
We present a unified deformation model for the markerless capture of multiple scales of human movement, including facial expressions, body motion, and hand gestures.
39 code implementations • CVPR 2017 • Tomas Simon, Hanbyul Joo, Iain Matthews, Yaser Sheikh
The method is used to train a hand keypoint detector for single images.
1 code implementation • 9 Dec 2016 • Hanbyul Joo, Tomas Simon, Xulong Li, Hao liu, Lei Tan, Lin Gui, Sean Banerjee, Timothy Godisart, Bart Nabbe, Iain Matthews, Takeo Kanade, Shohei Nobuhara, Yaser Sheikh
The core challenges in capturing social interactions are: (1) occlusion is functional and frequent; (2) subtle motion needs to be measured over a space large enough to host a social group; (3) human appearance and configuration variation is immense; and (4) attaching markers to the body may prime the nature of interactions.
no code implementations • ICCV 2015 • Hanbyul Joo, Hao liu, Lei Tan, Lin Gui, Bart Nabbe, Iain Matthews, Takeo Kanade, Shohei Nobuhara, Yaser Sheikh
We present an approach to capture the 3D structure and motion of a group of people engaged in a social interaction.
no code implementations • CVPR 2014 • Hanbyul Joo, Hyun Soo Park, Yaser Sheikh
Many traditional challenges in reconstructing 3D motion, such as matching across wide baselines and handling occlusion, reduce in significance as the number of unique viewpoints increases.