no code implementations • 24 Jan 2025 • Shaofei Wang, Tomas Simon, Igor Santesteban, Timur Bagautdinov, Junxuan Li, Vasu Agrawal, Fabian Prada, Shoou-I Yu, Pace Nalbone, Matt Gramlich, Roman Lubachersky, Chenglei Wu, Javier Romero, Jason Saragih, Michael Zollhoefer, Andreas Geiger, Siyu Tang, Shunsuke Saito
This allows us to learn diffuse radiance transfer in a local coordinate frame, which disentangles the local radiance transfer from the articulation of the body.
1 code implementation • 10 Nov 2024 • Yutong Chen, Marko Mihajlovic, Xiyi Chen, Yiming Wang, Sergey Prokudin, Siyu Tang
To our knowledge, this is the first successful application of point transformers directly on 3DGS sets, surpassing the limitations of previous multi-scene training methods, which could handle only a restricted number of input views during inference.
no code implementations • 7 Oct 2024 • Kaifeng Zhao, Gen Li, Siyu Tang
Additionally, the learned motion primitive space allows for precise spatial motion control, which we formulate either as a latent noise optimization problem or as a Markov decision process addressed through reinforcement learning.
no code implementations • 30 Sep 2024 • Deheng Zhang, Jingyu Wang, Shaofei Wang, Marko Mihajlovic, Sergey Prokudin, Hendrik P. A. Lensch, Siyu Tang
Our experiments demonstrate that our algorithm achieves state-of-the-art performance in inverse rendering and relighting, with particularly strong results in the reconstruction of highly reflective objects.
1 code implementation • 17 Sep 2024 • Marko Mihajlovic, Sergey Prokudin, Siyu Tang, Robert Maier, Federica Bogo, Tony Tung, Edmond Boyer
Digitizing 3D static scenes and 4D dynamic events from multi-view images has long been a challenge in computer vision and graphics.
no code implementations • 29 Jul 2024 • Yuanwen Yue, Anurag Das, Francis Engelmann, Siyu Tang, Jan Eric Lenssen
In this work, we show that fine-tuning on 3D-aware data improves the quality of emerging semantic features.
1 code implementation • 12 Jul 2024 • Yiming Wang, Siyu Tang, Mengyu Chu
We delve into the physics-informed neural reconstruction of smoke and obstacles through sparse-view RGB videos, tackling challenges arising from limited observation of complex dynamics.
1 code implementation • 5 Jul 2024 • Anpei Chen, Haofei Xu, Stefano Esposito, Siyu Tang, Andreas Geiger
Radiance field methods have achieved photorealistic novel view synthesis and geometry reconstruction.
no code implementations • CVPR 2024 • Yan Zhang, Sergey Prokudin, Marko Mihajlovic, Qianli Ma, Siyu Tang
By observing a set of point trajectories, we aim to learn an implicit motion field parameterized by a neural network to predict the movement of novel points within the same domain, without relying on any data-driven or scene-specific priors.
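The core idea above — fit a motion field to observed point trajectories, then query it at novel points — can be sketched with a deliberately simple linear field standing in for the paper's neural network. Everything below (the ground-truth field, the training grid, the query point) is invented for illustration only:

```python
# Illustrative sketch, NOT the paper's model: fit a linear motion field
# u(x) = A x + b to observed point displacements by gradient descent,
# then query the fitted field at a novel point.

def make_data():
    # Hypothetical ground-truth field used to generate "observed" trajectories.
    A_true = [[0.10, 0.02], [-0.03, 0.08]]
    b_true = [0.05, -0.02]
    pts, disp = [], []
    for i in range(-5, 6):
        for j in range(-5, 6):
            x = [i / 5.0, j / 5.0]
            d = [A_true[r][0] * x[0] + A_true[r][1] * x[1] + b_true[r]
                 for r in range(2)]
            pts.append(x)
            disp.append(d)
    return pts, disp, A_true, b_true

def fit_linear_field(pts, disp, lr=0.2, iters=2000):
    A = [[0.0, 0.0], [0.0, 0.0]]
    b = [0.0, 0.0]
    n = len(pts)
    for _ in range(iters):
        gA = [[0.0, 0.0], [0.0, 0.0]]
        gb = [0.0, 0.0]
        for x, d in zip(pts, disp):
            for r in range(2):
                err = A[r][0] * x[0] + A[r][1] * x[1] + b[r] - d[r]
                gA[r][0] += 2 * err * x[0] / n
                gA[r][1] += 2 * err * x[1] / n
                gb[r] += 2 * err / n
        for r in range(2):
            A[r][0] -= lr * gA[r][0]
            A[r][1] -= lr * gA[r][1]
            b[r] -= lr * gb[r]
    return A, b

pts, disp, A_true, b_true = make_data()
A, b = fit_linear_field(pts, disp)
novel = [0.33, -0.71]  # a point not seen during fitting
pred = [A[r][0] * novel[0] + A[r][1] * novel[1] + b[r] for r in range(2)]
true = [A_true[r][0] * novel[0] + A_true[r][1] * novel[1] + b_true[r]
        for r in range(2)]
max_err = max(abs(p - t) for p, t in zip(pred, true))
```

After fitting, the field generalizes to the held-out query point because the data were generated by a field of the same (linear) family; the paper's neural parameterization plays the analogous role for real, nonlinear motion.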
1 code implementation • 16 Apr 2024 • Yiqian Wu, Hao Xu, Xiangjun Tang, Xien Chen, Siyu Tang, Zhebin Zhang, Chen Li, Xiaogang Jin
Existing neural rendering-based text-to-3D-portrait generation methods typically make use of human geometry prior and diffusion models to obtain guidance.
no code implementations • 25 Mar 2024 • Jonas Hein, Frédéric Giraud, Lilian Calvet, Alexander Schwarz, Nicola Alessandro Cavalcanti, Sergey Prokudin, Mazda Farshad, Siyu Tang, Marc Pollefeys, Fabio Carrillo, Philipp Fürnstahl
Surgery digitalization is the process of creating a virtual replica of real-world surgery, also referred to as a surgical digital twin (SDT).
no code implementations • 23 Feb 2024 • Francis Engelmann, Ayca Takmaz, Jonas Schult, Elisabetta Fedele, Johanna Wald, Songyou Peng, Xi Wang, Or Litany, Siyu Tang, Federico Tombari, Marc Pollefeys, Leonidas Guibas, Hongbo Tian, Chunjie Wang, Xiaosheng Yan, Bingwen Wang, Xuanyang Zhang, Xiao Liu, Phuc Nguyen, Khoi Nguyen, Anh Tran, Cuong Pham, Zhening Huang, Xiaoyang Wu, Xi Chen, Hengshuang Zhao, Lei Zhu, Joan Lasenby
This report provides an overview of the challenge hosted at the OpenSUN3D Workshop on Open-Vocabulary 3D Scene Understanding held in conjunction with ICCV 2023.
no code implementations • 15 Feb 2024 • Theodora Kontogianni, Yuanwen Yue, Siyu Tang, Konrad Schindler
Our paper aims to initiate a paradigm shift, advocating for the adoption of continual learning methods through new experimental protocols that better emulate real-world conditions to facilitate breakthroughs in the field.
no code implementations • CVPR 2024 • Gen Li, Kaifeng Zhao, Siwei Zhang, Xiaozhong Lyu, Mihai Dusmanu, Yan Zhang, Marc Pollefeys, Siyu Tang
To address this challenge, we introduce EgoGen, a new synthetic data generator that can produce accurate and rich ground-truth training data for egocentric perception tasks.
1 code implementation • CVPR 2024 • Siwei Zhang, Bharat Lal Bhatnagar, Yuanlu Xu, Alexander Winkler, Petr Kadlecek, Siyu Tang, Federica Bogo
We apply RoHM to a variety of tasks -- from motion reconstruction and denoising to spatial and temporal infilling.
1 code implementation • CVPR 2024 • Xiyi Chen, Marko Mihajlovic, Shaofei Wang, Sergey Prokudin, Siyu Tang
To the best of our knowledge, our proposed framework is the first diffusion model to enable the creation of fully 3D-consistent, animatable, and photorealistic human avatars from a single image of an unseen subject. Extensive quantitative and qualitative evaluations demonstrate the advantages of our approach over existing state-of-the-art avatar creation models on both novel view and novel expression synthesis tasks.
no code implementations • CVPR 2024 • Korrawe Karunratanakul, Konpat Preechakul, Emre Aksan, Thabo Beeler, Supasorn Suwajanakorn, Siyu Tang
We propose Diffusion Noise Optimization (DNO), a new method that effectively leverages existing motion diffusion models as motion priors for a wide range of motion-related tasks.
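The general pattern behind noise optimization — keep a pretrained generator frozen and optimize its input so the decoded output satisfies a task objective, with a prior term keeping the input well-behaved — can be shown in miniature. The 2-D linear "generator", target, and weights below are all made up for illustration and are not DNO itself:

```python
# Hedged sketch of latent-noise optimization (not the paper's DNO):
# freeze a toy linear "generator", then gradient-descend on its input z
# so the decoded endpoint reaches a target, regularized toward the prior.

W = [[1.0, 0.3], [-0.2, 0.8]]   # frozen "generator" weights (invented)
target = [0.7, -0.4]            # desired endpoint of the edited output
lam = 0.01                      # weight of the prior term ||z||^2

def decode(z):
    return [W[0][0] * z[0] + W[0][1] * z[1],
            W[1][0] * z[0] + W[1][1] * z[1]]

z = [0.0, 0.0]
lr = 0.1
for _ in range(1000):
    y = decode(z)
    r = [y[0] - target[0], y[1] - target[1]]
    # gradient of ||decode(z) - target||^2 + lam * ||z||^2 w.r.t. z
    g = [2 * (W[0][0] * r[0] + W[1][0] * r[1]) + 2 * lam * z[0],
         2 * (W[0][1] * r[0] + W[1][1] * r[1]) + 2 * lam * z[1]]
    z = [z[0] - lr * g[0], z[1] - lr * g[1]]

y = decode(z)
err = max(abs(y[0] - target[0]), abs(y[1] - target[1]))
```

The prior weight `lam` trades task fit against staying near the generator's training distribution; with a diffusion model in place of the toy decoder, the same loop edits motions while the frozen model keeps them plausible.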
1 code implementation • CVPR 2024 • Zhiyin Qian, Shaofei Wang, Marko Mihajlovic, Andreas Geiger, Siyu Tang
In this paper, we use 3D Gaussian Splatting and learn a non-rigid deformation network to reconstruct animatable clothed human avatars that can be trained within 30 minutes and rendered at real-time frame rates (50+ FPS).
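A pipeline like this typically rests on skeletal skinning of the Gaussian centers, with the learned non-rigid network adding pose-dependent offsets on top. The skinning step alone can be sketched as follows (the bones, weights, and center are invented; linear blend skinning is assumed, not taken from the paper):

```python
# Minimal sketch of linear blend skinning (LBS) applied to one Gaussian
# center: x' = sum_b w_b (R_b x + t_b). Bones and weights are toy values.

import math

def rot_z(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def apply(R, t, x):
    return [sum(R[i][k] * x[k] for k in range(3)) + t[i] for i in range(3)]

def lbs(center, weights, transforms):
    """Blend per-bone rigid transforms with skinning weights."""
    out = [0.0, 0.0, 0.0]
    for w, (R, t) in zip(weights, transforms):
        y = apply(R, t, center)
        for i in range(3):
            out[i] += w * y[i]
    return out

bones = [(rot_z(0.0), [0.0, 0.0, 0.0]),           # bone 0: identity
         (rot_z(math.pi / 2), [1.0, 0.0, 0.0])]   # bone 1: rotate + shift
center = [0.5, 0.0, 0.0]                          # one Gaussian's mean
halfway = lbs(center, [0.5, 0.5], bones)          # blend of both bones
```

Deforming only the compact set of Gaussian parameters (rather than a dense implicit field) is what makes real-time rendering rates like those quoted above plausible.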
1 code implementation • CVPR 2024 • Shaofei Wang, Božidar Antić, Andreas Geiger, Siyu Tang
We present IntrinsicAvatar, a novel approach to recovering the intrinsic properties of clothed human avatars including geometry, albedo, material, and environment lighting from only monocular videos.
1 code implementation • 6 Sep 2023 • Marko Mihajlovic, Sergey Prokudin, Marc Pollefeys, Siyu Tang
Neural fields, a category of neural networks trained to represent high-frequency signals, have gained significant attention in recent years due to their impressive performance in modeling complex 3D data, such as signed distance fields (SDFs) or radiance fields (NeRFs), via a single multi-layer perceptron (MLP).
no code implementations • ICCV 2023 • Kaifeng Zhao, Yan Zhang, Shaofei Wang, Thabo Beeler, Siyu Tang
We present a novel method for populating 3D indoor scenes with virtual humans that can navigate in the environment and interact with objects in a realistic manner.
no code implementations • ICCV 2023 • Korrawe Karunratanakul, Konpat Preechakul, Supasorn Suwajanakorn, Siyu Tang
Denoising diffusion models have shown great promise in human motion synthesis conditioned on natural language descriptions.
1 code implementation • ICCV 2023 • Siwei Zhang, Qianli Ma, Yan Zhang, Sadegh Aliakbarian, Darren Cosker, Siyu Tang
One of the biggest challenges of this task is severe body truncation due to close social distances in egocentric scenarios, which introduces large pose ambiguities for unseen body parts.
1 code implementation • ICCV 2023 • Sergey Prokudin, Qianli Ma, Maxime Raafat, Julien Valentin, Siyu Tang
In this work, we present a dynamic point field model that combines the representational benefits of explicit point-based graphics with implicit deformation networks to allow efficient modeling of non-rigid 3D surfaces.
1 code implementation • 2 Feb 2023 • Anpei Chen, Zexiang Xu, Xinyue Wei, Siyu Tang, Hao Su, Andreas Geiger
Our experiments show that DiF leads to improvements in approximation quality, compactness, and training time when compared to previous fast reconstruction methods.
no code implementations • CVPR 2023 • Korrawe Karunratanakul, Sergey Prokudin, Otmar Hilliges, Siyu Tang
We present HARP (HAnd Reconstruction and Personalization), a personalized hand avatar creation approach that takes a short monocular RGB video of a human hand as input and reconstructs a faithful hand avatar exhibiting a high-fidelity appearance and geometry.
no code implementations • ICCV 2023 • Ayça Takmaz, Jonas Schult, Irem Kaftan, Mertcan Akçay, Bastian Leibe, Robert Sumner, Francis Engelmann, Siyu Tang
We address this challenge and propose a framework for generating training data of synthetic humans interacting with real 3D scenes.
no code implementations • 18 Oct 2022 • Shaofei Wang, Katja Schwarz, Andreas Geiger, Siyu Tang
We demonstrate that our proposed pipeline can generate clothed avatars with high-quality pose-dependent geometry and appearance from a sparse set of multi-view RGB videos.
1 code implementation • 6 Oct 2022 • Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe
Modern 3D semantic instance segmentation approaches predominantly rely on specialized voting mechanisms followed by carefully designed geometric clustering techniques.
Ranked #2 on 3D Instance Segmentation on STPLS3D
no code implementations • 14 Sep 2022 • Qianli Ma, Jinlong Yang, Michael J. Black, Siyu Tang
Specifically, we extend point-based methods with a coarse stage that replaces canonicalization with a learned pose-independent "coarse shape" that can capture the rough surface geometry of clothing like skirts.
1 code implementation • 26 Jul 2022 • Kaifeng Zhao, Shaofei Wang, Yan Zhang, Thabo Beeler, Siyu Tang
Furthermore, inspired by the compositional nature of interactions, in which humans can simultaneously interact with multiple objects, we define interaction semantics as the composition of varying numbers of atomic action-object pairs.
1 code implementation • CVPR 2022 • Vasileios Choutas, Lea Muller, Chun-Hao P. Huang, Siyu Tang, Dimitrios Tzionas, Michael J. Black
Since paired data with images and 3D body shape are rare, we exploit two sources of information: (1) we collect internet images of diverse "fashion" models together with a small set of anthropometric measurements; (2) we collect linguistic shape attributes for a wide range of 3D body meshes and the model images.
Ranked #6 on 3D Human Shape Estimation on SSP-3D
1 code implementation • 10 May 2022 • Marko Mihajlovic, Aayush Bansal, Michael Zollhoefer, Siyu Tang, Shunsuke Saito
In this work, we investigate common issues with existing spatial encodings and propose a simple yet highly effective approach to modeling high-fidelity volumetric humans from sparse views.
Ranked #2 on Generalizable Novel View Synthesis on ZJU-MoCap
1 code implementation • CVPR 2022 • Taein Kwon, Bugra Tekin, Siyu Tang, Marc Pollefeys
Temporal alignment of fine-grained human actions in videos is important for numerous applications in computer vision, robotics, and mixed reality.
1 code implementation • 14 Apr 2022 • Theodora Kontogianni, Ekin Celikkan, Siyu Tang, Konrad Schindler
We propose an interactive approach for 3D instance segmentation, where users can iteratively collaborate with a deep learning model to segment objects in a 3D point cloud directly.
Ranked #1 on Interactive 3D Instance Segmentation (trained on ScanNet40, evaluated on ScanNet40) on ScanNetV2
1 code implementation • CVPR 2022 • Marko Mihajlovic, Shunsuke Saito, Aayush Bansal, Michael Zollhoefer, Siyu Tang
We present a novel neural implicit representation for articulated human bodies.
1 code implementation • CVPR 2022 • Hongwei Yi, Chun-Hao P. Huang, Dimitrios Tzionas, Muhammed Kocabas, Mohamed Hassan, Siyu Tang, Justus Thies, Michael J. Black
In fact, we demonstrate that these human-scene interactions (HSIs) can be leveraged to improve the 3D reconstruction of a scene from a monocular RGB video.
1 code implementation • 19 Dec 2021 • Yan Wu, Jiahao Wang, Yan Zhang, Siwei Zhang, Otmar Hilliges, Fisher Yu, Siyu Tang
Given an initial pose and the generated whole-body grasping pose as the start and end of the motion respectively, we design a novel contact-aware generative motion infilling module to generate a diverse set of grasp-oriented motions.
no code implementations • CVPR 2022 • Yan Zhang, Siyu Tang
In our solution, we decompose the long-term motion into a time sequence of motion primitives.
1 code implementation • 14 Dec 2021 • Siwei Zhang, Qianli Ma, Yan Zhang, Zhiyin Qian, Taein Kwon, Marc Pollefeys, Federica Bogo, Siyu Tang
Key to reasoning about interactions is to understand the body pose and motion of the interaction partner from the egocentric view.
no code implementations • 23 Sep 2021 • Korrawe Karunratanakul, Adrian Spurr, Zicong Fan, Otmar Hilliges, Siyu Tang
We present Hand ArticuLated Occupancy (HALO), a novel representation of articulated hands that bridges the advantages of 3D keypoints and neural implicit surfaces and can be used in end-to-end trainable architectures.
no code implementations • ICCV 2021 • Qianli Ma, Jinlong Yang, Siyu Tang, Michael J. Black
The geometry feature can be optimized to fit a previously unseen scan of a person in clothing, enabling the scan to be reposed realistically.
1 code implementation • ICCV 2021 • Siwei Zhang, Yan Zhang, Federica Bogo, Marc Pollefeys, Siyu Tang
To prove the effectiveness of the proposed motion priors, we combine them into a novel pipeline for 4D human body capture in 3D scenes.
1 code implementation • 1 Jul 2021 • Zicong Fan, Adrian Spurr, Muhammed Kocabas, Siyu Tang, Michael J. Black, Otmar Hilliges
In natural conversation and interaction, our hands often overlap or are in contact with each other.
Ranked #7 on 3D Interacting Hand Pose Estimation on InterHand2.6M
1 code implementation • NeurIPS 2021 • Shaofei Wang, Marko Mihajlovic, Qianli Ma, Andreas Geiger, Siyu Tang
In contrast, we propose an approach that can quickly generate realistic clothed human avatars, represented as controllable neural SDFs, given only monocular depth images.
no code implementations • CVPR 2021 • Shaofei Wang, Andreas Geiger, Siyu Tang
We combine PTF with multi-class occupancy networks, obtaining a novel learning-based framework that learns to simultaneously predict shape and per-point correspondences between the posed space and the canonical space for clothed humans.
1 code implementation • CVPR 2021 • Qianli Ma, Shunsuke Saito, Jinlong Yang, Siyu Tang, Michael J. Black
We demonstrate the efficacy of our surface representation by learning models of complex clothing from point clouds.
1 code implementation • CVPR 2021 • Marko Mihajlovic, Yan Zhang, Michael J. Black, Siyu Tang
Substantial progress has been made on modeling rigid 3D objects using deep implicit representations.
1 code implementation • CVPR 2021 • Lea Müller, Ahmed A. A. Osman, Siyu Tang, Chun-Hao P. Huang, Michael J. Black
Third, we develop a novel HPS optimization method, SMPLify-XMC, that includes contact constraints and uses the known 3DCP body pose during fitting to create near ground-truth poses for MTP images.
Ranked #82 on 3D Human Pose Estimation on 3DPW (MPJPE metric)
no code implementations • CVPR 2021 • Yan Zhang, Michael J. Black, Siyu Tang
We note that motion prediction methods accumulate errors over time, resulting in joints or markers that diverge from true human bodies.
1 code implementation • NeurIPS 2020 • Xiaohan Chen, Zhangyang Wang, Siyu Tang, Krikamol Muandet
Meta-learning improves generalization of machine learning models when faced with previously unseen tasks by leveraging experiences from different, yet related prior tasks.
no code implementations • 26 Nov 2020 • Miao Liu, Dexin Yang, Yan Zhang, Zhaopeng Cui, James M. Rehg, Siyu Tang
We introduce a novel task of reconstructing a time series of second-person 3D human body meshes from monocular egocentric videos.
1 code implementation • 12 Aug 2020 • Siwei Zhang, Yan Zhang, Qianli Ma, Michael J. Black, Siyu Tang
To synthesize realistic human-scene interactions, it is essential to effectively represent the physical contact and proximity between the body and the world.
4 code implementations • 10 Aug 2020 • Korrawe Karunratanakul, Jinlong Yang, Yan Zhang, Michael Black, Krikamol Muandet, Siyu Tang
Specifically, our generative model is able to synthesize high-quality human grasps, given only a 3D object point cloud.
1 code implementation • ECCV 2020 • Xucong Zhang, Seonwook Park, Thabo Beeler, Derek Bradley, Siyu Tang, Otmar Hilliges
We show that our dataset can significantly improve the robustness of gaze estimation methods across different head poses and gaze angles.
Ranked #1 on Gaze Estimation on ETH-XGaze (using extra training data)
no code implementations • 27 Jul 2020 • Yan Zhang, Michael J. Black, Siyu Tang
To address this problem, we propose a model to generate non-deterministic, ever-changing, perpetual human motion, in which the global trajectory and the body pose are cross-conditioned.
3 code implementations • CVPR 2020 • Yan Zhang, Mohamed Hassan, Heiko Neumann, Michael J. Black, Siyu Tang
However, this is a challenging task for a computer, as solving it requires that (1) the generated human bodies be semantically plausible within the 3D environment (e.g., people sitting on the sofa or cooking near the stove), and (2) the generated human-scene interaction be physically feasible, such that the human body and scene do not interpenetrate while, at the same time, body-scene contact supports physical interactions.
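The physical-feasibility requirement is commonly enforced with a penetration penalty computed from a scene signed distance field (SDF). A hedged, self-contained illustration — a single unit cube standing in for the scene, a few points standing in for the body, and a generic loss form rather than the paper's exact one:

```python
# Illustrative penetration penalty from a scene SDF (generic form, not the
# paper's exact loss). The "scene" is one axis-aligned cube at the origin.

def box_sdf(p, half=0.5):
    """Signed distance to an axis-aligned box centered at the origin
    (negative inside the box)."""
    q = [abs(c) - half for c in p]
    outside = sum(max(x, 0.0) ** 2 for x in q) ** 0.5
    inside = min(max(q), 0.0)
    return outside + inside

def penetration_loss(points):
    # Penalize body points that fall inside scene geometry (sdf < 0).
    return sum(max(-box_sdf(p), 0.0) for p in points)

body_outside = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]]
body_inside = [[0.0, 0.0, 0.0], [0.2, 0.1, 0.0]]
loss_out = penetration_loss(body_outside)  # 0.0: no penetration
loss_in = penetration_loss(body_inside)    # 0.8 = 0.5 + 0.3
```

Minimizing such a term during body placement pushes interpenetrating vertices back to the scene surface, while a complementary contact term (not shown) rewards sdf values near zero for designated contact regions.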
1 code implementation • ECCV 2020 • Miao Liu, Siyu Tang, Yin Li, James Rehg
Motivated by this, we adopt intentional hand movement as a future representation and propose a novel deep network that jointly models and predicts the egocentric hand motion, interaction hotspots and future action.
2 code implementations • 24 Oct 2019 • Anurag Ranjan, David T. Hoffmann, Dimitrios Tzionas, Siyu Tang, Javier Romero, Michael J. Black
Therefore, we develop a dataset of multi-human optical flow and train optical flow networks on this dataset.
2 code implementations • 2 Aug 2019 • David T. Hoffmann, Dimitrios Tzionas, Michael J. Black, Siyu Tang
Here we explore two variations of synthetic data for this challenging problem; a dataset with purely synthetic humans and a real dataset augmented with synthetic humans.
1 code implementation • CVPR 2020 • Qianli Ma, Jinlong Yang, Anurag Ranjan, Sergi Pujades, Gerard Pons-Moll, Siyu Tang, Michael J. Black
To our knowledge, this is the first generative model that directly dresses 3D human body meshes and generalizes to different poses.
1 code implementation • 3 Jun 2019 • Yan Zhang, Krikamol Muandet, Qianli Ma, Heiko Neumann, Siyu Tang
In this paper, we propose an approach to representing high-order information for temporal action segmentation via a simple yet effective bilinear form.
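The generic form such bilinear representations build on — average the outer product of each frame's feature with itself over a temporal window, then apply signed square-root and L2 normalization — can be sketched directly. The details below are illustrative, not the paper's exact formulation:

```python
# Sketch of standard bilinear pooling over a temporal window (illustrative;
# the paper's formulation differs in its specifics).

def bilinear_pool(frames):
    d = len(frames[0])
    pooled = [[0.0] * d for _ in range(d)]
    for f in frames:                 # average outer products f f^T
        for i in range(d):
            for j in range(d):
                pooled[i][j] += f[i] * f[j] / len(frames)
    flat = [v for row in pooled for v in row]
    # signed square-root, then L2 normalization
    flat = [(1 if v >= 0 else -1) * abs(v) ** 0.5 for v in flat]
    norm = sum(v * v for v in flat) ** 0.5 or 1.0
    return [v / norm for v in flat]

window = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy per-frame features
desc = bilinear_pool(window)                   # length d*d descriptor
```

The resulting descriptor captures second-order (co-activation) statistics of frame features within the window, which is the "high-order information" the bilinear form provides over plain average pooling.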
no code implementations • ICCV 2019 • Jie Song, Bjoern Andres, Michael Black, Otmar Hilliges, Siyu Tang
The new optimization problem can be viewed as a Conditional Random Field (CRF) in which the random variables are associated with the binary edge labels of the initial graph and the hard constraints are introduced in the CRF as high-order potentials.
1 code implementation • CVPR 2019 • Yan Zhang, Siyu Tang, Krikamol Muandet, Christian Jarvers, Heiko Neumann
Fine-grained temporal action parsing is important in many applications, such as daily activity understanding, human motion analysis, surgical robotics, and other applications requiring subtle and precise operations over long time periods.
no code implementations • ECCV 2018 • Yumin Suh, Jingdong Wang, Siyu Tang, Tao Mei, Kyoung Mu Lee
We propose a novel network that learns a part-aligned representation for person re-identification.
Ranked #4 on Person Re-Identification on UAV-Human
1 code implementation • 15 Mar 2018 • Yan Zhang, He Sun, Siyu Tang, Heiko Neumann
We present an effective dynamic clustering algorithm for the task of temporal human action segmentation, which has comprehensive applications such as robotics, motion analysis, and patient monitoring.
no code implementations • CVPR 2017 • Siyu Tang, Mykhaylo Andriluka, Bjoern Andres, Bernt Schiele
This allows us to reward tracks that assign detections of similar appearance to the same person in a way that does not introduce implausible solutions.
no code implementations • CVPR 2017 • Evgeny Levinkov, Jonas Uhrig, Siyu Tang, Mohamed Omran, Eldar Insafutdinov, Alexander Kirillov, Carsten Rother, Thomas Brox, Bernt Schiele, Bjoern Andres
In order to find feasible solutions efficiently, we define two local search algorithms that converge monotonously to a local optimum, offering a feasible solution at any time.
no code implementations • CVPR 2017 • Anna Rohrbach, Marcus Rohrbach, Siyu Tang, Seong Joon Oh, Bernt Schiele
At training time, we first learn how to localize characters by relating their visual appearance to mentions in the descriptions via a semi-supervised approach.
14 code implementations • CVPR 2017 • Eldar Insafutdinov, Mykhaylo Andriluka, Leonid Pishchulin, Siyu Tang, Evgeny Levinkov, Bjoern Andres, Bernt Schiele
In this paper we propose an approach for articulated tracking of multiple people in unconstrained videos.
Ranked #7 on Keypoint Detection on MPII Multi-Person
1 code implementation • 14 Nov 2016 • Evgeny Levinkov, Jonas Uhrig, Siyu Tang, Mohamed Omran, Eldar Insafutdinov, Alexander Kirillov, Carsten Rother, Thomas Brox, Bernt Schiele, Bjoern Andres
In order to find feasible solutions efficiently, we define two local search algorithms that converge monotonously to a local optimum, offering a feasible solution at any time.
no code implementations • 17 Aug 2016 • Siyu Tang, Bjoern Andres, Mykhaylo Andriluka, Bernt Schiele
In [1], we proposed a graph-based formulation that links and clusters person hypotheses over time by solving a minimum cost subgraph multicut problem.
no code implementations • 21 Jul 2016 • Margret Keuper, Siyu Tang, Yu Zhongjie, Bjoern Andres, Thomas Brox, Bernt Schiele
Recently, Minimum Cost Multicut Formulations have been proposed and proven to be successful in both motion trajectory segmentation and multi-target tracking scenarios.
4 code implementations • CVPR 2016 • Leonid Pishchulin, Eldar Insafutdinov, Siyu Tang, Bjoern Andres, Mykhaylo Andriluka, Peter Gehler, Bernt Schiele
This paper considers the task of articulated human pose estimation of multiple people in real world images.
Ranked #2 on Multi-Person Pose Estimation on WAF
no code implementations • CVPR 2015 • Siyu Tang, Bjoern Andres, Mykhaylo Andriluka, Bernt Schiele
Tracking multiple targets in a video, based on a finite set of detection hypotheses, is a persistent problem in computer vision.