Search Results for author: Siyu Tang

Found 66 papers, 35 papers with code

Portrait3D: Text-Guided High-Quality 3D Portrait Generation Using Pyramid Representation and GANs Prior

no code implementations 16 Apr 2024 Yiqian Wu, Hao Xu, Xiangjun Tang, Xien Chen, Siyu Tang, Zhebin Zhang, Chen Li, Xiaogang Jin

Existing neural rendering-based text-to-3D-portrait generation methods typically make use of human geometry prior and diffusion models to obtain guidance.

Neural Rendering Text to 3D

Creating a Digital Twin of Spinal Surgery: A Proof of Concept

no code implementations 25 Mar 2024 Jonas Hein, Frederic Giraud, Lilian Calvet, Alexander Schwarz, Nicola Alessandro Cavalcanti, Sergey Prokudin, Mazda Farshad, Siyu Tang, Marc Pollefeys, Fabio Carrillo, Philipp Fürnstahl

In this paper, we present a proof of concept (PoC) for surgery digitalization that is applied to an ex-vivo spinal surgery performed in realistic conditions.

3D Reconstruction Anatomy

Is Continual Learning Ready for Real-world Challenges?

no code implementations 15 Feb 2024 Theodora Kontogianni, Yuanwen Yue, Siyu Tang, Konrad Schindler

Our paper aims to initiate a paradigm shift, advocating for the adoption of continual learning methods through new experimental protocols that better emulate real-world conditions to facilitate breakthroughs in the field.

3D Semantic Segmentation Continual Learning

EgoGen: An Egocentric Synthetic Data Generator

no code implementations 16 Jan 2024 Gen Li, Kaifeng Zhao, Siwei Zhang, Xiaozhong Lyu, Mihai Dusmanu, Yan Zhang, Marc Pollefeys, Siyu Tang

To address this challenge, we introduce EgoGen, a new synthetic data generator that can produce accurate and rich ground-truth training data for egocentric perception tasks.

Human Mesh Recovery Motion Synthesis

RoHM: Robust Human Motion Reconstruction via Diffusion

no code implementations 16 Jan 2024 Siwei Zhang, Bharat Lal Bhatnagar, Yuanlu Xu, Alexander Winkler, Petr Kadlecek, Siyu Tang, Federica Bogo

We apply RoHM to a variety of tasks -- from motion reconstruction and denoising to spatial and temporal infilling.


Morphable Diffusion: 3D-Consistent Diffusion for Single-image Avatar Creation

no code implementations 9 Jan 2024 Xiyi Chen, Marko Mihajlovic, Shaofei Wang, Sergey Prokudin, Siyu Tang

To the best of our knowledge, our proposed framework is the first diffusion model to enable the creation of fully 3D-consistent, animatable, and photorealistic human avatars from a single image of an unseen subject; extensive quantitative and qualitative evaluations demonstrate the advantages of our approach over existing state-of-the-art avatar creation models on both novel view and novel expression synthesis tasks.

Novel View Synthesis

Optimizing Diffusion Noise Can Serve As Universal Motion Priors

no code implementations 19 Dec 2023 Korrawe Karunratanakul, Konpat Preechakul, Emre Aksan, Thabo Beeler, Supasorn Suwajanakorn, Siyu Tang

We propose Diffusion Noise Optimization (DNO), a new method that effectively leverages existing motion diffusion models as motion priors for a wide range of motion-related tasks.


3DGS-Avatar: Animatable Avatars via Deformable 3D Gaussian Splatting

1 code implementation 14 Dec 2023 Zhiyin Qian, Shaofei Wang, Marko Mihajlovic, Andreas Geiger, Siyu Tang

In this paper, we use 3D Gaussian Splatting and learn a non-rigid deformation network to reconstruct animatable clothed human avatars that can be trained within 30 minutes and rendered at real-time frame rates (50+ FPS).

Image Generation

IntrinsicAvatar: Physically Based Inverse Rendering of Dynamic Humans from Monocular Videos via Explicit Ray Tracing

no code implementations 8 Dec 2023 Shaofei Wang, Božidar Antić, Andreas Geiger, Siyu Tang

We present IntrinsicAvatar, a novel approach to recovering the intrinsic properties of clothed human avatars including geometry, albedo, material, and environment lighting from only monocular videos.

Disentanglement Inverse Rendering +1

ResFields: Residual Neural Fields for Spatiotemporal Signals

1 code implementation 6 Sep 2023 Marko Mihajlovic, Sergey Prokudin, Marc Pollefeys, Siyu Tang

Neural fields, a category of neural networks trained to represent high-frequency signals, have gained significant attention in recent years due to their impressive performance in modeling complex 3D data, such as signed distance (SDFs) or radiance fields (NeRFs), via a single multi-layer perceptron (MLP).

4D reconstruction Neural Rendering

Synthesizing Diverse Human Motions in 3D Indoor Scenes

no code implementations ICCV 2023 Kaifeng Zhao, Yan Zhang, Shaofei Wang, Thabo Beeler, Siyu Tang

We present a novel method for populating 3D indoor scenes with virtual humans that can navigate in the environment and interact with objects in a realistic manner.

Collision Avoidance Human-Object Interaction Detection +1

Probabilistic Human Mesh Recovery in 3D Scenes from Egocentric Views

1 code implementation ICCV 2023 Siwei Zhang, Qianli Ma, Yan Zhang, Sadegh Aliakbarian, Darren Cosker, Siyu Tang

One of the biggest challenges of this task is severe body truncation due to close social distances in egocentric scenarios, which brings large pose ambiguities for unseen body parts.

Human Mesh Recovery

Dynamic Point Fields

no code implementations ICCV 2023 Sergey Prokudin, Qianli Ma, Maxime Raafat, Julien Valentin, Siyu Tang

In this work, we present a dynamic point field model that combines the representational benefits of explicit point-based graphics with implicit deformation networks to allow efficient modeling of non-rigid 3D surfaces.

Surface Reconstruction

Factor Fields: A Unified Framework for Neural Fields and Beyond

1 code implementation 2 Feb 2023 Anpei Chen, Zexiang Xu, Xinyue Wei, Siyu Tang, Hao Su, Andreas Geiger

Our experiments show that DiF leads to improvements in approximation quality, compactness, and training time when compared to previous fast reconstruction methods.


HARP: Personalized Hand Reconstruction from a Monocular RGB Video

no code implementations CVPR 2023 Korrawe Karunratanakul, Sergey Prokudin, Otmar Hilliges, Siyu Tang

We present HARP (HAnd Reconstruction and Personalization), a personalized hand avatar creation approach that takes a short monocular RGB video of a human hand as input and reconstructs a faithful hand avatar exhibiting a high-fidelity appearance and geometry.

3D Hand Pose Estimation

ARAH: Animatable Volume Rendering of Articulated Human SDFs

no code implementations 18 Oct 2022 Shaofei Wang, Katja Schwarz, Andreas Geiger, Siyu Tang

We demonstrate that our proposed pipeline can generate clothed avatars with high-quality pose-dependent geometry and appearance from a sparse set of multi-view RGB videos.

Mask3D: Mask Transformer for 3D Semantic Instance Segmentation

1 code implementation 6 Oct 2022 Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe

Modern 3D semantic instance segmentation approaches predominantly rely on specialized voting mechanisms followed by carefully designed geometric clustering techniques.

3D Instance Segmentation 3D Semantic Instance Segmentation +1

Neural Point-based Shape Modeling of Humans in Challenging Clothing

no code implementations 14 Sep 2022 Qianli Ma, Jinlong Yang, Michael J. Black, Siyu Tang

Specifically, we extend point-based methods with a coarse stage that replaces canonicalization with a learned pose-independent "coarse shape" that can capture the rough surface geometry of clothing like skirts.

Compositional Human-Scene Interaction Synthesis with Semantic Control

1 code implementation 26 Jul 2022 Kaifeng Zhao, Shaofei Wang, Yan Zhang, Thabo Beeler, Siyu Tang

Furthermore, inspired by the compositional nature of interactions that humans can simultaneously interact with multiple objects, we define interaction semantics as the composition of varying numbers of atomic action-object pairs.

Instance Segmentation Semantic Segmentation

Accurate 3D Body Shape Regression using Metric and Semantic Attributes

1 code implementation CVPR 2022 Vasileios Choutas, Lea Muller, Chun-Hao P. Huang, Siyu Tang, Dimitrios Tzionas, Michael J. Black

Since paired data with images and 3D body shape are rare, we exploit two sources of information: (1) we collect internet images of diverse "fashion" models together with a small set of anthropometric measurements; (2) we collect linguistic shape attributes for a wide range of 3D body meshes and the model images.

3D Human Reconstruction 3D Human Shape Estimation

KeypointNeRF: Generalizing Image-based Volumetric Avatars using Relative Spatial Encoding of Keypoints

1 code implementation 10 May 2022 Marko Mihajlovic, Aayush Bansal, Michael Zollhoefer, Siyu Tang, Shunsuke Saito

In this work, we investigate common issues with existing spatial encodings and propose a simple yet highly effective approach to modeling high-fidelity volumetric humans from sparse views.

3D Face Reconstruction 3D Human Reconstruction +2

Context-Aware Sequence Alignment using 4D Skeletal Augmentation

1 code implementation CVPR 2022 Taein Kwon, Bugra Tekin, Siyu Tang, Marc Pollefeys

Temporal alignment of fine-grained human actions in videos is important for numerous applications in computer vision, robotics, and mixed reality.

Hand Pose Estimation Mixed Reality +1

Interactive Object Segmentation in 3D Point Clouds

1 code implementation 14 Apr 2022 Theodora Kontogianni, Ekin Celikkan, Siyu Tang, Konrad Schindler

We propose an interactive approach for 3D instance segmentation, where users can iteratively collaborate with a deep learning model to segment objects in a 3D point cloud directly.

3D Instance Segmentation Image Segmentation +4

Human-Aware Object Placement for Visual Environment Reconstruction

1 code implementation CVPR 2022 Hongwei Yi, Chun-Hao P. Huang, Dimitrios Tzionas, Muhammed Kocabas, Mohamed Hassan, Siyu Tang, Justus Thies, Michael J. Black

In fact, we demonstrate that these human-scene interactions (HSIs) can be leveraged to improve the 3D reconstruction of a scene from a monocular RGB video.

3D Reconstruction Object

SAGA: Stochastic Whole-Body Grasping with Contact

1 code implementation 19 Dec 2021 Yan Wu, Jiahao Wang, Yan Zhang, Siwei Zhang, Otmar Hilliges, Fisher Yu, Siyu Tang

Given an initial pose and the generated whole-body grasping pose as the start and end of the motion respectively, we design a novel contact-aware generative motion infilling module to generate a diverse set of grasp-oriented motions.


The Wanderings of Odysseus in 3D Scenes

no code implementations CVPR 2022 Yan Zhang, Siyu Tang

In our solution, we decompose the long-term motion into a time sequence of motion primitives.

EgoBody: Human Body Shape and Motion of Interacting People from Head-Mounted Devices

1 code implementation 14 Dec 2021 Siwei Zhang, Qianli Ma, Yan Zhang, Zhiyin Qian, Taein Kwon, Marc Pollefeys, Federica Bogo, Siyu Tang

Key to reasoning about interactions is to understand the body pose and motion of the interaction partner from the egocentric view.

Motion Estimation

A Skeleton-Driven Neural Occupancy Representation for Articulated Hands

no code implementations 23 Sep 2021 Korrawe Karunratanakul, Adrian Spurr, Zicong Fan, Otmar Hilliges, Siyu Tang

We present Hand ArticuLated Occupancy (HALO), a novel representation of articulated hands that bridges the advantages of 3D keypoints and neural implicit surfaces and can be used in end-to-end trainable architectures.

The Power of Points for Modeling Humans in Clothing

no code implementations ICCV 2021 Qianli Ma, Jinlong Yang, Siyu Tang, Michael J. Black

The geometry feature can be optimized to fit a previously unseen scan of a person in clothing, enabling the scan to be reposed realistically.

Learning Motion Priors for 4D Human Body Capture in 3D Scenes

1 code implementation ICCV 2021 Siwei Zhang, Yan Zhang, Federica Bogo, Marc Pollefeys, Siyu Tang

To prove the effectiveness of the proposed motion priors, we combine them into a novel pipeline for 4D human body capture in 3D scenes.


MetaAvatar: Learning Animatable Clothed Human Models from Few Depth Images

1 code implementation NeurIPS 2021 Shaofei Wang, Marko Mihajlovic, Qianli Ma, Andreas Geiger, Siyu Tang

In contrast, we propose an approach that can quickly generate realistic clothed human avatars, represented as controllable neural SDFs, given only monocular depth images.


Locally Aware Piecewise Transformation Fields for 3D Human Mesh Registration

no code implementations CVPR 2021 Shaofei Wang, Andreas Geiger, Siyu Tang

We combine PTF with multi-class occupancy networks, obtaining a novel learning-based framework that learns to simultaneously predict shape and per-point correspondences between the posed space and the canonical space for clothed humans.

Surface Reconstruction Translation

LEAP: Learning Articulated Occupancy of People

1 code implementation CVPR 2021 Marko Mihajlovic, Yan Zhang, Michael J. Black, Siyu Tang

Substantial progress has been made on modeling rigid 3D objects using deep implicit representations.

On Self-Contact and Human Pose

1 code implementation CVPR 2021 Lea Müller, Ahmed A. A. Osman, Siyu Tang, Chun-Hao P. Huang, Michael J. Black

Third, we develop a novel HPS optimization method, SMPLify-XMC, that includes contact constraints and uses the known 3DCP body pose during fitting to create near ground-truth poses for MTP images.

Ranked #73 on 3D Human Pose Estimation on 3DPW (MPJPE metric)

3D Human Pose Estimation

We are More than Our Joints: Predicting how 3D Bodies Move

no code implementations CVPR 2021 Yan Zhang, Michael J. Black, Siyu Tang

We note that motion prediction methods accumulate errors over time, resulting in joints or markers that diverge from true human bodies.

Human motion prediction motion prediction +2

MATE: Plugging in Model Awareness to Task Embedding for Meta Learning

1 code implementation NeurIPS 2020 Xiaohan Chen, Zhangyang Wang, Siyu Tang, Krikamol Muandet

Meta-learning improves generalization of machine learning models when faced with previously unseen tasks by leveraging experiences from different, yet related prior tasks.

feature selection Few-Shot Learning

4D Human Body Capture from Egocentric Video via 3D Scene Grounding

no code implementations 26 Nov 2020 Miao Liu, Dexin Yang, Yan Zhang, Zhaopeng Cui, James M. Rehg, Siyu Tang

We introduce a novel task of reconstructing a time series of second-person 3D human body meshes from monocular egocentric videos.

Time Series Time Series Analysis

PLACE: Proximity Learning of Articulation and Contact in 3D Environments

1 code implementation 12 Aug 2020 Siwei Zhang, Yan Zhang, Qianli Ma, Michael J. Black, Siyu Tang

To synthesize realistic human-scene interactions, it is essential to effectively represent the physical contact and proximity between the body and the world.

Grasping Field: Learning Implicit Representations for Human Grasps

3 code implementations 10 Aug 2020 Korrawe Karunratanakul, Jinlong Yang, Yan Zhang, Michael Black, Krikamol Muandet, Siyu Tang

Specifically, our generative model is able to synthesize high-quality human grasps, given only a 3D object point cloud.

3D Object Reconstruction Grasp Generation +2

ETH-XGaze: A Large Scale Dataset for Gaze Estimation under Extreme Head Pose and Gaze Variation

1 code implementation ECCV 2020 Xucong Zhang, Seonwook Park, Thabo Beeler, Derek Bradley, Siyu Tang, Otmar Hilliges

We show that our dataset can significantly improve the robustness of gaze estimation methods across different head poses and gaze angles.

Ranked #1 on Gaze Estimation on ETH-XGaze (using extra training data)

Gaze Estimation

Perpetual Motion: Generating Unbounded Human Motion

no code implementations 27 Jul 2020 Yan Zhang, Michael J. Black, Siyu Tang

To address this problem, we propose a model to generate non-deterministic, ever-changing, perpetual human motion, in which the global trajectory and the body pose are cross-conditioned.

Motion Estimation Time Series Analysis

Generating 3D People in Scenes without People

3 code implementations CVPR 2020 Yan Zhang, Mohamed Hassan, Heiko Neumann, Michael J. Black, Siyu Tang

However, this is a challenging task for a computer, as solving it requires that (1) the generated human bodies be semantically plausible within the 3D environment (e.g. people sitting on the sofa or cooking near the stove), and (2) the generated human-scene interaction be physically feasible, such that the human body and scene do not interpenetrate while, at the same time, body-scene contact supports physical interactions.

Pose Estimation

Forecasting Human-Object Interaction: Joint Prediction of Motor Attention and Actions in First Person Video

1 code implementation ECCV 2020 Miao Liu, Siyu Tang, Yin Li, James Rehg

Motivated by this, we adopt intentional hand movement as a future representation and propose a novel deep network that jointly models and predicts the egocentric hand motion, interaction hotspots and future action.

Action Anticipation Human-Object Interaction Detection

Learning Multi-Human Optical Flow

2 code implementations 24 Oct 2019 Anurag Ranjan, David T. Hoffmann, Dimitrios Tzionas, Siyu Tang, Javier Romero, Michael J. Black

Therefore, we develop a dataset of multi-human optical flow and train optical flow networks on this dataset.

Optical Flow Estimation

Learning to Train with Synthetic Humans

2 code implementations 2 Aug 2019 David T. Hoffmann, Dimitrios Tzionas, Michael J. Black, Siyu Tang

Here we explore two variations of synthetic data for this challenging problem: a dataset with purely synthetic humans and a real dataset augmented with synthetic humans.

2D Pose Estimation Pose Estimation

Frontal Low-rank Random Tensors for Fine-grained Action Segmentation

1 code implementation 3 Jun 2019 Yan Zhang, Krikamol Muandet, Qianli Ma, Heiko Neumann, Siyu Tang

In this paper, we propose an approach to representing high-order information for temporal action segmentation via a simple yet effective bilinear form.

Action Parsing Action Segmentation +1

End-to-end Learning for Graph Decomposition

no code implementations ICCV 2019 Jie Song, Bjoern Andres, Michael Black, Otmar Hilliges, Siyu Tang

The new optimization problem can be viewed as a Conditional Random Field (CRF) in which the random variables are associated with the binary edge labels of the initial graph and the hard constraints are introduced in the CRF as high-order potentials.

Clustering Multi-Person Pose Estimation

Local Temporal Bilinear Pooling for Fine-grained Action Parsing

1 code implementation CVPR 2019 Yan Zhang, Siyu Tang, Krikamol Muandet, Christian Jarvers, Heiko Neumann

Fine-grained temporal action parsing is important in many applications, such as daily activity understanding, human motion analysis, surgical robotics and others requiring subtle and precise operations in a long-term period.

Action Parsing

Temporal Human Action Segmentation via Dynamic Clustering

1 code implementation 15 Mar 2018 Yan Zhang, He Sun, Siyu Tang, Heiko Neumann

We present an effective dynamic clustering algorithm for the task of temporal human action segmentation, which has comprehensive applications such as robotics, motion analysis, and patient monitoring.

Action Segmentation Clustering

Multiple People Tracking by Lifted Multicut and Person Re-Identification

no code implementations CVPR 2017 Siyu Tang, Mykhaylo Andriluka, Bjoern Andres, Bernt Schiele

This allows us to reward tracks that assign detections of similar appearance to the same person in a way that does not introduce implausible solutions.

Multiple People Tracking Person Re-Identification +1

Generating Descriptions with Grounded and Co-Referenced People

no code implementations CVPR 2017 Anna Rohrbach, Marcus Rohrbach, Siyu Tang, Seong Joon Oh, Bernt Schiele

At training time, we first learn how to localize characters by relating their visual appearance to mentions in the descriptions via a semi-supervised approach.

Joint Graph Decomposition and Node Labeling: Problem, Algorithms, Applications

1 code implementation 14 Nov 2016 Evgeny Levinkov, Jonas Uhrig, Siyu Tang, Mohamed Omran, Eldar Insafutdinov, Alexander Kirillov, Carsten Rother, Thomas Brox, Bernt Schiele, Bjoern Andres

In order to find feasible solutions efficiently, we define two local search algorithms that converge monotonically to a local optimum, offering a feasible solution at any time.

Combinatorial Optimization Multiple Object Tracking +2

Multi-Person Tracking by Multicut and Deep Matching

no code implementations 17 Aug 2016 Siyu Tang, Bjoern Andres, Mykhaylo Andriluka, Bernt Schiele

In [1], we proposed a graph-based formulation that links and clusters person hypotheses over time by solving a minimum cost subgraph multicut problem.

A Multi-cut Formulation for Joint Segmentation and Tracking of Multiple Objects

no code implementations 21 Jul 2016 Margret Keuper, Siyu Tang, Yu Zhongjie, Bjoern Andres, Thomas Brox, Bernt Schiele

Recently, Minimum Cost Multicut Formulations have been proposed and proven to be successful in both motion trajectory segmentation and multi-target tracking scenarios.

Motion Segmentation object-detection +2

Subgraph Decomposition for Multi-Target Tracking

no code implementations CVPR 2015 Siyu Tang, Bjoern Andres, Mykhaylo Andriluka, Bernt Schiele

Tracking multiple targets in a video, based on a finite set of detection hypotheses, is a persistent problem in computer vision.

