Search Results for author: Cristian Sminchisescu

Found 76 papers, 16 papers with code

VLOGGER: Multimodal Diffusion for Embodied Avatar Synthesis

no code implementations 13 Mar 2024 Enric Corona, Andrei Zanfir, Eduard Gabriel Bazavan, Nikos Kolotouros, Thiemo Alldieck, Cristian Sminchisescu

We propose VLOGGER, a method for audio-driven human video generation from a single input image of a person, which builds on the success of recent generative diffusion models.

Face Detection Video Editing +1

Score Distillation Sampling with Learned Manifold Corrective

no code implementations 10 Jan 2024 Thiemo Alldieck, Nikos Kolotouros, Cristian Sminchisescu

Score Distillation Sampling (SDS) is a recent but already widely popular method that relies on an image diffusion model to control optimization problems using text prompts.

Denoising Image Generation +1

Personalized Pose Forecasting

no code implementations 6 Dec 2023 Maria Priisalu, Ted Kronvall, Cristian Sminchisescu

Human pose forecasting is the task of predicting articulated human motion given past human motion.

Human Pose Forecasting Motion Forecasting +2

SPHEAR: Spherical Head Registration for Complete Statistical 3D Modeling

no code implementations 4 Nov 2023 Eduard Gabriel Bazavan, Andrei Zanfir, Thiemo Alldieck, Teodor Alexandru Szente, Mihai Zanfir, Cristian Sminchisescu

We present SPHEAR, an accurate, differentiable parametric statistical 3D human head model, enabled by a novel 3D registration method based on spherical embeddings.

Blendshapes GHUM: Real-time Monocular Facial Blendshape Prediction

no code implementations 11 Sep 2023 Ivan Grishchenko, Geng Yan, Eduard Gabriel Bazavan, Andrei Zanfir, Nikolai Chinaev, Karthik Raveendran, Matthias Grundmann, Cristian Sminchisescu

We present Blendshapes GHUM, an on-device ML pipeline that predicts 52 facial blendshape coefficients at 30+ FPS on modern mobile phones from a single monocular RGB image, and enables facial motion capture applications like virtual avatars.

PhoMoH: Implicit Photorealistic 3D Models of Human Heads

no code implementations 14 Dec 2022 Mihai Zanfir, Thiemo Alldieck, Cristian Sminchisescu

We present PhoMoH, a neural network methodology to construct generative models of photo-realistic 3D geometry and appearance of human heads including hair, beards, an oral cavity, and clothing.

Structured 3D Features for Reconstructing Controllable Avatars

no code implementations CVPR 2023 Enric Corona, Mihai Zanfir, Thiemo Alldieck, Eduard Gabriel Bazavan, Andrei Zanfir, Cristian Sminchisescu

We introduce Structured 3D Features, a model based on a novel implicit 3D representation that pools pixel-aligned image features onto dense 3D points sampled from a parametric, statistical human mesh surface.

3D Reconstruction Novel View Synthesis +1

Transformer-Based Learned Optimization

no code implementations CVPR 2023 Erik Gärtner, Luke Metz, Mykhaylo Andriluka, C. Daniel Freeman, Cristian Sminchisescu

We propose a new approach to learned optimization where we represent the computation of an optimizer's update step using a neural network.

BlazePose GHUM Holistic: Real-time 3D Human Landmarks and Pose Estimation

no code implementations 23 Jun 2022 Ivan Grishchenko, Valentin Bazarevsky, Andrei Zanfir, Eduard Gabriel Bazavan, Mihai Zanfir, Richard Yee, Karthik Raveendran, Matsvei Zhdanovich, Matthias Grundmann, Cristian Sminchisescu

We present BlazePose GHUM Holistic, a lightweight neural network pipeline for 3D human body landmarks and pose estimation, specifically tailored to real-time on-device inference.

3D Human Pose Estimation

Photorealistic Monocular 3D Reconstruction of Humans Wearing Clothing

no code implementations CVPR 2022 Thiemo Alldieck, Mihai Zanfir, Cristian Sminchisescu

We present PHORHUM, a novel, end-to-end trainable, deep neural network methodology for photorealistic 3D human reconstruction given just a monocular RGB image.

3D Human Reconstruction 3D Reconstruction

BEHAVE: Dataset and Method for Tracking Human Object Interactions

1 code implementation CVPR 2022 Bharat Lal Bhatnagar, Xianghui Xie, Ilya A. Petrov, Cristian Sminchisescu, Christian Theobalt, Gerard Pons-Moll

We present the BEHAVE dataset, the first full-body human-object interaction dataset with multi-view RGBD frames and corresponding 3D SMPL and object fits along with the annotated contacts between them.

Human-Object Interaction Detection Mixed Reality +1

Learning Online Multi-Sensor Depth Fusion

1 code implementation 7 Apr 2022 Erik Sandström, Martin R. Oswald, Suryansh Kumar, Silvan Weder, Fisher Yu, Cristian Sminchisescu, Luc van Gool

Multi-sensor depth fusion is able to substantially improve the robustness and accuracy of 3D reconstruction methods, but existing techniques are not robust enough to handle sensors which operate with diverse value ranges as well as noise and outlier statistics.

3D Reconstruction Mixed Reality +1

Discriminating Against Unrealistic Interpolations in Generative Adversarial Networks

1 code implementation 2 Mar 2022 Henning Petzka, Ted Kronvall, Cristian Sminchisescu

By reusing the discriminator network to modify the metric on the latent space, we propose a lightweight solution for improved interpolations in pre-trained GANs.

Embodied Learning for Lifelong Visual Perception

no code implementations 28 Dec 2021 David Nilsson, Aleksis Pirinen, Erik Gärtner, Cristian Sminchisescu

As we study this task in a lifelong learning context, the agents should use knowledge gained in earlier visited environments in order to guide their exploration and active learning strategy in successively visited buildings.

Active Learning Navigate +2

HSPACE: Synthetic Parametric Humans Animated in Complex Environments

no code implementations 23 Dec 2021 Eduard Gabriel Bazavan, Andrei Zanfir, Mihai Zanfir, William T. Freeman, Rahul Sukthankar, Cristian Sminchisescu

We combine a hundred diverse individuals of varying ages, gender, proportions, and ethnicity, with hundreds of motions and scenes, as well as parametric variations in body shape (for a total of 1,600 different humans), in order to generate an initial dataset of over 1 million frames.

3D Human Pose Estimation Scene Understanding

imGHUM: Implicit Generative Models of 3D Human Shape and Articulated Pose

1 code implementation ICCV 2021 Thiemo Alldieck, Hongyi Xu, Cristian Sminchisescu

We present imGHUM, the first holistic generative model of 3D human shape and articulated pose, represented as a signed distance function.

AIFit: Automatic 3D Human-Interpretable Feedback Models for Fitness Training

no code implementations CVPR 2021 Mihai Fieraru, Mihai Zanfir, Silviu Cristian Pirlea, Vlad Olaru, Cristian Sminchisescu

AIFit is able to reconstruct 3d human pose and motion, reliably segment exercise repetitions, and identify in real time the deviations between standards learnt from trainers and the execution of a trainee.

Visual Grounding

TropEx: An Algorithm for Extracting Linear Terms in Deep Neural Networks

no code implementations ICLR 2021 Martin Trimmel, Henning Petzka, Cristian Sminchisescu

Deep neural networks with rectified linear (ReLU) activations are piecewise linear functions, where hyperplanes partition the input space into an astronomically high number of linear regions.

Learning Complex 3D Human Self-Contact

no code implementations 18 Dec 2020 Mihai Fieraru, Mihai Zanfir, Elisabeta Oneata, Alin-Ionut Popa, Vlad Olaru, Cristian Sminchisescu

Monocular estimation of three dimensional human self-contact is fundamental for detailed scene analysis including body language understanding and behaviour modeling.

3D Reconstruction

Embodied Visual Active Learning for Semantic Segmentation

no code implementations 17 Dec 2020 David Nilsson, Aleksis Pirinen, Erik Gärtner, Cristian Sminchisescu

We study the task of embodied visual active learning, where an agent is set to explore a 3d environment with the goal to acquire visual scene understanding by actively selecting views for which to request annotation.

Active Learning Scene Understanding +1

LoopReg: Self-supervised Learning of Implicit Surface Correspondences, Pose and Shape for 3D Human Mesh Registration

no code implementations NeurIPS 2020 Bharat Lal Bhatnagar, Cristian Sminchisescu, Christian Theobalt, Gerard Pons-Moll

Formulating this closed loop is not straightforward because it is not trivial to force the output of the NN to be on the surface of the human model - outside this surface the human model is not even defined.

Self-Supervised Learning

Combining Implicit Function Learning and Parametric Models for 3D Human Reconstruction

1 code implementation ECCV 2020 Bharat Lal Bhatnagar, Cristian Sminchisescu, Christian Theobalt, Gerard Pons-Moll

In this work, we present methodology that combines detail-rich implicit functions and parametric representations in order to reconstruct 3D models of people that remain controllable and accurate even in the presence of clothing.

3D Human Pose Estimation 3D Human Reconstruction

Calibration of Neural Networks using Splines

1 code implementation ICLR 2021 Kartik Gupta, Amir Rahimi, Thalaiyasingam Ajanthan, Thomas Mensink, Cristian Sminchisescu, Richard Hartley

From this, by approximating the empirical cumulative distribution using a differentiable function via splines, we obtain a recalibration function, which maps the network outputs to actual (calibrated) class assignment probabilities.

Decision Making Image Classification

Post-hoc Calibration of Neural Networks by g-Layers

no code implementations 23 Jun 2020 Amir Rahimi, Thomas Mensink, Kartik Gupta, Thalaiyasingam Ajanthan, Cristian Sminchisescu, Richard Hartley

Calibration of neural networks is a critical aspect to consider when incorporating machine learning models in real-world decision-making systems where the confidence of decisions is as important as the decisions themselves.

Decision Making Image Classification

Consistency Guided Scene Flow Estimation

no code implementations ECCV 2020 Yuhua Chen, Luc van Gool, Cordelia Schmid, Cristian Sminchisescu

To handle inherent modeling error in the consistency loss (e.g. Lambertian assumptions) and for better generalization, we further introduce a learned, output refinement network, which takes the initial predictions, the loss, and the gradient as input, and efficiently predicts a correlated output update.

Scene Flow Estimation

Range Conditioned Dilated Convolutions for Scale Invariant 3D Object Detection

1 code implementation 20 May 2020 Alex Bewley, Pei Sun, Thomas Mensink, Dragomir Anguelov, Cristian Sminchisescu

This paper presents a novel 3D object detection framework that processes LiDAR data directly on its native representation: range images.

3D Object Detection Autonomous Driving +1

Weakly Supervised 3D Human Pose and Shape Reconstruction with Normalizing Flows

no code implementations ECCV 2020 Andrei Zanfir, Eduard Gabriel Bazavan, Hongyi Xu, Bill Freeman, Rahul Sukthankar, Cristian Sminchisescu

Monocular 3D human pose and shape estimation is challenging due to the many degrees of freedom of the human body and the difficulty of acquiring training data for large-scale supervised learning in complex visual scenes.

Ranked #54 on 3D Human Pose Estimation on 3DPW (PA-MPJPE metric)

3D human pose and shape estimation Self-Supervised Learning

Deep Reinforcement Learning for Active Human Pose Estimation

1 code implementation 7 Jan 2020 Erik Gärtner, Aleksis Pirinen, Cristian Sminchisescu

Most 3d human pose estimation methods assume that input -- be it images of a scene collected from one or several viewpoints, or from a video -- is given.

3D Human Pose Estimation reinforcement-learning +1

Relative Flatness and Generalization

1 code implementation NeurIPS 2021 Henning Petzka, Michael Kamp, Linara Adilova, Cristian Sminchisescu, Mario Boley

Flatness of the loss curve is conjectured to be connected to the generalization ability of machine learning models, in particular neural networks.

Generalization Bounds

Domes to Drones: Self-Supervised Active Triangulation for 3D Human Pose Reconstruction

1 code implementation NeurIPS 2019 Aleksis Pirinen, Erik Gärtner, Cristian Sminchisescu

In order to address the view selection problem in a principled way, we here introduce ACTOR, an active triangulation agent for 3d human pose reconstruction.

2D Pose Estimation 3D Pose Estimation +1

A Reparameterization-Invariant Flatness Measure for Deep Neural Networks

no code implementations 29 Nov 2019 Henning Petzka, Linara Adilova, Michael Kamp, Cristian Sminchisescu

The performance of deep neural networks is often attributed to their automated, task-related feature construction.

Open-Ended Question Answering

Feature-Robustness, Flatness and Generalization Error for Deep Neural Networks

no code implementations 25 Sep 2019 Henning Petzka, Linara Adilova, Michael Kamp, Cristian Sminchisescu

With this, the generalization error of a model trained on representative data can be bounded by its feature robustness which depends on our novel flatness measure.

Open-Ended Question Answering

Human Synthesis and Scene Compositing

no code implementations 23 Sep 2019 Mihai Zanfir, Elisabeta Oneata, Alin-Ionut Popa, Andrei Zanfir, Cristian Sminchisescu

Generating good quality and geometrically plausible synthetic images of humans with the ability to control appearance, pose and shape parameters has become increasingly important for a variety of tasks ranging from photo editing and fashion virtual try-on to special effects and image compression.

Image Compression Image Generation +1

Self-supervised Learning with Geometric Constraints in Monocular Video: Connecting Flow, Depth, and Camera

no code implementations ICCV 2019 Yuhua Chen, Cordelia Schmid, Cristian Sminchisescu

We present GLNet, a self-supervised framework for learning depth, optical flow, camera pose and intrinsic parameters from monocular video - addressing the difficulty of acquiring realistic ground-truth for such tasks.

Monocular Depth Estimation Optical Flow Estimation +3

Non-attracting Regions of Local Minima in Deep and Wide Neural Networks

no code implementations 16 Dec 2018 Henning Petzka, Cristian Sminchisescu

For extremely wide neural networks of decreasing width after the wide layer, we prove that every suboptimal local minimum belongs to such a connected set.

Deep Network for the Integrated 3D Sensing of Multiple People in Natural Images

no code implementations NeurIPS 2018 Andrei Zanfir, Elisabeta Marinoiu, Mihai Zanfir, Alin-Ionut Popa, Cristian Sminchisescu

The final stage of 3d pose and shape prediction is based on a learned attention process where information from different human body parts is optimally integrated.

Deep Reinforcement Learning of Region Proposal Networks for Object Detection

1 code implementation CVPR 2018 Aleksis Pirinen, Cristian Sminchisescu

We propose drl-RPN, a deep reinforcement learning-based visual recognition model consisting of a sequential region proposal network (RPN) and an object detector.

Object object-detection +4

Monocular 3D Pose and Shape Estimation of Multiple People in Natural Scenes - The Importance of Multiple Scene Constraints

no code implementations CVPR 2018 Andrei Zanfir, Elisabeta Marinoiu, Cristian Sminchisescu

Human sensing has greatly benefited from recent advances in deep learning, parametric human modeling, and large scale 2d and 3d datasets.

Deep Learning of Graph Matching

no code implementations CVPR 2018 Andrei Zanfir, Cristian Sminchisescu

The problem of graph matching under node and pair-wise constraints is fundamental in areas as diverse as combinatorial optimization, machine learning or computer vision, where representing both the relations between nodes and their neighborhood structure is essential.

Combinatorial Optimization Graph Matching

Human Appearance Transfer

no code implementations CVPR 2018 Mihai Zanfir, Alin-Ionut Popa, Andrei Zanfir, Cristian Sminchisescu

We propose an automatic person-to-person appearance transfer model based on explicit parametric 3d human representations and learned, constrained deep translation network architectures for photographic image synthesis.

Image Generation

3D Human Sensing, Action and Emotion Recognition in Robot Assisted Therapy of Children With Autism

no code implementations CVPR 2018 Elisabeta Marinoiu, Mihai Zanfir, Vlad Olaru, Cristian Sminchisescu

We introduce new, fine-grained action and emotion recognition tasks defined on non-staged videos, recorded during robot-assisted therapy sessions of children with autism.

Emotion Recognition

Deep Multitask Architecture for Integrated 2D and 3D Human Sensing

no code implementations CVPR 2017 Alin-Ionut Popa, Mihai Zanfir, Cristian Sminchisescu

We propose a deep multitask architecture for fully automatic 2d and 3d human sensing (DMHS), including recognition and reconstruction, in monocular images.

3D Human Pose Estimation

Semantic Video Segmentation by Gated Recurrent Flow Propagation

no code implementations CVPR 2018 David Nilsson, Cristian Sminchisescu

In this paper we present a deep, end-to-end trainable methodology for video segmentation that is capable of leveraging information present in unlabeled data in order to improve semantic estimates.

Optical Flow Estimation Segmentation +3

Reinforcement Learning for Visual Object Detection

no code implementations CVPR 2016 Stefan Mathe, Aleksis Pirinen, Cristian Sminchisescu

One of the most widely used strategies for visual object detection is based on exhaustive spatial hypothesis search.

Object object-detection +3

Matrix Backpropagation for Deep Networks With Structured Layers

no code implementations ICCV 2015 Catalin Ionescu, Orestis Vantzos, Cristian Sminchisescu

Deep neural network architectures have recently produced excellent results in a variety of areas in artificial intelligence and visual recognition, well surpassing traditional shallow architectures trained using hand-designed features.

Large Displacement 3D Scene Flow With Occlusion Reasoning

no code implementations ICCV 2015 Andrei Zanfir, Cristian Sminchisescu

In this paper we propose a novel coarse to fine correspondence-based scene flow approach to account for the effects of large displacements and to model occlusion, based on explicit geometric reasoning.

Motion Estimation Optical Flow Estimation

Training Deep Networks with Structured Layers by Matrix Backpropagation

1 code implementation 25 Sep 2015 Catalin Ionescu, Orestis Vantzos, Cristian Sminchisescu

Deep neural network architectures have recently produced excellent results in a variety of areas in artificial intelligence and visual recognition, well surpassing traditional shallow architectures trained using hand-designed features.

A Parallel Framework for Parametric Maximum Flow Problems in Image Segmentation

no code implementations 20 Sep 2015 Vlad Olaru, Mihai Florea, Cristian Sminchisescu

This paper presents a framework that supports the implementation of parallel solutions for the widespread parametric maximum flow computational routines used in image segmentation algorithms.

Image Segmentation Segmentation +1

Second-Order Constrained Parametric Proposals and Sequential Search-Based Structured Prediction for Semantic Segmentation in RGB-D Images

no code implementations CVPR 2015 Dan Banica, Cristian Sminchisescu

We focus on the problem of semantic segmentation based on RGB-D data, with emphasis on analyzing cluttered indoor scenes containing many visual categories and instances.

Scene Classification Segmentation +2

A Framework for Symmetric Part Detection in Cluttered Scenes

no code implementations 5 Feb 2015 Tom Lee, Sanja Fidler, Alex Levinshtein, Cristian Sminchisescu, Sven Dickinson

The role of symmetry in computer vision has waxed and waned in importance during the evolution of the field from its earliest days.

Parametric Image Segmentation of Humans with Structural Shape Priors

no code implementations 27 Jan 2015 Alin-Ionut Popa, Cristian Sminchisescu

The figure-ground segmentation of humans in images captured in natural environments is an outstanding open problem due to the presence of complex backgrounds, articulation, varying body proportions, partial views and viewpoint changes.

Image Segmentation Segmentation +1

Multiple Instance Reinforcement Learning for Efficient Weakly-Supervised Detection in Images

no code implementations 29 Nov 2014 Stefan Mathe, Cristian Sminchisescu

State-of-the-art visual recognition and detection systems increasingly rely on large amounts of training data and complex classifiers.

Action Detection reinforcement-learning +2

Iterated Second-Order Label Sensitive Pooling for 3D Human Pose Estimation

no code implementations CVPR 2014 Catalin Ionescu, Joao Carreira, Cristian Sminchisescu

Recently, the emergence of Kinect systems has demonstrated the benefits of predicting an intermediate body part labeling for 3D human pose estimation, in conjunction with RGB-D imagery.

3D Human Pose Estimation

Constrained Parametric Proposals and Pooling Methods for Semantic Segmentation in RGB-D Images

no code implementations 30 Dec 2013 Dan Banica, Cristian Sminchisescu

Our contributions can be summarized as proposing the following: (1) a generalization of parametric max flow figure-ground proposal methodology to take advantage of intensity and depth information, in order to systematically and efficiently generate the breakpoints of an underlying spatial model in polynomial time, (2) new region description methods based on second-order pooling over multiple features constructed using both intensity and depth channels, (3) an inference procedure that can resolve conflicts in overlapping spatial partitions and handle scenes with a large number of object category instances of very different scales, (4) extensive evaluation of the impact of depth, as well as the effectiveness of a large number of descriptors, both pre-designed and automatically obtained using deep learning, in a difficult RGB-D semantic segmentation problem with 92 classes.

Scene Classification Segmentation +1

Actions in the Eye: Dynamic Gaze Datasets and Learnt Saliency Models for Visual Recognition

no code implementations 29 Dec 2013 Stefan Mathe, Cristian Sminchisescu

Systems based on bag-of-words models from image features collected at maxima of sparse interest point operators have been used successfully for both computer visual object and action recognition tasks.

Action Recognition Temporal Action Localization

Human3.6m: Large scale datasets and predictive methods for 3D human sensing in natural environments

1 code implementation IEEE Transactions on Pattern Analysis and Machine Intelligence (Volume: 36, Issue: 7, July 2014) 2013 Catalin Ionescu, Dragos Papava, Vlad Olaru, Cristian Sminchisescu

We introduce a new dataset, Human3.6M, of 3.6 million accurate 3D human poses, acquired by recording the performance of 5 female and 6 male subjects, under 4 different viewpoints, for training realistic human sensing systems and for evaluating the next generation of human pose estimation models and algorithms.

3D Human Pose Estimation Mixed Reality

Action from Still Image Dataset and Inverse Optimal Control to Learn Task Specific Visual Scanpaths

no code implementations NeurIPS 2013 Stefan Mathe, Cristian Sminchisescu

Finally, we leverage our large scale dataset in conjunction with powerful machine learning techniques and computer vision features, to introduce novel dynamic eye movement prediction methods which learn task-sensitive reward functions from eye movement data and efficiently integrate these rewards to plan future saccades based on inverse optimal control.

Probabilistic Joint Image Segmentation and Labeling

no code implementations NeurIPS 2011 Adrian Ion, Joao Carreira, Cristian Sminchisescu

We present a joint image segmentation and labeling model (JSL) which, given a bag of figure-ground segment hypotheses extracted at multiple image locations and scales, constructs a joint probability distribution over both the compatible image interpretations (tilings or image segmentations) composed from those segments, and over their labeling into categories.

Image Segmentation Segmentation +1

Twin gaussian processes for structured prediction

no code implementations International Journal of Computer Vision 2010 Liefeng Bo, Cristian Sminchisescu

We describe twin Gaussian processes (TGP), a generic structured prediction method that uses Gaussian process (GP) priors on both covariates and responses, both multivariate, and estimates outputs by minimizing the Kullback-Leibler divergence between two GPs modeled as normal distributions over finite index sets of training and testing examples, emphasizing the goal that similar inputs should produce similar percepts and this should hold, on average, between their marginal distributions.

3D Human Pose Estimation Camera Calibration +2

Efficient Match Kernel between Sets of Features for Visual Recognition

no code implementations NeurIPS 2009 Liefeng Bo, Cristian Sminchisescu

To address this problem, we propose an efficient match kernel (EMK), which maps local features to a low-dimensional feature space, averages the resulting feature vectors to form a set-level feature, then applies a linear classifier.

Quantization

People Tracking with the Laplacian Eigenmaps Latent Variable Model

no code implementations NeurIPS 2007 Zhengdong Lu, Cristian Sminchisescu, Miguel Á. Carreira-Perpiñán

Reliably recovering 3D human pose from monocular video requires constraints that bias the estimates towards typical human poses and motions.

Dimensionality Reduction
