no code implementations • 22 Nov 2024 • Teodor Alexandru Szente, James Harrison, Mihai Zanfir, Cristian Sminchisescu
Fractional gradient descent has been studied extensively, with a focus on its ability to extend traditional gradient descent methods by incorporating fractional-order derivatives.
no code implementations • 15 Oct 2024 • Mykhaylo Andriluka, Baruch Tabanpour, C. Daniel Freeman, Cristian Sminchisescu
We propose a novel neural network approach, LARP (Learned Articulated Rigid body Physics), to model the dynamics of articulated human motion with contact.
no code implementations • 11 Jun 2024 • Nikos Kolotouros, Thiemo Alldieck, Enric Corona, Eduard Gabriel Bazavan, Cristian Sminchisescu
We present AvatarPopUp, a method for fast, high quality 3D human avatar generation from different input modalities, such as images and text prompts and with control over the generated pose and shape.
no code implementations • 13 Mar 2024 • Enric Corona, Andrei Zanfir, Eduard Gabriel Bazavan, Nikos Kolotouros, Thiemo Alldieck, Cristian Sminchisescu
We propose VLOGGER, a method for audio-driven human video generation from a single input image of a person, which builds on the success of recent generative diffusion models.
no code implementations • 10 Jan 2024 • Thiemo Alldieck, Nikos Kolotouros, Cristian Sminchisescu
Score Distillation Sampling (SDS) is a recent but already widely popular method that relies on an image diffusion model to control optimization problems using text prompts.
no code implementations • CVPR 2024 • Akash Sengupta, Thiemo Alldieck, Nikos Kolotouros, Enric Corona, Andrei Zanfir, Cristian Sminchisescu
Our experiments show that DiffHuman can produce diverse and detailed reconstructions for the parts of the person that are unseen or uncertain in the input image while remaining competitive with the state-of-the-art when reconstructing visible surfaces.
no code implementations • 6 Dec 2023 • Maria Priisalu, Ted Kronvall, Cristian Sminchisescu
Human pose forecasting is the task of predicting articulated human motion given past human motion.
no code implementations • 4 Nov 2023 • Eduard Gabriel Bazavan, Andrei Zanfir, Thiemo Alldieck, Teodor Alexandru Szente, Mihai Zanfir, Cristian Sminchisescu
We present SPHEAR, an accurate, differentiable parametric statistical 3D human head model, enabled by a novel 3D registration method based on spherical embeddings.
no code implementations • 11 Sep 2023 • Ivan Grishchenko, Geng Yan, Eduard Gabriel Bazavan, Andrei Zanfir, Nikolai Chinaev, Karthik Raveendran, Matthias Grundmann, Cristian Sminchisescu
We present Blendshapes GHUM, an on-device ML pipeline that predicts 52 facial blendshape coefficients at 30+ FPS on modern mobile phones, from a single monocular RGB image and enables facial motion capture applications like virtual avatars.
1 code implementation • 3 Aug 2023 • Mihai Fieraru, Mihai Zanfir, Elisabeta Oneata, Alin-Ionut Popa, Vlad Olaru, Cristian Sminchisescu
Understanding 3d human interactions is fundamental for fine-grained scene analysis and behavioural modeling.
no code implementations • 15 Dec 2022 • Andrei Zanfir, Mihai Zanfir, Alexander Gorban, Jingwei Ji, Yin Zhou, Dragomir Anguelov, Cristian Sminchisescu
Autonomous driving is an exciting new industry, posing important research questions.
no code implementations • 14 Dec 2022 • Mihai Zanfir, Thiemo Alldieck, Cristian Sminchisescu
We present PhoMoH, a neural network methodology to construct generative models of photo-realistic 3D geometry and appearance of human heads including hair, beards, an oral cavity, and clothing.
no code implementations • CVPR 2023 • Enric Corona, Mihai Zanfir, Thiemo Alldieck, Eduard Gabriel Bazavan, Andrei Zanfir, Cristian Sminchisescu
We introduce Structured 3D Features, a model based on a novel implicit 3D representation that pools pixel-aligned image features onto dense 3D points sampled from a parametric, statistical human mesh surface.
no code implementations • CVPR 2023 • Erik Gärtner, Luke Metz, Mykhaylo Andriluka, C. Daniel Freeman, Cristian Sminchisescu
We propose a new approach to learned optimization where we represent the computation of an optimizer's update step using a neural network.
no code implementations • 23 Jun 2022 • Ivan Grishchenko, Valentin Bazarevsky, Andrei Zanfir, Eduard Gabriel Bazavan, Mihai Zanfir, Richard Yee, Karthik Raveendran, Matsvei Zhdanovich, Matthias Grundmann, Cristian Sminchisescu
We present BlazePose GHUM Holistic, a lightweight neural network pipeline for 3D human body landmarks and pose estimation, specifically tailored to real-time on-device inference.
no code implementations • CVPR 2022 • Erik Gärtner, Mykhaylo Andriluka, Erwin Coumans, Cristian Sminchisescu
We introduce DiffPhy, a differentiable physics-based model for articulated 3d human motion reconstruction from video.
Ranked #58 on 3D Human Pose Estimation on Human3.6M
no code implementations • CVPR 2022 • Erik Gärtner, Mykhaylo Andriluka, Hongyi Xu, Cristian Sminchisescu
We focus on the task of estimating a physically plausible articulated human motion from monocular video.
Ranked #320 on 3D Human Pose Estimation on Human3.6M
no code implementations • CVPR 2022 • Thiemo Alldieck, Mihai Zanfir, Cristian Sminchisescu
We present PHORHUM, a novel, end-to-end trainable, deep neural network methodology for photorealistic 3D human reconstruction given just a monocular RGB image.
1 code implementation • CVPR 2022 • Bharat Lal Bhatnagar, Xianghui Xie, Ilya A. Petrov, Cristian Sminchisescu, Christian Theobalt, Gerard Pons-Moll
We present the BEHAVE dataset, the first full body human-object interaction dataset with multi-view RGBD frames and corresponding 3D SMPL and object fits along with the annotated contacts between them.
1 code implementation • 7 Apr 2022 • Erik Sandström, Martin R. Oswald, Suryansh Kumar, Silvan Weder, Fisher Yu, Cristian Sminchisescu, Luc van Gool
Multi-sensor depth fusion is able to substantially improve the robustness and accuracy of 3D reconstruction methods, but existing techniques are not robust enough to handle sensors which operate with diverse value ranges as well as noise and outlier statistics.
1 code implementation • 2 Mar 2022 • Henning Petzka, Ted Kronvall, Cristian Sminchisescu
By reusing the discriminator network to modify the metric on the latent space, we propose a lightweight solution for improved interpolations in pre-trained GANs.
no code implementations • 28 Dec 2021 • David Nilsson, Aleksis Pirinen, Erik Gärtner, Cristian Sminchisescu
As we study this task in a lifelong learning context, the agents should use knowledge gained in earlier visited environments in order to guide their exploration and active learning strategy in successively visited buildings.
no code implementations • 23 Dec 2021 • Eduard Gabriel Bazavan, Andrei Zanfir, Mihai Zanfir, William T. Freeman, Rahul Sukthankar, Cristian Sminchisescu
We combine a hundred diverse individuals of varying ages, gender, proportions, and ethnicity, with hundreds of motions and scenes, as well as parametric variations in body shape (for a total of 1,600 different humans), in order to generate an initial dataset of over 1 million frames.
Ranked #1 on 3D Human Pose Estimation on HSPACE
no code implementations • NeurIPS 2021 • Mihai Fieraru, Mihai Zanfir, Teodor Szente, Eduard Bazavan, Vlad Olaru, Cristian Sminchisescu
We introduce a novel unified model for self- and interpenetration-collisions based on a mesh approximation computed by applying decimation operators.
no code implementations • NeurIPS 2021 • Hongyi Xu, Thiemo Alldieck, Cristian Sminchisescu
This allows us to robustly fuse information from sparse views and generalize well beyond the poses or views observed in training.
1 code implementation • ICCV 2021 • Thiemo Alldieck, Hongyi Xu, Cristian Sminchisescu
We present imGHUM, the first holistic generative model of 3D human shape and articulated pose, represented as a signed distance function.
1 code implementation • 11 Aug 2021 • Davide Menini, Suryansh Kumar, Martin R. Oswald, Erik Sandstrom, Cristian Sminchisescu, Luc van Gool
This paper presents a real-time online vision framework to jointly recover an indoor scene's 3D structure and semantic label.
no code implementations • CVPR 2021 • Pei Sun, Weiyue Wang, Yuning Chai, Gamaleldin Elsayed, Alex Bewley, Xiao Zhang, Cristian Sminchisescu, Dragomir Anguelov
These larger detection ranges require more efficient and accurate detection models.
no code implementations • CVPR 2021 • Mihai Fieraru, Mihai Zanfir, Silviu Cristian Pirlea, Vlad Olaru, Cristian Sminchisescu
AIFit is able to reconstruct 3d human pose and motion, reliably segment exercise repetitions, and identify in real-time the deviations between standards learnt from trainers, and the execution of a trainee.
no code implementations • ICCV 2021 • Mihai Zanfir, Andrei Zanfir, Eduard Gabriel Bazavan, William T. Freeman, Rahul Sukthankar, Cristian Sminchisescu
We present THUNDR, a transformer-based deep neural network methodology to reconstruct the 3d pose and shape of people, given monocular RGB images.
Ranked #50 on 3D Human Pose Estimation on 3DPW (MPJPE metric)
1 code implementation • 6 Apr 2021 • Jack Valmadre, Alex Bewley, Jonathan Huang, Chen Sun, Cristian Sminchisescu, Cordelia Schmid
This paper introduces temporally local metrics for Multi-Object Tracking.
no code implementations • ICLR 2021 • Martin Trimmel, Henning Petzka, Cristian Sminchisescu
Deep neural networks with rectified linear (ReLU) activations are piecewise linear functions, where hyperplanes partition the input space into an astronomically high number of linear regions.
no code implementations • 18 Dec 2020 • Mihai Fieraru, Mihai Zanfir, Elisabeta Oneata, Alin-Ionut Popa, Vlad Olaru, Cristian Sminchisescu
Monocular estimation of three dimensional human self-contact is fundamental for detailed scene analysis including body language understanding and behaviour modeling.
no code implementations • 17 Dec 2020 • David Nilsson, Aleksis Pirinen, Erik Gärtner, Cristian Sminchisescu
We study the task of embodied visual active learning, where an agent is set to explore a 3d environment with the goal to acquire visual scene understanding by actively selecting views for which to request annotation.
no code implementations • NeurIPS 2020 • Bharat Lal Bhatnagar, Cristian Sminchisescu, Christian Theobalt, Gerard Pons-Moll
Formulating this closed loop is not straightforward because it is not trivial to force the output of the NN to be on the surface of the human model - outside this surface the human model is not even defined.
no code implementations • CVPR 2021 • Andrei Zanfir, Eduard Gabriel Bazavan, Mihai Zanfir, William T. Freeman, Rahul Sukthankar, Cristian Sminchisescu
We present deep neural network methodology to reconstruct the 3d pose and shape of people, given an input RGB image.
Ranked #71 on 3D Human Pose Estimation on 3DPW (MPJPE metric)
1 code implementation • ECCV 2020 • Bharat Lal Bhatnagar, Cristian Sminchisescu, Christian Theobalt, Gerard Pons-Moll
In this work, we present methodology that combines detail-rich implicit functions and parametric representations in order to reconstruct 3D models of people that remain controllable and accurate even in the presence of clothing.
1 code implementation • ICLR 2021 • Kartik Gupta, Amir Rahimi, Thalaiyasingam Ajanthan, Thomas Mensink, Cristian Sminchisescu, Richard Hartley
From this, by approximating the empirical cumulative distribution using a differentiable function via splines, we obtain a recalibration function, which maps the network outputs to actual (calibrated) class assignment probabilities.
no code implementations • 23 Jun 2020 • Amir Rahimi, Thomas Mensink, Kartik Gupta, Thalaiyasingam Ajanthan, Cristian Sminchisescu, Richard Hartley
Calibration of neural networks is a critical aspect to consider when incorporating machine learning models in real-world decision-making systems where the confidence of decisions is as important as the decisions themselves.
no code implementations • ECCV 2020 • Yuhua Chen, Luc van Gool, Cordelia Schmid, Cristian Sminchisescu
To handle inherent modeling error in the consistency loss (e.g. Lambertian assumptions) and for better generalization, we further introduce a learned, output refinement network, which takes the initial predictions, the loss, and the gradient as input, and efficiently predicts a correlated output update.
1 code implementation • 20 May 2020 • Alex Bewley, Pei Sun, Thomas Mensink, Dragomir Anguelov, Cristian Sminchisescu
This paper presents a novel 3D object detection framework that processes LiDAR data directly on its native representation: range images.
no code implementations • ECCV 2020 • Andrei Zanfir, Eduard Gabriel Bazavan, Hongyi Xu, Bill Freeman, Rahul Sukthankar, Cristian Sminchisescu
Monocular 3D human pose and shape estimation is challenging due to the many degrees of freedom of the human body and the difficulty to acquire training data for large-scale supervised learning in complex visual scenes.
Ranked #63 on 3D Human Pose Estimation on 3DPW (PA-MPJPE metric)
1 code implementation • 7 Jan 2020 • Erik Gärtner, Aleksis Pirinen, Cristian Sminchisescu
Most 3d human pose estimation methods assume that input -- be it images of a scene collected from one or several viewpoints, or from a video -- is given.
1 code implementation • NeurIPS 2021 • Henning Petzka, Michael Kamp, Linara Adilova, Cristian Sminchisescu, Mario Boley
Flatness of the loss curve is conjectured to be connected to the generalization ability of machine learning models, in particular neural networks.
1 code implementation • NeurIPS 2019 • Aleksis Pirinen, Erik Gärtner, Cristian Sminchisescu
In order to address the view selection problem in a principled way, we here introduce ACTOR, an active triangulation agent for 3d human pose reconstruction.
no code implementations • 29 Nov 2019 • Henning Petzka, Linara Adilova, Michael Kamp, Cristian Sminchisescu
The performance of deep neural networks is often attributed to their automated, task-related feature construction.
no code implementations • 25 Sep 2019 • Henning Petzka, Linara Adilova, Michael Kamp, Cristian Sminchisescu
With this, the generalization error of a model trained on representative data can be bounded by its feature robustness which depends on our novel flatness measure.
no code implementations • 23 Sep 2019 • Mihai Zanfir, Elisabeta Oneata, Alin-Ionut Popa, Andrei Zanfir, Cristian Sminchisescu
Generating good quality and geometrically plausible synthetic images of humans with the ability to control appearance, pose and shape parameters, has become increasingly important for a variety of tasks ranging from photo editing, fashion virtual try-on, to special effects and image compression.
no code implementations • ICCV 2019 • Yuhua Chen, Cordelia Schmid, Cristian Sminchisescu
We present GLNet, a self-supervised framework for learning depth, optical flow, camera pose and intrinsic parameters from monocular video - addressing the difficulty of acquiring realistic ground-truth for such tasks.
no code implementations • 16 Dec 2018 • Henning Petzka, Cristian Sminchisescu
For extremely wide neural networks of decreasing width after the wide layer, we prove that every suboptimal local minimum belongs to such a connected set.
no code implementations • NeurIPS 2018 • Andrei Zanfir, Elisabeta Marinoiu, Mihai Zanfir, Alin-Ionut Popa, Cristian Sminchisescu
The final stage of 3d pose and shape prediction is based on a learned attention process where information from different human body parts is optimally integrated.
no code implementations • CVPR 2018 • Elisabeta Marinoiu, Mihai Zanfir, Vlad Olaru, Cristian Sminchisescu
We introduce new, fine-grained action and emotion recognition tasks defined on non-staged videos, recorded during robot-assisted therapy sessions of children with autism.
no code implementations • CVPR 2018 • Mihai Zanfir, Alin-Ionut Popa, Andrei Zanfir, Cristian Sminchisescu
We propose an automatic person-to-person appearance transfer model based on explicit parametric 3d human representations and learned, constrained deep translation network architectures for photographic image synthesis.
no code implementations • CVPR 2018 • Andrei Zanfir, Elisabeta Marinoiu, Cristian Sminchisescu
Human sensing has greatly benefited from recent advances in deep learning, parametric human modeling, and large scale 2d and 3d datasets.
1 code implementation • CVPR 2018 • Aleksis Pirinen, Cristian Sminchisescu
We propose drl-RPN, a deep reinforcement learning-based visual recognition model consisting of a sequential region proposal network (RPN) and an object detector.
no code implementations • CVPR 2018 • Andrei Zanfir, Cristian Sminchisescu
The problem of graph matching under node and pair-wise constraints is fundamental in areas as diverse as combinatorial optimization, machine learning or computer vision, where representing both the relations between nodes and their neighborhood structure is essential.
Ranked #19 on Graph Matching on Willow Object Class
no code implementations • CVPR 2017 • Alin-Ionut Popa, Mihai Zanfir, Cristian Sminchisescu
We propose a deep multitask architecture for fully automatic 2d and 3d human sensing (DMHS), including recognition and reconstruction, in monocular images.
Ranked #23 on 3D Human Pose Estimation on HumanEva-I
no code implementations • CVPR 2018 • David Nilsson, Cristian Sminchisescu
In this paper we present a deep, end-to-end trainable methodology to video segmentation that is capable of leveraging information present in unlabeled data in order to improve semantic estimates.
Ranked #6 on Video Semantic Segmentation on CamVid
no code implementations • 17 Oct 2016 • Mihai Zanfir, Elisabeta Marinoiu, Cristian Sminchisescu
Automatic video captioning is challenging due to the complex interactions in dynamic real scenes.
no code implementations • CVPR 2016 • Stefan Mathe, Aleksis Pirinen, Cristian Sminchisescu
One of the most widely used strategies for visual object detection is based on exhaustive spatial hypothesis search.
no code implementations • ICCV 2015 • Andrei Zanfir, Cristian Sminchisescu
In this paper we propose a novel coarse to fine correspondence-based scene flow approach to account for the effects of large displacements and to model occlusion, based on explicit geometric reasoning.
no code implementations • ICCV 2015 • Catalin Ionescu, Orestis Vantzos, Cristian Sminchisescu
Deep neural network architectures have recently produced excellent results in a variety of areas in artificial intelligence and visual recognition, well surpassing traditional shallow architectures trained using hand-designed features.
1 code implementation • 25 Sep 2015 • Catalin Ionescu, Orestis Vantzos, Cristian Sminchisescu
Deep neural network architectures have recently produced excellent results in a variety of areas in artificial intelligence and visual recognition, well surpassing traditional shallow architectures trained using hand-designed features.
no code implementations • 20 Sep 2015 • Vlad Olaru, Mihai Florea, Cristian Sminchisescu
This paper presents a framework that supports the implementation of parallel solutions for the widespread parametric maximum flow computational routines used in image segmentation algorithms.
no code implementations • CVPR 2015 • Dan Banica, Cristian Sminchisescu
We focus on the problem of semantic segmentation based on RGB-D data, with emphasis on analyzing cluttered indoor scenes containing many visual categories and instances.
no code implementations • 5 Feb 2015 • Tom Lee, Sanja Fidler, Alex Levinshtein, Cristian Sminchisescu, Sven Dickinson
The role of symmetry in computer vision has waxed and waned in importance during the evolution of the field from its earliest days.
no code implementations • 27 Jan 2015 • Alin-Ionut Popa, Cristian Sminchisescu
The figure-ground segmentation of humans in images captured in natural environments is an outstanding open problem due to the presence of complex backgrounds, articulation, varying body proportions, partial views and viewpoint changes.
no code implementations • 29 Nov 2014 • Stefan Mathe, Cristian Sminchisescu
State-of-the-art visual recognition and detection systems increasingly rely on large amounts of training data and complex classifiers.
no code implementations • CVPR 2014 • Catalin Ionescu, Joao Carreira, Cristian Sminchisescu
Recently, the emergence of Kinect systems has demonstrated the benefits of predicting an intermediate body part labeling for 3D human pose estimation, in conjunction with RGB-D imagery.
no code implementations • 30 Dec 2013 • Dan Banica, Cristian Sminchisescu
Our contributions can be summarized as proposing the following: (1) a generalization of parametric max flow figure-ground proposal methodology to take advantage of intensity and depth information, in order to systematically and efficiently generate the breakpoints of an underlying spatial model in polynomial time, (2) new region description methods based on second-order pooling over multiple features constructed using both intensity and depth channels, (3) an inference procedure that can resolve conflicts in overlapping spatial partitions, and handles scenes with a large number of object category instances, of very different scales, (4) extensive evaluation of the impact of depth, as well as the effectiveness of a large number of descriptors, both pre-designed and automatically obtained using deep learning, in a difficult RGB-D semantic segmentation problem with 92 classes.
no code implementations • 29 Dec 2013 • Stefan Mathe, Cristian Sminchisescu
Systems based on bag-of-words models from image features collected at maxima of sparse interest point operators have been used successfully for both computer visual object and action recognition tasks.
1 code implementation • IEEE Transactions on Pattern Analysis and Machine Intelligence (Volume: 36, Issue: 7, July 2014) 2013 • Catalin Ionescu, Dragos Papava, Vlad Olaru, Cristian Sminchisescu
We introduce a new dataset, Human3.6M, of 3.6 Million accurate 3D Human poses, acquired by recording the performance of 5 female and 6 male subjects, under 4 different viewpoints, for training realistic human sensing systems and for evaluating the next generation of human pose estimation models and algorithms.
no code implementations • NeurIPS 2013 • Stefan Mathe, Cristian Sminchisescu
Finally, we leverage our large scale dataset in conjunction with powerful machine learning techniques and computer vision features, to introduce novel dynamic eye movement prediction methods which learn task-sensitive reward functions from eye movement data and efficiently integrate these rewards to plan future saccades based on inverse optimal control.
no code implementations • CVPR 2013 • Fuxin Li, Joao Carreira, Guy Lebanon, Cristian Sminchisescu
In this paper we present an inference procedure for the semantic segmentation of images.
no code implementations • 18 Jun 2012 • Fuxin Li, Guy Lebanon, Cristian Sminchisescu
We propose a new analytical approximation to the $\chi^2$ kernel that converges geometrically.
no code implementations • NeurIPS 2011 • Adrian Ion, Joao Carreira, Cristian Sminchisescu
We present a joint image segmentation and labeling model (JSL) which, given a bag of figure-ground segment hypotheses extracted at multiple image locations and scales, constructs a joint probability distribution over both the compatible image interpretations (tilings or image segmentations) composed from those segments, and over their labeling into categories.
no code implementations • NeurIPS 2010 • Fuxin Li, Cristian Sminchisescu
In this work, we propose an approach that recasts it as a convex likelihood ratio estimation problem.
no code implementations • International Journal of Computer Vision 2010 • Liefeng Bo, Cristian Sminchisescu
We describe twin Gaussian processes (TGP), a generic structured prediction method that uses Gaussian process (GP) priors on both covariates and responses, both multivariate, and estimates outputs by minimizing the Kullback-Leibler divergence between two GP modeled as normal distributions over finite index sets of training and testing examples, emphasizing the goal that similar inputs should produce similar percepts and this should hold, on average, between their marginal distributions.
Ranked #25 on 3D Human Pose Estimation on HumanEva-I
no code implementations • NeurIPS 2009 • Liefeng Bo, Cristian Sminchisescu
To address this problem, we propose an efficient match kernel (EMK), which maps local features to a low dimensional feature space, averages the resulting feature vectors to form a set-level feature, then applies a linear classifier.
no code implementations • NeurIPS 2007 • Zhengdong Lu, Cristian Sminchisescu, Miguel Á. Carreira-Perpiñán
Reliably recovering 3D human pose from monocular video requires constraints that bias the estimates towards typical human poses and motions.