On discrete symmetries of robotics systems: A group-theoretic and data-driven analysis

1 code implementation 21 Feb 2023 Daniel Ordonez-Apraez, Mario Martin, Antonio Agudo, Francesc Moreno-Noguer

We present a comprehensive study on discrete morphological symmetries of dynamical systems, which are commonly observed in biological and artificial locomoting systems, such as legged, swimming, and flying animals/robots/virtual characters.

Contact Detection Data Augmentation +1

Visual Semantic Relatedness Dataset for Image Captioning

1 code implementation 20 Jan 2023 Ahmed Sabir, Francesc Moreno-Noguer, Lluís Padró

Modern image captioning systems rely heavily on extracting knowledge from images to capture the concept of a static story.

Image Captioning text similarity

NeRFLight: Fast and Light Neural Radiance Fields Using a Shared Feature Grid

no code implementations CVPR 2023 Fernando Rivas-Manzaneque, Jorge Sierra-Acosta, Adrian Penate-Sanchez, Francesc Moreno-Noguer, Angela Ribeiro

While original Neural Radiance Fields (NeRF) have shown impressive results in modeling the appearance of a scene with compact MLP architectures, they are not able to achieve real-time rendering.

PoseScript: 3D Human Poses from Natural Language

1 code implementation 21 Oct 2022 Ginger Delmas, Philippe Weinzaepfel, Thomas Lucas, Francesc Moreno-Noguer, Grégory Rogez

This process extracts low-level pose information -- the posecodes -- using a set of simple but generic rules on the 3D keypoints.

Cross-Modal Retrieval Image Captioning +3

SIRA: Relightable Avatars from a Single Image

no code implementations 7 Sep 2022 Pol Caselles, Eduard Ramon, Jaime Garcia, Xavier Giro-i-Nieto, Francesc Moreno-Noguer, Gil Triginer

Our key ingredients are two data-driven statistical models based on neural fields that resolve the ambiguities of single-view 3D surface reconstruction and appearance factorization.

Surface Reconstruction

Topic Detection in Continuous Sign Language Videos

1 code implementation 1 Sep 2022 Alvaro Budria, Laia Tarres, Gerard I. Gallego, Francesc Moreno-Noguer, Jordi Torres, Xavier Giro-i-Nieto

Significant progress has been made recently on challenging tasks in automatic sign language understanding, such as sign language recognition, translation and production.

Sign Language Recognition Translation

Back to MLP: A Simple Baseline for Human Motion Prediction

1 code implementation 4 Jul 2022 Wen Guo, Yuming Du, Xi Shen, Vincent Lepetit, Xavier Alameda-Pineda, Francesc Moreno-Noguer

This paper tackles the problem of human motion prediction, which consists of forecasting future body poses from historically observed sequences.

Human motion prediction Human Pose Forecasting +1

Learned Vertex Descent: A New Direction for 3D Human Model Fitting

no code implementations 12 May 2022 Enric Corona, Gerard Pons-Moll, Guillem Alenyà, Francesc Moreno-Noguer

An exhaustive evaluation demonstrates that our approach is able to capture the underlying body of clothed people with very different body shapes, achieving a significant improvement over the state of the art.

Single-view 3D Body and Cloth Reconstruction under Complex Poses

no code implementations 9 May 2022 Nicolas Ugrinovic, Albert Pumarola, Alberto Sanfeliu, Francesc Moreno-Noguer

We, therefore, propose a coarse-to-fine approach in which we first learn an implicit function that maps the input image to a 3D body shape with a low level of detail, but which correctly fits the underlying human pose, despite its complexity.

3D Reconstruction

Conditional-Flow NeRF: Accurate 3D Modelling with Reliable Uncertainty Quantification

1 code implementation 18 Mar 2022 Jianxiong Shen, Antonio Agudo, Francesc Moreno-Noguer, Adria Ruiz

For this purpose, our method learns a distribution over all possible radiance fields modelling the scene, which is used to quantify the uncertainty associated with the modelled scene.

Autonomous Driving Decision Making +1

Enhancing Egocentric 3D Pose Estimation with Third Person Views

1 code implementation 6 Jan 2022 Ameya Dhamanaskar, Mariella Dimiccoli, Enric Corona, Albert Pumarola, Francesc Moreno-Noguer

In this paper, we propose a novel approach to enhance the 3D body pose estimation of a person computed from videos captured from a single wearable camera.

3D Pose Estimation Domain Adaptation

PhysXNet: A Customizable Approach for Learning Cloth Dynamics on Dressed People

no code implementations 13 Nov 2021 Jordi Sanchez-Riera, Albert Pumarola, Francesc Moreno-Noguer

We introduce PhysXNet, a learning-based approach to predict the dynamics of deformable clothes given 3D skeleton motion sequences of humans wearing these clothes.

An Adaptable Approach to Learn Realistic Legged Locomotion without Examples

no code implementations 28 Oct 2021 Daniel Ordonez-Apraez, Antonio Agudo, Francesc Moreno-Noguer, Mario Martin

We present experimental results showing that even in a model-free setup and with a simple reactive control architecture, the learned policies can generate realistic and energy-efficient locomotion gaits for a bipedal and a quadrupedal robot.

Reinforcement Learning (RL)

Grasp-Oriented Fine-grained Cloth Segmentation without Real Supervision

no code implementations 6 Oct 2021 Ruijie Ren, Mohit Gurnani Rajesh, Jordi Sanchez-Riera, Fan Zhang, Yurun Tian, Antonio Agudo, Yiannis Demiris, Krystian Mikolajczyk, Francesc Moreno-Noguer

We show that training our network solely with synthetic data and the proposed DA yields results competitive with models trained on real data.

Domain Adaptation

Stochastic Neural Radiance Fields: Quantifying Uncertainty in Implicit 3D Representations

no code implementations 5 Sep 2021 Jianxiong Shen, Adria Ruiz, Antonio Agudo, Francesc Moreno-Noguer

In this context, we propose Stochastic Neural Radiance Fields (S-NeRF), a generalization of standard NeRF that learns a probability distribution over all the possible radiance fields modeling the scene.

Novel View Synthesis Variational Inference

SIDER: Single-Image Neural Optimization for Facial Geometric Detail Recovery

no code implementations 11 Aug 2021 Aggelina Chatziagapi, ShahRukh Athar, Francesc Moreno-Noguer, Dimitris Samaras

We present SIDER (Single-Image neural optimization for facial geometric DEtail Recovery), a novel photometric optimization method that recovers detailed facial geometry from a single image in an unsupervised manner.

H3D-Net: Few-Shot High-Fidelity 3D Head Reconstruction

1 code implementation ICCV 2021 Eduard Ramon, Gil Triginer, Janna Escur, Albert Pumarola, Jaime Garcia, Xavier Giro-i-Nieto, Francesc Moreno-Noguer

In this paper, we tackle these limitations for the specific problem of few-shot full 3D head reconstruction, by endowing coordinate-based representations with a probabilistic shape prior that enables faster convergence and better generalization when using few input images (down to three).

3D Reconstruction Vocal Bursts Intensity Prediction

Uncertainty-Aware Camera Pose Estimation from Points and Lines

no code implementations CVPR 2021 Alexander Vakhitov, Luis Ferraz Colomina, Antonio Agudo, Francesc Moreno-Noguer

The new PnP(L) methods outperform the state-of-the-art on real data in isolation, showing an increase in mean translation accuracy by 18% on a representative subset of KITTI, while the new uncertain refinement improves pose accuracy for most of the solvers, e.g., decreasing mean translation error for the EPnP by 16% compared to the standard refinement on the same dataset.

Camera Localization Pose Estimation +2

Multi-Person Extreme Motion Prediction

1 code implementation CVPR 2022 Wen Guo, Xiaoyu Bie, Xavier Alameda-Pineda, Francesc Moreno-Noguer

In this paper, we explore this problem when dealing with humans performing collaborative tasks: we seek to predict the future motion of two interacting persons given two sequences of their past skeletons.

Human motion prediction motion prediction +2

3D Human Pose, Shape and Texture from Low-Resolution Images and Videos

2 code implementations 11 Mar 2021 Xiangyu Xu, Hao Chen, Francesc Moreno-Noguer, Laszlo A. Jeni, Fernando de la Torre

Two common approaches to deal with low-resolution images are applying super-resolution techniques to the input, which may result in unpleasant artifacts, or simply training one model for each resolution, which is impractical in many realistic applications.

3D human pose and shape estimation Contrastive Learning +1

SMPLicit: Topology-aware Generative Model for Clothed People

1 code implementation CVPR 2021 Enric Corona, Albert Pumarola, Guillem Alenyà, Gerard Pons-Moll, Francesc Moreno-Noguer

In this paper we introduce SMPLicit, a novel generative model to jointly represent body pose, shape and clothing geometry.

3D Reconstruction

Multi-FinGAN: Generative Coarse-To-Fine Sampling of Multi-Finger Grasps

1 code implementation 17 Dec 2020 Jens Lundell, Enric Corona, Tran Nguyen Le, Francesco Verdoja, Philippe Weinzaepfel, Gregory Rogez, Francesc Moreno-Noguer, Ville Kyrki

While there exist many methods for manipulating rigid objects with parallel-jaw grippers, grasping with multi-finger robotic hands remains a largely unexplored research topic.

FaceDet3D: Facial Expressions with 3D Geometric Detail Prediction

no code implementations 14 Dec 2020 ShahRukh Athar, Albert Pumarola, Francesc Moreno-Noguer, Dimitris Samaras

The facial details are represented as a vertex displacement map and then used by a Neural Renderer to photo-realistically render novel images of any single image in any desired expression and view.

D-NeRF: Neural Radiance Fields for Dynamic Scenes

1 code implementation CVPR 2021 Albert Pumarola, Enric Corona, Gerard Pons-Moll, Francesc Moreno-Noguer

In this paper we introduce D-NeRF, a method that extends neural radiance fields to a dynamic domain, allowing it to reconstruct and render novel images of objects under rigid and non-rigid motions from a single camera moving around the scene.

Neural Rendering

PI-Net: Pose Interacting Network for Multi-Person Monocular 3D Pose Estimation

no code implementations 11 Oct 2020 Wen Guo, Enric Corona, Francesc Moreno-Noguer, Xavier Alameda-Pineda

Our pose interacting network, or PI-Net, inputs the initial pose estimates of a variable number of interactees into a recurrent architecture used to refine the pose of the person-of-interest.

3D Multi-Person Pose Estimation (root-relative) 3D Pose Estimation

3D Human Shape and Pose from a Single Low-Resolution Image with Self-Supervised Learning

2 code implementations ECCV 2020 Xiangyu Xu, Hao Chen, Francesc Moreno-Noguer, Laszlo A. Jeni, Fernando de la Torre

3D human shape and pose estimation from monocular images has been an active area of research in computer vision, having a substantial impact on the development of new applications, from activity recognition to creating virtual avatars.

3D Human Pose Estimation 3D Shape Reconstruction +4

Neural Cellular Automata Manifold

no code implementations CVPR 2021 Alejandro Hernandez Ruiz, Armand Vilalta, Francesc Moreno-Noguer

In biological terms, our approach would play the role of the transcription factors, modulating the mapping of genes into specific proteins that drive cellular differentiation, which occurs right before the morphogenesis.

Image Generation

Textual Visual Semantic Dataset for Text Spotting

no code implementations 21 Apr 2020 Ahmed Sabir, Francesc Moreno-Noguer, Lluís Padró

In this paper, we propose a visual context dataset for Text Spotting in the wild, where the publicly available dataset COCO-text [Veit et al. 2016] has been extended with information about the scene (such as objects and places appearing in the image) to enable researchers to include semantic relations between texts and scene in their Text Spotting systems, and to offer a common framework for such approaches.

text similarity Text Spotting

C-Flow: Conditional Generative Flow Models for Images and 3D Point Clouds

no code implementations CVPR 2020 Albert Pumarola, Stefan Popov, Francesc Moreno-Noguer, Vittorio Ferrari

Flow-based generative models have highly desirable properties like exact log-likelihood evaluation and exact latent-variable inference; however, they are still in their infancy and have not received as much attention as alternative generative models.

3D Reconstruction Image Manipulation +1

Improving Map Re-localization with Deep 'Movable' Objects Segmentation on 3D LiDAR Point Clouds

no code implementations 8 Oct 2019 Victor Vaquero, Kai Fischer, Francesc Moreno-Noguer, Alberto Sanfeliu, Stefan Milz

Results show that we are able to accurately re-locate over a filtered map, consistently reducing trajectory errors by an average of 35.1% with respect to a non-filtered map version and 47.9% with respect to a standalone map created on the current session.

Autonomous Vehicles

3DPeople: Modeling the Geometry of Dressed Humans

no code implementations ICCV 2019 Albert Pumarola, Jordi Sanchez, Gary P. T. Choi, Alberto Sanfeliu, Francesc Moreno-Noguer

Finally, we design a multi-resolution deep generative network that, given an input image of a dressed human, predicts his/her geometry image (and thus the clothed body shape) in an end-to-end manner.

3D Human Shape Estimation Optical Flow Estimation

Context-aware Human Motion Prediction

no code implementations CVPR 2020 Enric Corona, Albert Pumarola, Guillem Alenyà, Francesc Moreno-Noguer

The problem of predicting human motion given a sequence of past observations is at the core of many applications in robotics and computer vision.

Graph Attention Human motion prediction +1

Fast video object segmentation with Spatio-Temporal GANs

no code implementations 28 Mar 2019 Sergi Caelles, Albert Pumarola, Francesc Moreno-Noguer, Alberto Sanfeliu, Luc van Gool

To achieve this, we concentrate all the heavy computational load in the training phase with two critics that enforce spatial and temporal mask consistency over the last K frames.

Descriptive One-shot visual object segmentation +3

Human Motion Prediction via Spatio-Temporal Inpainting

1 code implementation 13 Dec 2018 Alejandro Hernandez Ruiz, Juergen Gall, Francesc Moreno-Noguer

First, we represent the data using a spatio-temporal tensor of 3D skeleton coordinates which allows formulating the prediction problem as an inpainting one, for which GANs work particularly well.

Human motion prediction Motion Forecasting +1

Visual Re-ranking with Natural Language Understanding for Text Spotting

3 code implementations 29 Oct 2018 Ahmed Sabir, Francesc Moreno-Noguer, Lluís Padró

We propose a post-processing approach to improve scene text recognition accuracy by using occurrence probabilities of words (unigram language model), and the semantic correlation between scene and text.

Language Modelling Natural Language Understanding +3

Visual Semantic Re-ranker for Text Spotting

1 code implementation 23 Oct 2018 Ahmed Sabir, Francesc Moreno-Noguer, Lluís Padró

In this paper, we propose a post-processing approach to improve the accuracy of text spotting by using the semantic relation between the text and the scene.

Text Spotting

Unsupervised Person Image Synthesis in Arbitrary Poses

no code implementations CVPR 2018 Albert Pumarola, Antonio Agudo, Alberto Sanfeliu, Francesc Moreno-Noguer

Given an input image of a person and a desired pose represented by a 2D skeleton, our model renders the image of the same person under the new pose, synthesizing novel views of the parts visible in the input image and hallucinating those that are not seen.

Image Generation

Hallucinating Dense Optical Flow from Sparse Lidar for Autonomous Vehicles

no code implementations 30 Aug 2018 Victor Vaquero, Alberto Sanfeliu, Francesc Moreno-Noguer

In this paper we propose a novel approach to estimate dense optical flow from sparse lidar data acquired on an autonomous vehicle.

Autonomous Vehicles Optical Flow Estimation

Deep Lidar CNN to Understand the Dynamics of Moving Vehicles

no code implementations 28 Aug 2018 Victor Vaquero, Alberto Sanfeliu, Francesc Moreno-Noguer

Perception technologies in Autonomous Driving are experiencing their golden age due to the advances in Deep Learning.

Autonomous Driving Optical Flow Estimation

Deconvolutional Networks for Point-Cloud Vehicle Detection and Tracking in Driving Scenarios

no code implementations 23 Aug 2018 Victor Vaquero, Ivan del Pino, Francesc Moreno-Noguer, Joan Solà, Alberto Sanfeliu, Juan Andrade-Cetto

The system is thoroughly evaluated on the KITTI tracking dataset, and we show the performance boost provided by our CNN-based vehicle detector over a standard geometric approach.

Autonomous Driving Multi-Object Tracking

Joint Coarse-And-Fine Reasoning for Deep Optical Flow

no code implementations 22 Aug 2018 Victor Vaquero, German Ros, Francesc Moreno-Noguer, Antonio M. Lopez, Alberto Sanfeliu

We propose a novel representation for dense pixel-wise estimation tasks using CNNs that boosts accuracy and reduces training time, by explicitly exploiting joint coarse-and-fine reasoning.

Optical Flow Estimation

Image Collection Pop-Up: 3D Reconstruction and Clustering of Rigid and Non-Rigid Categories

no code implementations CVPR 2018 Antonio Agudo, Melcior Pijoan, Francesc Moreno-Noguer

This paper introduces an approach to simultaneously estimate 3D shape, camera pose, and object and type of deformation clustering, from partial 2D annotations in a multi-instance collection of images.

3D Reconstruction

DUST: Dual Union of Spatio-Temporal Subspaces for Monocular Multiple Object 3D Reconstruction

no code implementations CVPR 2017 Antonio Agudo, Francesc Moreno-Noguer

We present an approach to reconstruct the 3D shape of multiple deforming objects from incomplete 2D trajectories acquired by a single camera.

3D Reconstruction

The BreakingNews Dataset

no code implementations WS 2017 Arnau Ramisa, Fei Yan, Francesc Moreno-Noguer, Krystian Mikolajczyk

We present BreakingNews, a novel dataset with approximately 100K news articles including images, text and captions, and enriched with heterogeneous meta-data (e.g., GPS coordinates and popularity metrics).

Image Captioning Sentiment Analysis

3D Human Pose Estimation from a Single Image via Distance Matrix Regression

no code implementations CVPR 2017 Francesc Moreno-Noguer

We follow a standard two-step pipeline by first detecting the 2D position of the N body joints, and then using these observations to infer 3D pose.

3D Human Pose Estimation regression

BreakingNews: Article Annotation by Image and Text Processing

no code implementations 23 Mar 2016 Arnau Ramisa, Fei Yan, Francesc Moreno-Noguer, Krystian Mikolajczyk

Deep Canonical Correlation Analysis is deployed for article illustration, and a new loss function based on Great Circle Distance is proposed for geolocation.

Image Retrieval Retrieval +1

Discriminative Learning of Deep Convolutional Feature Point Descriptors

1 code implementation ICCV 2015 Edgar Simo-Serra, Eduard Trulls, Luis Ferraz, Iasonas Kokkinos, Pascal Fua, Francesc Moreno-Noguer

Deep learning has revolutionized image-level tasks such as classification, but patch-level tasks, such as correspondence, still rely on hand-crafted features, e.g., SIFT.

Satellite Image Classification

Learning Shape, Motion and Elastic Models in Force Space

no code implementations ICCV 2015 Antonio Agudo, Francesc Moreno-Noguer

In this paper, we address the problem of simultaneously recovering the 3D shape and pose of a deformable and potentially elastic object from 2D motion.

Simultaneous Pose and Non-Rigid Shape With Particle Dynamics

no code implementations CVPR 2015 Antonio Agudo, Francesc Moreno-Noguer

In this paper, we propose a sequential solution to simultaneously estimate camera pose and non-rigid 3D shape from a monocular video.

TED: A Tolerant Edit Distance for Segmentation Evaluation

1 code implementation 8 Mar 2015 Jan Funke, Francesc Moreno-Noguer, Albert Cardona, Matthew Cook

This measure, which we call Tolerant Edit Distance (TED), is motivated by two observations: (1) Some errors, like small boundary shifts, are tolerable in practice.

Neuroaesthetics in Fashion: Modeling the Perception of Fashionability

no code implementations Conference 2015 Edgar Simo-Serra, Sanja Fidler, Francesc Moreno-Noguer, Raquel Urtasun

Importantly, our model is able to give rich feedback to the user, conveying which garments or even scenery she/he should change in order to improve fashionability.

Fracking Deep Convolutional Image Descriptors

no code implementations 19 Dec 2014 Edgar Simo-Serra, Eduard Trulls, Luis Ferraz, Iasonas Kokkinos, Francesc Moreno-Noguer

In this paper we propose a novel framework for learning local image descriptors in a discriminative manner.

Segmentation-aware Deformable Part Models

no code implementations CVPR 2014 Eduard Trulls, Stavros Tsogkas, Iasonas Kokkinos, Alberto Sanfeliu, Francesc Moreno-Noguer

In this work we propose a technique to combine bottom-up segmentation, coming in the form of SLIC superpixels, with sliding window detectors, such as Deformable Part Models (DPMs).

Optical Flow Estimation Superpixels

Very Fast Solution to the PnP Problem with Algebraic Outlier Rejection

no code implementations CVPR 2014 Luis Ferraz, Xavier Binefa, Francesc Moreno-Noguer

Given a set of 3D-to-2D matches, we formulate the pose estimation problem as a low-rank homogeneous system where the solution lies on its 1D null space.

Pose Estimation

Dense Segmentation-Aware Descriptors

no code implementations CVPR 2013 Eduard Trulls, Iasonas Kokkinos, Alberto Sanfeliu, Francesc Moreno-Noguer

In this work we exploit segmentation to construct appearance descriptors that can robustly deal with occlusion and background changes.

Motion Estimation

