Search Results for author: Xavier Giro-i-Nieto

Found 47 papers, 34 papers with code

Adversarial Learning for Feature Shift Detection and Correction

1 code implementation NeurIPS 2023 Miriam Barrabes, Daniel Mas Montserrat, Margarita Geleta, Xavier Giro-i-Nieto, Alexander G. Ioannidis

Data shift is a phenomenon present in many real-world applications, and while there are multiple methods attempting to detect shifts, the task of localizing and correcting the features originating such shifts has not been studied in depth.

Towards Robust Image-in-Audio Deep Steganography

1 code implementation 9 Mar 2023 Jaume Ros, Margarita Geleta, Jordi Pons, Xavier Giro-i-Nieto

The field of steganography has experienced a surge of interest due to the recent advancements in AI-powered techniques, particularly in the context of multimodal setups that enable the concealment of signals within signals of a different nature.

Image Reconstruction

SIRA: Relightable Avatars from a Single Image

no code implementations 7 Sep 2022 Pol Caselles, Eduard Ramon, Jaime Garcia, Xavier Giro-i-Nieto, Francesc Moreno-Noguer, Gil Triginer

Our key ingredients are two data-driven statistical models based on neural fields that resolve the ambiguities of single-view 3D surface reconstruction and appearance factorization.

Surface Reconstruction

Topic Detection in Continuous Sign Language Videos

1 code implementation 1 Sep 2022 Alvaro Budria, Laia Tarres, Gerard I. Gallego, Francesc Moreno-Noguer, Jordi Torres, Xavier Giro-i-Nieto

Significant progress has been made recently on challenging tasks in automatic sign language understanding, such as sign language recognition, translation and production.

Sign Language Recognition Translation

H3D-Net: Few-Shot High-Fidelity 3D Head Reconstruction

1 code implementation ICCV 2021 Eduard Ramon, Gil Triginer, Janna Escur, Albert Pumarola, Jaime Garcia, Xavier Giro-i-Nieto, Francesc Moreno-Noguer

In this paper, we tackle these limitations for the specific problem of few-shot full 3D head reconstruction, by endowing coordinate-based representations with a probabilistic shape prior that enables faster convergence and better generalization when using few input images (down to three).

3D Reconstruction Multi-View 3D Reconstruction +1

Can Everybody Sign Now? Exploring Sign Language Video Generation from 2D Poses

no code implementations 20 Dec 2020 Lucas Ventura, Amanda Duarte, Xavier Giro-i-Nieto

Recent work has addressed the generation of human poses represented by 2D/3D coordinates of human joints for sign language.

Sign Language Production Video Generation

RefVOS: A Closer Look at Referring Expressions for Video Object Segmentation

2 code implementations 1 Oct 2020 Miriam Bellver, Carles Ventura, Carina Silberer, Ioannis Kazakos, Jordi Torres, Xavier Giro-i-Nieto

The task of video object segmentation with referring expressions (language-guided VOS) is to, given a linguistic phrase and a video, generate binary masks for the object to which the phrase refers.

Image Segmentation Referring Expression Segmentation +2

Mask-guided sample selection for Semi-Supervised Instance Segmentation

no code implementations 25 Aug 2020 Miriam Bellver, Amaia Salvador, Jordi Torres, Xavier Giro-i-Nieto

Our method consists in first predicting pseudo-masks for the unlabeled pool of samples, together with a score predicting the quality of the mask.

Active Learning Image Segmentation +4

How2Sign: A Large-scale Multimodal Dataset for Continuous American Sign Language

1 code implementation CVPR 2021 Amanda Duarte, Shruti Palaskar, Lucas Ventura, Deepti Ghadiyaram, Kenneth DeHaan, Florian Metze, Jordi Torres, Xavier Giro-i-Nieto

Towards this end, we introduce How2Sign, a multimodal and multiview continuous American Sign Language (ASL) dataset, consisting of a parallel corpus of more than 80 hours of sign language videos and a set of corresponding modalities including speech, English transcripts, and depth.

Sign Language Production Sign Language Translation +1

Transcription-Enriched Joint Embeddings for Spoken Descriptions of Images and Videos

no code implementations 1 Jun 2020 Benet Oriol, Jordi Luque, Ferran Diego, Xavier Giro-i-Nieto

In this work, we propose an effective approach for training unique embedding representations by combining three simultaneous modalities: image and spoken and textual narratives.

Retrieval

Explore, Discover and Learn: Unsupervised Discovery of State-Covering Skills

1 code implementation ICML 2020 Víctor Campos, Alexander Trott, Caiming Xiong, Richard Socher, Xavier Giro-i-Nieto, Jordi Torres

We perform an extensive evaluation of skill discovery methods on controlled environments and show that EDL offers significant advantages, such as overcoming the coverage problem, reducing the dependence of learned skills on the initial state, and allowing the user to define a prior over which behaviors should be learned.

Automatic Reminiscence Therapy for Dementia

1 code implementation 25 Oct 2019 Mariona Caros, Maite Garolera, Petia Radeva, Xavier Giro-i-Nieto

With people living longer than ever, the number of cases with dementia such as Alzheimer's disease increases steadily.

Hate Speech in Pixels: Detection of Offensive Memes towards Automatic Moderation

1 code implementation 5 Oct 2019 Benet Oriol Sabat, Cristian Canton Ferrer, Xavier Giro-i-Nieto

This work addresses the challenge of hate speech detection in Internet memes and attempts to use visual information to detect hate speech automatically, which, to our knowledge, no previous work has done.

Hate Speech Detection

Simple vs complex temporal recurrences for video saliency prediction

2 code implementations 3 Jul 2019 Panagiotis Linardos, Eva Mohedano, Juan Jose Nieto, Noel E. O'Connor, Xavier Giro-i-Nieto, Kevin McGuinness

This paper investigates modifying an existing neural network architecture for static saliency prediction using two types of recurrences that integrate information from the temporal domain.

Saliency Prediction Video Saliency Detection +1

Budget-aware Semi-Supervised Semantic and Instance Segmentation

no code implementations 14 May 2019 Miriam Bellver, Amaia Salvador, Jordi Torres, Xavier Giro-i-Nieto

Methods that move towards less supervised scenarios are key for image segmentation, as dense labels demand significant human intervention.

Image Segmentation Instance Segmentation +2

RVOS: End-to-End Recurrent Network for Video Object Segmentation

1 code implementation CVPR 2019 Carles Ventura, Miriam Bellver, Andreu Girbau, Amaia Salvador, Ferran Marques, Xavier Giro-i-Nieto

Multiple-object video object segmentation is a challenging task, especially in the zero-shot case, where no object mask is given at the initial frame and the model has to find the objects to be segmented along the sequence.

Object One-shot visual object segmentation +3

Inverse Cooking: Recipe Generation from Food Images

4 code implementations CVPR 2019 Amaia Salvador, Michal Drozdzal, Xavier Giro-i-Nieto, Adriana Romero

Our system predicts ingredients as sets by means of a novel architecture, modeling their dependencies without imposing any order, and then generates cooking instructions by attending to both image and its inferred ingredients simultaneously.

Recipe Generation Retrieval

Importance Weighted Evolution Strategies

no code implementations 12 Nov 2018 Víctor Campos, Xavier Giro-i-Nieto, Jordi Torres

Evolution Strategies (ES) emerged as a scalable alternative to popular Reinforcement Learning (RL) techniques, providing an almost perfect speedup when distributed across hundreds of CPU cores thanks to a reduced communication overhead.

reinforcement-learning Reinforcement Learning (RL)

PathGAN: Visual Scanpath Prediction with Generative Adversarial Networks

1 code implementation 3 Sep 2018 Marc Assens, Xavier Giro-i-Nieto, Kevin McGuinness, Noel E. O'Connor

We introduce PathGAN, a deep neural network for visual scanpath prediction trained on adversarial examples.

Scanpath prediction

Temporal Saliency Adaptation in Egocentric Videos

2 code implementations 28 Aug 2018 Panagiotis Linardos, Eva Mohedano, Monica Cherto, Cathal Gurrin, Xavier Giro-i-Nieto

This work adapts a deep neural model for image saliency prediction to the temporal domain of egocentric video.

Saliency Prediction Video Saliency Prediction

Comparing Fixed and Adaptive Computation Time for Recurrent Neural Networks

no code implementations 21 Mar 2018 Daniel Fojo, Víctor Campos, Xavier Giro-i-Nieto

Adaptive Computation Time for Recurrent Neural Networks (ACT) is one of the most promising architectures for variable computation.

Recurrent Neural Networks for Semantic Instance Segmentation

1 code implementation 2 Dec 2017 Amaia Salvador, Miriam Bellver, Victor Campos, Manel Baradad, Ferran Marques, Jordi Torres, Xavier Giro-i-Nieto

We present a recurrent model for semantic instance segmentation that sequentially generates binary masks and their associated class probabilities for every object in an image.

Instance Segmentation Object +2

Detection-aided liver lesion segmentation using deep learning

2 code implementations 29 Nov 2017 Miriam Bellver, Kevis-Kokitsi Maninis, Jordi Pont-Tuset, Xavier Giro-i-Nieto, Jordi Torres, Luc van Gool

A fully automatic technique for segmenting the liver and localizing its unhealthy tissues is a convenient tool for diagnosing hepatic diseases and assessing the response to the corresponding treatments.

Computed Tomography (CT) Lesion Segmentation +1

Saliency Weighted Convolutional Features for Instance Search

1 code implementation 29 Nov 2017 Eva Mohedano, Kevin McGuinness, Xavier Giro-i-Nieto, Noel E. O'Connor

This work explores attention models to weight the contribution of local convolutional representations for the instance search task.

Instance Search Retrieval

Cost-Effective Active Learning for Melanoma Segmentation

2 code implementations 24 Nov 2017 Marc Gorriz, Axel Carlier, Emmanuel Faure, Xavier Giro-i-Nieto

We propose a novel Active Learning framework capable of effectively training a convolutional neural network for semantic segmentation of medical imaging with a limited amount of labeled training data.

Active Learning Image Segmentation +3

Skip RNN: Learning to Skip State Updates in Recurrent Neural Networks

3 code implementations ICLR 2018 Victor Campos, Brendan Jou, Xavier Giro-i-Nieto, Jordi Torres, Shih-Fu Chang

We introduce the Skip RNN model which extends existing RNN models by learning to skip state updates and shortens the effective size of the computational graph.

More cat than cute? Interpretable Prediction of Adjective-Noun Pairs

1 code implementation 21 Aug 2017 Delia Fernandez, Alejandro Woodward, Victor Campos, Xavier Giro-i-Nieto, Brendan Jou, Shih-Fu Chang

This work aims at disentangling the contributions of the 'adjectives' and 'nouns' in the visual prediction of ANPs.

Disentangling Motion, Foreground and Background Features in Videos

1 code implementation 13 Jul 2017 Xunyu Lin, Victor Campos, Xavier Giro-i-Nieto, Jordi Torres, Cristian Canton Ferrer

This paper introduces an unsupervised framework to extract semantically rich features for video representation.

SaltiNet: Scan-path Prediction on 360 Degree Images using Saliency Volumes

1 code implementation 11 Jul 2017 Marc Assens, Kevin McGuinness, Xavier Giro-i-Nieto, Noel E. O'Connor

The first part of the network consists of a model trained to generate saliency volumes, whose parameters are fit by back-propagation computed from a binary cross entropy (BCE) loss over downsampled versions of the saliency volumes.

Scanpath prediction

Class-Weighted Convolutional Features for Visual Instance Search

2 code implementations 9 Jul 2017 Albert Jimenez, Jose M. Alvarez, Xavier Giro-i-Nieto

In this paper, we go beyond this spatial information and propose a local-aware encoding of convolutional features based on semantic information predicted in the target image.

Image Retrieval Instance Search +2

Hierarchical Object Detection with Deep Reinforcement Learning

1 code implementation 11 Nov 2016 Miriam Bellver, Xavier Giro-i-Nieto, Ferran Marques, Jordi Torres

We argue that, while this loss seems unavoidable when working with large numbers of object candidates, the much smaller number of region proposals generated by our reinforcement learning agent makes it feasible to extract features for each location without sharing convolutional computation among regions.

Object object-detection +4

Open-Ended Visual Question-Answering

1 code implementation 9 Oct 2016 Issey Masuda, Santiago Pascual de la Puente, Xavier Giro-i-Nieto

This thesis report studies methods to solve Visual Question-Answering (VQA) tasks with a Deep Learning framework.

Question Answering Sentence +3

Where is my Phone ? Personal Object Retrieval from Egocentric Images

no code implementations 29 Aug 2016 Cristian Reyes, Eva Mohedano, Kevin McGuinness, Noel E. O'Connor, Xavier Giro-i-Nieto

This work presents a retrieval pipeline and evaluation scheme for the problem of finding the last appearance of personal objects in a large dataset of images captured from a wearable camera.

Retrieval

Temporal Activity Detection in Untrimmed Videos with Recurrent Neural Networks

3 code implementations 29 Aug 2016 Alberto Montes, Amaia Salvador, Santiago Pascual, Xavier Giro-i-Nieto

This thesis explores different approaches using Convolutional and Recurrent Neural Networks to classify and temporally localize activities in videos, and proposes an implementation to achieve this.

Action Detection Activity Detection

Faster R-CNN Features for Instance Search

3 code implementations 29 Apr 2016 Amaia Salvador, Xavier Giro-i-Nieto, Ferran Marques, Shin'ichi Satoh

This work explores the suitability for instance retrieval of image- and region-wise representations pooled from an object detection CNN such as Faster R-CNN.

Instance Search object-detection +3

Bags of Local Convolutional Features for Scalable Instance Search

2 code implementations 15 Apr 2016 Eva Mohedano, Amaia Salvador, Kevin McGuinness, Ferran Marques, Noel E. O'Connor, Xavier Giro-i-Nieto

This work proposes a simple instance retrieval pipeline based on encoding the convolutional features of CNN using the bag of words aggregation scheme (BoW).

Instance Search Retrieval

From Pixels to Sentiment: Fine-tuning CNNs for Visual Sentiment Prediction

2 code implementations 12 Apr 2016 Victor Campos, Brendan Jou, Xavier Giro-i-Nieto

Visual multimedia have become an inseparable part of our digital social lives, and they often capture moments tied with deep affections.

Sentiment Analysis Visual Sentiment Prediction

Shallow and Deep Convolutional Networks for Saliency Prediction

1 code implementation CVPR 2016 Junting Pan, Kevin McGuinness, Elisa Sayrol, Noel O'Connor, Xavier Giro-i-Nieto

The prediction of salient areas in images has been traditionally addressed with hand-crafted features based on neuroscience principles.

Saliency Prediction

Cultural Event Recognition with Visual ConvNets and Temporal Models

no code implementations 24 Apr 2015 Amaia Salvador, Matthias Zeppelzauer, Daniel Manchon-Vizuete, Andrea Calafell, Xavier Giro-i-Nieto

Our solution is based on the combination of visual features extracted from convolutional neural networks with temporal information using a hierarchical classifier scheme.

Classification General Classification
