Search Results for author: Adrià Recasens

Found 6 papers, 4 papers with code

Self-Supervised MultiModal Versatile Networks

1 code implementation NeurIPS 2020 Jean-Baptiste Alayrac, Adrià Recasens, Rosalia Schneider, Relja Arandjelović, Jason Ramapuram, Jeffrey De Fauw, Lucas Smaira, Sander Dieleman, Andrew Zisserman

In particular, we explore how best to combine the modalities, such that fine-grained representations of the visual and audio modalities can be maintained, whilst also integrating text into a common embedding.

Action Recognition In Videos Audio Classification +2

Learning to Zoom: a Saliency-Based Sampling Layer for Neural Networks

1 code implementation ECCV 2018 Adrià Recasens, Petr Kellnhofer, Simon Stent, Wojciech Matusik, Antonio Torralba

We introduce a saliency-based distortion layer for convolutional neural networks that helps to improve the spatial sampling of input data for a given task.

Caricature Gaze Estimation +2

Synthetically Trained Icon Proposals for Parsing and Summarizing Infographics

1 code implementation27 Jul 2018 Spandan Madan, Zoya Bylinskii, Matthew Tancik, Adrià Recasens, Kimberli Zhong, Sami Alsheikh, Hanspeter Pfister, Aude Oliva, Fredo Durand

While automatic text extraction works well on infographics, computer vision approaches trained on natural images fail to identify the stand-alone visual elements in infographics, or `icons'.

14 Synthetic Data Generation

Jointly Discovering Visual Objects and Spoken Words from Raw Sensory Input

no code implementations ECCV 2018 David Harwath, Adrià Recasens, Dídac Surís, Galen Chuang, Antonio Torralba, James Glass

In this paper, we explore neural network models that learn to associate segments of spoken audio captions with the semantically relevant portions of natural images that they refer to.

Following Gaze Across Views

no code implementations9 Dec 2016 Adrià Recasens, Carl Vondrick, Aditya Khosla, Antonio Torralba

In this paper, we present an approach for following gaze across views by predicting where a particular person is looking throughout a scene.

Cannot find the paper you are looking for? You can Submit a new open access paper.