Search Results for author: João Carreira

Found 14 papers, 5 papers with code

Perceiver IO: A General Architecture for Structured Inputs & Outputs

5 code implementations30 Jul 2021 Andrew Jaegle, Sebastian Borgeaud, Jean-Baptiste Alayrac, Carl Doersch, Catalin Ionescu, David Ding, Skanda Koppula, Daniel Zoran, Andrew Brock, Evan Shelhamer, Olivier Hénaff, Matthew M. Botvinick, Andrew Zisserman, Oriol Vinyals, João Carreira

The recently-proposed Perceiver model obtains good results on several domains (images, audio, multimodal, point clouds) while scaling linearly in compute and memory with the input size.

 Ranked #1 on Optical Flow Estimation on KITTI 2015 (Average End-Point Error metric)

Optical Flow Estimation Starcraft +2

A Short Note on the Kinetics-700-2020 Human Action Dataset

no code implementations21 Oct 2020 Lucas Smaira, João Carreira, Eric Noland, Ellen Clancy, Amy Wu, Andrew Zisserman

We describe the 2020 edition of the DeepMind Kinetics human action dataset, which replenishes and extends the Kinetics-700 dataset.

The AVA-Kinetics Localized Human Actions Video Dataset

no code implementations1 May 2020 Ang Li, Meghana Thotakuri, David A. Ross, João Carreira, Alexander Vostrikov, Andrew Zisserman

The dataset is collected by annotating videos from the Kinetics-700 dataset using the AVA annotation protocol, and extending the original AVA dataset with these new AVA annotated Kinetics clips.

Action Classification

Visual Grounding in Video for Unsupervised Word Translation

1 code implementation CVPR 2020 Gunnar A. Sigurdsson, Jean-Baptiste Alayrac, Aida Nematzadeh, Lucas Smaira, Mateusz Malinowski, João Carreira, Phil Blunsom, Andrew Zisserman

Given this shared embedding we demonstrate that (i) we can map words between the languages, particularly the 'visual' words; (ii) that the shared embedding provides a good initialization for existing unsupervised text-based word translation techniques, forming the basis for our proposed hybrid visual-text mapping algorithm, MUVE; and (iii) our approach achieves superior performance by addressing the shortcomings of text-based methods -- it is more robust, handles datasets with less commonality, and is applicable to low-resource languages.

Translation Visual Grounding

Controllable Attention for Structured Layered Video Decomposition

no code implementations ICCV 2019 Jean-Baptiste Alayrac, João Carreira, Relja Arandjelović, Andrew Zisserman

The objective of this paper is to be able to separate a video into its natural layers, and to control which of the separated layers to attend to.

Action Recognition Reflection Removal

The Visual Centrifuge: Model-Free Layered Video Representations

1 code implementation CVPR 2019 Jean-Baptiste Alayrac, João Carreira, Andrew Zisserman

True video understanding requires making sense of non-lambertian scenes where the color of light arriving at the camera sensor encodes information about not just the last object it collided with, but about multiple mediums -- colored windows, dirty mirrors, smoke or rain.

Color Constancy Video Understanding

Shape and Symmetry Induction for 3D Objects

no code implementations24 Nov 2015 Shubham Tulsiani, Abhishek Kar, Qi-Xing Huang, João Carreira, Jitendra Malik

Actions as simple as grasping an object or navigating around it require a rich understanding of that object's 3D shape from a given viewpoint.

General Classification Object Classification

Amodal Completion and Size Constancy in Natural Scenes

no code implementations ICCV 2015 Abhishek Kar, Shubham Tulsiani, João Carreira, Jitendra Malik

We consider the problem of enriching current object detection systems with veridical object sizes and relative depth estimates from a single image.

Object Detection Object Recognition +1

Pose Induction for Novel Object Categories

1 code implementation ICCV 2015 Shubham Tulsiani, João Carreira, Jitendra Malik

We address the task of predicting pose for objects of unannotated object categories from a small seed set of annotated object classes.

Category-Specific Object Reconstruction from a Single Image

no code implementations CVPR 2015 Abhishek Kar, Shubham Tulsiani, João Carreira, Jitendra Malik

Object reconstruction from a single image -- in the wild -- is a problem where we can make progress and get meaningful results today.

Object Detection Object Reconstruction

Cannot find the paper you are looking for? You can Submit a new open access paper.