1 code implementation • ICCV 2023 • Carl Doersch, Yi Yang, Mel Vecerik, Dilara Gokay, Ankush Gupta, Yusuf Aytar, Joao Carreira, Andrew Zisserman
We present a novel model for Tracking Any Point (TAP) that effectively tracks any queried point on any physical surface throughout a video sequence.
Ranked #1 on Visual Tracking on Kinetics
no code implementations • 6 Oct 2022 • Olivia Wiles, Joao Carreira, Iain Barr, Andrew Zisserman, Mateusz Malinowski
In this work, we propose a framework enabling research on hour-long videos with the same hardware that can now process second-long videos.
2 code implementations • 22 Feb 2022 • Joao Carreira, Skanda Koppula, Daniel Zoran, Adria Recasens, Catalin Ionescu, Olivier Henaff, Evan Shelhamer, Relja Arandjelovic, Matt Botvinick, Oriol Vinyals, Karen Simonyan, Andrew Zisserman, Andrew Jaegle
This, however, hinders them from scaling up to the input sizes required to process raw high-resolution images or video.
no code implementations • CVPR 2021 • Mateusz Malinowski, Dimitrios Vytiniotis, Grzegorz Swirszcz, Viorica Patraucean, Joao Carreira
How can neural networks be trained on large-volume temporal data efficiently?
10 code implementations • 4 Mar 2021 • Andrew Jaegle, Felix Gimeno, Andrew Brock, Andrew Zisserman, Oriol Vinyals, Joao Carreira
The perception models used in deep learning, on the other hand, are designed for individual modalities, often relying on domain-specific assumptions such as the local grid structures exploited by virtually all existing vision models.
Ranked #29 on Audio Classification on AudioSet
no code implementations • CVPR 2020 • Mateusz Malinowski, Grzegorz Swirszcz, Joao Carreira, Viorica Patraucean
We propose Sideways, an approximate backpropagation scheme for training video models.
no code implementations • 15 Jul 2019 • Joao Carreira, Eric Noland, Chloe Hillier, Andrew Zisserman
We describe an extension of the DeepMind Kinetics human action dataset from 600 classes to 700 classes, where for each class there are at least 600 video clips from different YouTube videos.
no code implementations • 9 Feb 2019 • Eric Jonas, Johann Schleier-Smith, Vikram Sreekanti, Chia-Che Tsai, Anurag Khandelwal, Qifan Pu, Vaishaal Shankar, Joao Carreira, Karl Krauth, Neeraja Yadwadkar, Joseph E. Gonzalez, Raluca Ada Popa, Ion Stoica, David A. Patterson
Serverless cloud computing handles virtually all the system administration operations needed to make it easier for programmers to use the cloud.
Operating Systems
1 code implementation • 3 Aug 2018 • Joao Carreira, Eric Noland, Andras Banki-Horvath, Chloe Hillier, Andrew Zisserman
We describe an extension of the DeepMind Kinetics human action dataset from 400 classes, each with at least 400 video clips, to 600 classes, each with at least 600 video clips.
Ranked #61 on Action Classification on Kinetics-600
no code implementations • ECCV 2018 • Joao Carreira, Viorica Patraucean, Laurent Mazare, Andrew Zisserman, Simon Osindero
We introduce a class of causal video understanding models that aims to improve efficiency of video processing by maximising throughput, minimising latency, and reducing the number of clock cycles.
33 code implementations • CVPR 2017 • Joao Carreira, Andrew Zisserman
The paucity of videos in current action classification datasets (UCF-101 and HMDB-51) has made it difficult to identify good video architectures, as most methods obtain similar performance on existing small-scale benchmarks.
12 code implementations • 19 May 2017 • Will Kay, Joao Carreira, Karen Simonyan, Brian Zhang, Chloe Hillier, Sudheendra Vijayanarasimhan, Fabio Viola, Tim Green, Trevor Back, Paul Natsev, Mustafa Suleyman, Andrew Zisserman
We describe the DeepMind Kinetics human action video dataset.
1 code implementation • CVPR 2016 • Joao Carreira, Pulkit Agrawal, Katerina Fragkiadaki, Jitendra Malik
Hierarchical feature extractors such as Convolutional Networks (ConvNets) have achieved impressive performance on a variety of classification tasks using purely feedforward processing.
Ranked #43 on Pose Estimation on MPII Human Pose
no code implementations • ICCV 2015 • Pulkit Agrawal, Joao Carreira, Jitendra Malik
We show that given the same number of training images, features learnt using egomotion as supervision compare favourably to features learnt using class-label as supervision on visual tasks of scene recognition, object recognition, visual odometry and keypoint matching.
no code implementations • 22 Mar 2015 • Joao Carreira, Sara Vicente, Lourdes Agapito, Jorge Batista
In particular, acquiring ground truth 3D shapes of objects pictured in 2D images remains a challenging feat and this has hampered progress in recognition-based object reconstruction from a single image.
no code implementations • CVPR 2014 • Catalin Ionescu, Joao Carreira, Cristian Sminchisescu
Recently, the emergence of Kinect systems has demonstrated the benefits of predicting an intermediate body part labeling for 3D human pose estimation, in conjunction with RGB-D imagery.
no code implementations • CVPR 2014 • Sara Vicente, Joao Carreira, Lourdes Agapito, Jorge Batista
We address the problem of populating object category detection datasets with dense, per-object 3D reconstructions, bootstrapped from class labels, ground truth figure-ground segmentations and a small set of keypoint annotations.
no code implementations • CVPR 2013 • Fuxin Li, Joao Carreira, Guy Lebanon, Cristian Sminchisescu
In this paper we present an inference procedure for the semantic segmentation of images.
no code implementations • NeurIPS 2011 • Adrian Ion, Joao Carreira, Cristian Sminchisescu
We present a joint image segmentation and labeling model (JSL) which, given a bag of figure-ground segment hypotheses extracted at multiple image locations and scales, constructs a joint probability distribution over both the compatible image interpretations (tilings or image segmentations) composed from those segments, and over their labeling into categories.