no code implementations • 19 Dec 2024 • João Carreira, Dilara Gokay, Michael King, Chuhan Zhang, Ignacio Rocco, Aravindh Mahendran, Thomas Albert Keck, Joseph Heyward, Skanda Koppula, Etienne Pot, Goker Erdogan, Yana Hasson, Yi Yang, Klaus Greff, Guillaume Le Moing, Sjoerd van Steenkiste, Daniel Zoran, Drew A. Hudson, Pedro Vélez, Luisa Polanía, Luke Friedman, Chris Duvarney, Ross Goroshin, Kelsey Allen, Jacob Walker, Rishabh Kabra, Eric Aboussouan, Jennifer Sun, Thomas Kipf, Carl Doersch, Viorica Pătrăucean, Dima Damen, Pauline Luc, Mehdi S. M. Sajjadi, Andrew Zisserman
Scaling has not yet been convincingly demonstrated for pure self-supervised learning from video.
2 code implementations • 8 Jul 2024 • Skanda Koppula, Ignacio Rocco, Yi Yang, Joe Heyward, João Carreira, Andrew Zisserman, Gabriel Brostow, Carl Doersch
We introduce a new benchmark, TAPVid-3D, for evaluating the task of long-range Tracking Any Point in 3D (TAP-3D).
2 code implementations • 1 Feb 2024 • Carl Doersch, Pauline Luc, Yi Yang, Dilara Gokay, Skanda Koppula, Ankush Gupta, Joseph Heyward, Ignacio Rocco, Ross Goroshin, João Carreira, Andrew Zisserman
To endow models with greater understanding of physics and motion, it is useful to enable them to perceive how solid surfaces move and deform in real scenes.
Ranked #1 on Point Tracking on TAP-Vid-RGB-Stacking
1 code implementation • ICCV 2023 • Roman Shapovalov, Yanir Kleiman, Ignacio Rocco, David Novotny, Andrea Vedaldi, Changan Chen, Filippos Kokkinos, Ben Graham, Natalia Neverova
We introduce Replay, a collection of multi-view, multi-modal videos of humans interacting socially.
1 code implementation • 14 Jul 2023 • Nikita Karaev, Ignacio Rocco, Benjamin Graham, Natalia Neverova, Andrea Vedaldi, Christian Rupprecht
We introduce CoTracker, a transformer-based model that tracks a large number of 2D points in long video sequences.
Ranked #2 on Point Tracking on TAP-Vid-Kinetics-First
1 code implementation • CVPR 2023 • Nikita Karaev, Ignacio Rocco, Benjamin Graham, Natalia Neverova, Andrea Vedaldi, Christian Rupprecht
The network learns to pool information from neighboring frames to improve the temporal consistency of its predictions.
1 code implementation • 21 Mar 2023 • Ignacio Rocco, Iurii Makarov, Filippos Kokkinos, David Novotny, Benjamin Graham, Natalia Neverova, Andrea Vedaldi
We present a method for fast 3D reconstruction and real-time rendering of dynamic humans from monocular videos with accompanying parametric body fits.
1 code implementation • 6 Dec 2022 • Mohamed El Banani, Ignacio Rocco, David Novotny, Andrea Vedaldi, Natalia Neverova, Justin Johnson, Benjamin Graham
To address this, we propose a self-supervised approach for correspondence estimation that learns from multiview consistency in short RGB-D video sequences.
no code implementations • CVPR 2023 • Samarth Sinha, Roman Shapovalov, Jeremy Reizenstein, Ignacio Rocco, Natalia Neverova, Andrea Vedaldi, David Novotny
Obtaining photorealistic reconstructions of objects from sparse views is inherently ambiguous and can only be achieved by learning suitable reconstruction priors.
no code implementations • CVPR 2022 • Anastasia Ianina, Nikolaos Sarafianos, Yuanlu Xu, Ignacio Rocco, Tony Tung
Dense correspondence between humans carries powerful semantic information that can be utilized to solve fundamental problems for full-body understanding such as in-the-wild surface matching, tracking and reconstruction.
no code implementations • CVPR 2022 • David Novotny, Ignacio Rocco, Samarth Sinha, Alexandre Carlier, Gael Kerchenbaum, Roman Shapovalov, Nikita Smetanin, Natalia Neverova, Benjamin Graham, Andrea Vedaldi
Compared to weaker deformation models, this significantly reduces the reconstruction ambiguity and, for dynamic objects, allows Keypoint Transporter to obtain reconstructions of quality superior or at least comparable to prior approaches while being much faster and reliant on a pre-trained monocular depth estimator network.
no code implementations • 9 Sep 2021 • Dimitri Zhukov, Ignacio Rocco, Ivan Laptev, Josef Sivic, Johannes L. Schönberger, Bugra Tekin, Marc Pollefeys
Contrary to the standard scenario of instance-level 3D reconstruction, where identical objects or scenes are present in all views, objects in different instructional videos may have large appearance variations given varying conditions and versions of the same product.
1 code implementation • ECCV 2020 • Ignacio Rocco, Relja Arandjelović, Josef Sivic
In this work we target the problem of estimating accurately localised correspondences between a pair of images.
no code implementations • ICCV 2019 • Hajime Taira, Ignacio Rocco, Jiri Sedlar, Masatoshi Okutomi, Josef Sivic, Tomas Pajdla, Torsten Sattler, Akihiko Torii
The pose with the largest geometric consistency with the query image, e.g., in the form of an inlier count, is then selected in a second stage.
4 code implementations • 9 May 2019 • Mihai Dusmanu, Ignacio Rocco, Tomas Pajdla, Marc Pollefeys, Josef Sivic, Akihiko Torii, Torsten Sattler
In this work we address the problem of finding reliable pixel-level correspondences under difficult imaging conditions.
Ranked #8 on Image Matching on IMC PhotoTourism
3 code implementations • NeurIPS 2018 • Ignacio Rocco, Mircea Cimpoi, Relja Arandjelović, Akihiko Torii, Tomas Pajdla, Josef Sivic
Second, we demonstrate that the model can be trained effectively from weak supervision in the form of matching and non-matching image pairs without the need for costly manual annotation of point-to-point correspondences.
Ranked #2 on Semantic correspondence on PF-PASCAL (PCK (weak) metric)
2 code implementations • CVPR 2018 • Ignacio Rocco, Relja Arandjelović, Josef Sivic
We tackle the task of semantic alignment where the goal is to compute dense semantic correspondence aligning two images depicting objects of the same category.
5 code implementations • CVPR 2017 • Ignacio Rocco, Relja Arandjelović, Josef Sivic
We address the problem of determining correspondences between two images in agreement with a geometric model such as an affine or thin-plate spline transformation, and estimating its parameters.