no code implementations • 19 Dec 2024 • João Carreira, Dilara Gokay, Michael King, Chuhan Zhang, Ignacio Rocco, Aravindh Mahendran, Thomas Albert Keck, Joseph Heyward, Skanda Koppula, Etienne Pot, Goker Erdogan, Yana Hasson, Yi Yang, Klaus Greff, Guillaume Le Moing, Sjoerd van Steenkiste, Daniel Zoran, Drew A. Hudson, Pedro Vélez, Luisa Polanía, Luke Friedman, Chris Duvarney, Ross Goroshin, Kelsey Allen, Jacob Walker, Rishabh Kabra, Eric Aboussouan, Jennifer Sun, Thomas Kipf, Carl Doersch, Viorica Pătrăucean, Dima Damen, Pauline Luc, Mehdi S. M. Sajjadi, Andrew Zisserman
Scaling has not yet been convincingly demonstrated for pure self-supervised learning from video.
1 code implementation • CVPR 2024 • Guillaume Le Moing, Jean Ponce, Cordelia Schmid
Code, data, and videos showcasing the capabilities of our approach are available in the project webpage: https://16lemoing. github. io/dot .
1 code implementation • ICCV 2023 • Guillaume Le Moing, Jean Ponce, Cordelia Schmid
This paper presents WALDO (WArping Layer-Decomposed Objects), a novel approach to the prediction of future video frames from past ones.
1 code implementation • NeurIPS 2021 • Guillaume Le Moing, Jean Ponce, Cordelia Schmid
The prediction model is doubly autoregressive, in the latent space of an autoencoder for forecasting, and in image space for updating contextual information, which is also used to enforce spatio-temporal consistency through a learnable optical flow module.
Ranked #8 on
Video Generation
on BAIR Robot Pushing
1 code implementation • CVPR 2021 • Guillaume Le Moing, Tuan-Hung Vu, Himalaya Jain, Patrick Pérez, Matthieu Cord
Despite the recent progress of generative adversarial networks (GANs) at synthesizing photo-realistic images, producing complex urban scenes remains a challenging problem.
no code implementations • 10 Dec 2020 • Guillaume Le Moing, Don Joven Agravante, Tadanobu Inoue, Jayakorn Vongkulbhisal, Asim Munawar, Ryuki Tachibana, Phongtharin Vinayavekhin
This paper introduces an ensemble of discriminators that improves the accuracy of a domain adaptation technique for the localization of multiple sound sources.
no code implementations • 10 Dec 2020 • Guillaume Le Moing, Phongtharin Vinayavekhin, Don Joven Agravante, Tadanobu Inoue, Jayakorn Vongkulbhisal, Asim Munawar, Ryuki Tachibana
Moreover, learning for different microphone array layouts makes the task more complicated due to the infinite number of possible layouts.
no code implementations • 10 Dec 2020 • Guillaume Le Moing, Phongtharin Vinayavekhin, Tadanobu Inoue, Jayakorn Vongkulbhisal, Asim Munawar, Ryuki Tachibana, Don Joven Agravante
In this paper, we propose novel deep learning based algorithms for multiple sound source localization.