no code implementations • 12 Feb 2025 • Pierre-Emmanuel Mazaré, Gergely Szilvasy, Maria Lomeli, Francisco Massa, Naila Murray, Hervé Jégou, Matthijs Douze
Self-attention in transformer models is an incremental associative memory that maps key vectors to value vectors.
1 code implementation • 1 Nov 2024 • Ruisi Zhang, Tianyu Liu, Will Feng, Andrew Gu, Sanket Purandare, Wanchao Liang, Francisco Massa
Distributed training of large models consumes enormous computation resources and requires substantial engineering efforts to compose various training techniques.
22 code implementations • 14 Apr 2023 • Maxime Oquab, Timothée Darcet, Théo Moutakanni, Huy Vo, Marc Szafraniec, Vasil Khalidov, Pierre Fernandez, Daniel Haziza, Francisco Massa, Alaaeldin El-Nouby, Mahmoud Assran, Nicolas Ballas, Wojciech Galuba, Russell Howes, Po-Yao Huang, Shang-Wen Li, Ishan Misra, Michael Rabbat, Vasu Sharma, Gabriel Synnaeve, Hu Xu, Hervé Jegou, Julien Mairal, Patrick Labatut, Armand Joulin, Piotr Bojanowski
The recent breakthroughs in natural language processing for model pretraining on large quantities of data have opened the way for similar foundation models in computer vision.
Ranked #1 on Image Retrieval on AmsterTime (using extra training data)
2 code implementations • 15 Nov 2022 • Simon Rouard, Francisco Massa, Alexandre Défossez
While it performs poorly when trained only on MUSDB, we show that it outperforms Hybrid Demucs (trained on the same data) by 0.45 dB of SDR when using 800 extra training songs.
Ranked #1 on Music Source Separation on MUSDB18 (using extra training data)
36 code implementations • 23 Dec 2020 • Hugo Touvron, Matthieu Cord, Matthijs Douze, Francisco Massa, Alexandre Sablayrolles, Hervé Jégou
In this work, we produce a competitive convolution-free transformer by training on ImageNet only.
Ranked #4 on Efficient ViTs on ImageNet-1K (with DeiT-S)
38 code implementations • ECCV 2020 • Nicolas Carion, Francisco Massa, Gabriel Synnaeve, Nicolas Usunier, Alexander Kirillov, Sergey Zagoruyko
We present a new method that views object detection as a direct set prediction problem.
Ranked #22 on Panoptic Segmentation on COCO minival
3 code implementations • NeurIPS 2019 • Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Köpf, Edward Yang, Zach DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, Soumith Chintala
Deep learning frameworks have often focused on either usability or speed, but not both.
4 code implementations • 6 Nov 2019 • Vijay Janapa Reddi, Christine Cheng, David Kanter, Peter Mattson, Guenther Schmuelling, Carole-Jean Wu, Brian Anderson, Maximilien Breughe, Mark Charlebois, William Chou, Ramesh Chukka, Cody Coleman, Sam Davis, Pan Deng, Greg Diamos, Jared Duke, Dave Fick, J. Scott Gardner, Itay Hubara, Sachin Idgunji, Thomas B. Jablin, Jeff Jiao, Tom St. John, Pankaj Kanwar, David Lee, Jeffery Liao, Anton Lokhmotov, Francisco Massa, Peng Meng, Paulius Micikevicius, Colin Osborne, Gennady Pekhimenko, Arun Tejusve Raghunath Rajan, Dilip Sequeira, Ashish Sirasao, Fei Sun, Hanlin Tang, Michael Thomson, Frank Wei, Ephrem Wu, Lingjie Xu, Koichi Yamada, Bing Yu, George Yuan, Aaron Zhong, Peizhao Zhang, Yuchen Zhou
Machine-learning (ML) hardware and software system demand is burgeoning.
no code implementations • 16 Nov 2017 • Joost van Amersfoort, Wenzhe Shi, Alejandro Acosta, Francisco Massa, Johannes Totz, Zehan Wang, Jose Caballero
To improve the quality of synthesised intermediate video frames, our network is jointly supervised at different levels with a perceptual loss function that consists of an adversarial and two content losses.
no code implementations • 13 Sep 2016 • Francisco Massa, Renaud Marlet, Mathieu Aubry
Convolutional Neural Networks (CNNs) were recently shown to provide state-of-the-art results for object category viewpoint estimation.
no code implementations • CVPR 2016 • Francisco Massa, Bryan Russell, Mathieu Aubry
This paper presents an end-to-end convolutional neural network (CNN) for 2D-3D exemplar detection.
no code implementations • 22 Dec 2014 • Francisco Massa, Mathieu Aubry, Renaud Marlet
In this paper we study the application of convolutional neural networks for jointly detecting objects depicted in still images and estimating their 3D pose.