no code implementations • CVPR 2014 • Hossein Mobahi, Ce Liu, William T. Freeman
Learning a low-dimensional representation of images is useful for various applications in graphics and computer vision.
no code implementations • CVPR 2015 • Tianfan Xue, Hossein Mobahi, Fredo Durand, William T. Freeman
We pose and solve a generalization of the aperture problem for moving refractive elements.
no code implementations • NeurIPS 2015 • Charlie Frogner, Chiyuan Zhang, Hossein Mobahi, Mauricio Araya-Polo, Tomaso Poggio
In this paper we develop a loss function for multi-label learning, based on the Wasserstein distance.
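A minimal sketch of the general idea (not necessarily the paper's exact formulation): an entropically regularized (Sinkhorn) approximation of the Wasserstein distance between a predicted label distribution and a ground-truth label distribution, given a user-supplied cost matrix over labels. The cost matrix, regularization strength, and iteration count below are illustrative assumptions.

```python
import torch

def sinkhorn_wasserstein(pred, target, cost, eps=0.1, n_iters=50):
    """Entropic (Sinkhorn) approximation of the Wasserstein distance between two
    label distributions `pred` and `target` (each summing to 1), where cost[i, j]
    is the ground cost of moving probability mass from label i to label j."""
    K = torch.exp(-cost / eps)                      # Gibbs kernel
    u = torch.ones_like(pred)
    v = torch.ones_like(target)
    for _ in range(n_iters):                        # Sinkhorn fixed-point iterations
        u = pred / (K @ v).clamp_min(1e-12)
        v = target / (K.T @ u).clamp_min(1e-12)
    transport = torch.diag(u) @ K @ torch.diag(v)   # approximate transport plan
    return (transport * cost).sum()

# Toy usage: 4 labels, ground cost = squared distance between label indices.
idx = torch.arange(4, dtype=torch.float32)
cost = (idx[:, None] - idx[None, :]) ** 2
pred = torch.softmax(torch.randn(4), dim=0)
target = torch.tensor([0.0, 0.5, 0.5, 0.0])
loss = sinkhorn_wasserstein(pred, target, cost)
```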
no code implementations • 16 Jan 2016 • Hossein Mobahi
This work presents a new algorithm for training recurrent neural networks (although the ideas are applicable to feedforward networks as well).
no code implementations • 19 Jan 2016 • Hossein Mobahi, Stefano Soatto
Can it suggest new algorithms with reduced computational complexity or new descriptors with better accuracy for matching?
no code implementations • 28 Oct 2016 • Anima Anandkumar, Yuan Deng, Rong Ge, Hossein Mobahi
For the challenging problem of tensor PCA, we prove global convergence of the homotopy method in the "high noise" regime.
2 code implementations • NeurIPS 2018 • Gamaleldin F. Elsayed, Dilip Krishnan, Hossein Mobahi, Kevin Regan, Samy Bengio
We present a formulation of deep learning that aims at producing a large margin classifier.
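One common way such a margin can be made differentiable, sketched here under simplifying assumptions (a single example and the Euclidean norm), is a first-order approximation of the distance from the input to the decision boundary between the true class and a competing class; the helper names and the hinge penalty are illustrative, not the paper's exact objective.

```python
import torch
import torch.nn.functional as F

def first_order_margin(model, x, y, j):
    """First-order approximation of the distance from input x (one example) to the
    decision boundary between the true class y and a competing class j:
        margin ~ (f_y(x) - f_j(x)) / ||grad_x (f_y(x) - f_j(x))||_2
    """
    x = x.clone().requires_grad_(True)
    logits = model(x)
    diff = logits[0, y] - logits[0, j]            # score gap for this example
    grad, = torch.autograd.grad(diff, x, create_graph=True)
    return diff / grad.flatten().norm().clamp_min(1e-12)

def large_margin_penalty(margin, gamma=1.0):
    """Hinge-style penalty encouraging the margin to be at least gamma."""
    return F.relu(gamma - margin)
```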
2 code implementations • ICLR 2019 • Yiding Jiang, Dilip Krishnan, Hossein Mobahi, Samy Bengio
In this paper, we propose such a measure, and conduct extensive empirical studies on how well it can predict the generalization gap.
no code implementations • 29 Jan 2019 • Vighnesh Birodkar, Hossein Mobahi, Samy Bengio
Large datasets have been crucial to the success of deep learning models in recent years; these models keep performing better as they are trained with more labelled data.
no code implementations • 10 Jun 2019 • Vighnesh Birodkar, Hossein Mobahi, Dilip Krishnan, Samy Bengio
This operator can learn a strict superset of what can be learned by average pooling or convolutions.
3 code implementations • ICLR 2020 • Yiding Jiang, Behnam Neyshabur, Hossein Mobahi, Dilip Krishnan, Samy Bengio
We present the first large scale study of generalization in deep networks.
no code implementations • NeurIPS 2020 • Hossein Mobahi, Mehrdad Farajtabar, Peter L. Bartlett
Knowledge distillation, introduced in the deep learning context, is a method to transfer knowledge from one architecture to another.
14 code implementations • ICLR 2021 • Pierre Foret, Ariel Kleiner, Hossein Mobahi, Behnam Neyshabur
In today's heavily overparameterized models, the value of the training loss provides few guarantees on model generalization ability.
Ranked #1 on Image Classification on CIFAR-100 (using extra training data)
Tasks: Fine-Grained Image Classification, Learning with noisy labels
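This entry is the sharpness-aware minimization (SAM) paper; below is a minimal sketch of its perturb-then-update step (ascend to an approximate worst-case nearby weight setting, then descend using the gradient taken there). It omits the scaling, weight-decay, and distributed-training details of real implementations, and the value of rho is illustrative.

```python
import torch

def sam_step(model, loss_fn, x, y, base_optimizer, rho=0.05):
    """One simplified sharpness-aware minimization (SAM) step:
    1) ascend: perturb the weights by rho * g / ||g|| to approximately maximize
       the loss in a small neighborhood of the current weights,
    2) descend: apply the base optimizer using gradients taken at the perturbed
       weights, after restoring the original weights."""
    params = [p for p in model.parameters() if p.requires_grad]

    # First pass: gradient at the current weights.
    base_optimizer.zero_grad()
    loss_fn(model(x), y).backward()
    grads = [torch.zeros_like(p) if p.grad is None else p.grad.detach().clone()
             for p in params]
    grad_norm = torch.sqrt(sum(g.pow(2).sum() for g in grads)).clamp_min(1e-12)

    # Ascent step to the approximate worst-case nearby weights.
    eps = [rho * g / grad_norm for g in grads]
    with torch.no_grad():
        for p, e in zip(params, eps):
            p.add_(e)

    # Second pass: gradient of the perturbed (sharpness-aware) loss.
    base_optimizer.zero_grad()
    loss_fn(model(x), y).backward()

    # Restore the original weights and take the actual update step.
    with torch.no_grad():
        for p, e in zip(params, eps):
            p.sub_(e)
    base_optimizer.step()
```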
no code implementations • ICLR 2021 • Chulhee Yun, Shankar Krishnan, Hossein Mobahi
For $L$-layer linear tensor networks that are orthogonally decomposable, we show that gradient flow on separable classification finds a stationary point of the $\ell_{2/L}$ max-margin problem in a "transformed" input space defined by the network.
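For reference, the $\ell_{2/L}$ max-margin problem mentioned here has, in the transformed input space, the usual separable-classification form (a sketch of the standard definition, not the paper's exact statement):

$$\min_{w} \; \|w\|_{2/L} \quad \text{s.t.} \quad y_i \,\langle w, \tilde{x}_i\rangle \ge 1 \;\; \text{for all } i,$$

where $\tilde{x}_i$ is the $i$-th training input after the network-dependent transformation and $y_i \in \{\pm 1\}$ its label.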
no code implementations • 5 Nov 2020 • Calvin Luo, Hossein Mobahi, Samy Bengio
The advantage of adversarial augmentation is that it replaces sampling with the use of a single, calculated perturbation that maximally increases the loss.
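A minimal sketch of that replacement under standard assumptions (an FGSM-style signed-gradient step; the step size and the use of the gradient sign are illustrative choices, not necessarily the paper's exact procedure):

```python
import torch

def adversarial_augment(model, loss_fn, x, y, step_size=0.03):
    """Replace random augmentation sampling with a single calculated perturbation:
    take one gradient-ascent step on the loss with respect to the input."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x_adv), y)
    grad, = torch.autograd.grad(loss, x_adv)
    # Move the input in the direction that maximally increases the loss.
    return (x_adv + step_size * grad.sign()).detach()
```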
no code implementations • 14 Dec 2020 • Yiding Jiang, Pierre Foret, Scott Yak, Daniel M. Roy, Hossein Mobahi, Gintare Karolina Dziugaite, Samy Bengio, Suriya Gunasekar, Isabelle Guyon, Behnam Neyshabur
Understanding generalization is arguably one of the most important open questions in deep learning.
1 code implementation • 18 Mar 2021 • Minyoung Huh, Hossein Mobahi, Richard Zhang, Brian Cheung, Pulkit Agrawal, Phillip Isola
We show empirically that our claim holds for finite-width linear and non-linear models under practical learning paradigms, and that on natural data these are often the solutions that generalize well.
no code implementations • 29 Sep 2021 • Yaodong Yu, Heinrich Jiang, Dara Bahri, Hossein Mobahi, Seungyeon Kim, Ankit Singh Rawat, Andreas Veit, Yi Ma
Concretely, we show that larger models and larger datasets need to be simultaneously leveraged to improve OOD performance.
no code implementations • ACL 2022 • Dara Bahri, Hossein Mobahi, Yi Tay
The allure of superhuman-level capabilities has led to considerable interest in language models like GPT-3 and T5, wherein the research has, by and large, revolved around new model architectures, training tasks, and loss objectives, along with substantial engineering efforts to scale up model capacity and dataset size.
no code implementations • NeurIPS 2023 • Vaishnavh Nagarajan, Aditya Krishna Menon, Srinadh Bhojanapalli, Hossein Mobahi, Sanjiv Kumar
Knowledge distillation (KD) has been widely used to improve the test accuracy of a "student" network, by training it to mimic the soft probabilities of a trained "teacher" network.
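A minimal sketch of that objective under common assumptions (temperature-scaled soft targets blended with the ordinary cross-entropy; the temperature and mixing weight are illustrative):

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Train the student to mimic the teacher's soft probabilities (KL term)
    while still fitting the hard labels (cross-entropy term)."""
    soft_teacher = F.softmax(teacher_logits / T, dim=-1)
    log_soft_student = F.log_softmax(student_logits / T, dim=-1)
    kd = F.kl_div(log_soft_student, soft_teacher, reduction="batchmean") * (T * T)
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1.0 - alpha) * ce
```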
no code implementations • 24 Oct 2023 • Katherine L. Hermann, Hossein Mobahi, Thomas Fel, Michael C. Mozer
Deep-learning models can extract a rich assortment of features from data.
no code implementations • 19 Jan 2024 • Yann N. Dauphin, Atish Agarwala, Hossein Mobahi
We find that regularizing feature exploitation but not feature exploration yields performance similar to gradient penalties.
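For context, a hedged sketch of the kind of gradient penalty being compared against, assuming a squared penalty on the norm of the parameter gradient of the training loss; the penalty weight is illustrative, and this is not claimed to be the paper's exact regularizer.

```python
import torch

def loss_with_gradient_penalty(model, loss_fn, x, y, lam=0.1):
    """Training loss plus a penalty on the squared norm of its parameter gradient."""
    loss = loss_fn(model(x), y)
    params = [p for p in model.parameters() if p.requires_grad]
    # create_graph=True so the penalty itself can be backpropagated through.
    grads = torch.autograd.grad(loss, params, create_graph=True)
    penalty = sum(g.pow(2).sum() for g in grads)
    return loss + lam * penalty
```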