Search Results for author: Yann Dauphin

Found 25 papers, 10 papers with code

A density estimation perspective on learning from pairwise human preferences

1 code implementation23 Nov 2023 Vincent Dumoulin, Daniel D. Johnson, Pablo Samuel Castro, Hugo Larochelle, Yann Dauphin

Learning from human feedback (LHF) -- and in particular learning from pairwise preferences -- has recently become a crucial ingredient in training large language models (LLMs), and has been the subject of much research.

Density Estimation
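
For context, the pairwise-preference setting this paper analyzes is usually formalized with the standard Bradley-Terry model; the sketch below shows that generic likelihood, not the specific density-estimation view developed in the paper, and the scalar rewards are placeholder inputs.

    import numpy as np

    def bradley_terry_nll(r_preferred, r_rejected):
        """Negative log-likelihood of pairwise preferences under the
        Bradley-Terry model: P(a preferred over b) = sigmoid(r(a) - r(b)).

        r_preferred, r_rejected: scalar rewards (or log-densities) assigned
        to the preferred and rejected responses in each pair.
        """
        logits = np.asarray(r_preferred) - np.asarray(r_rejected)
        # -log sigmoid(x) = log(1 + exp(-x)), computed stably
        return np.mean(np.logaddexp(0.0, -logits))

    # Toy usage: the first pair is scored consistently with the preference,
    # the second is not, so it contributes a larger loss.
    print(bradley_terry_nll([2.0, -1.0], [0.5, 1.0]))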

Tied-Augment: Controlling Representation Similarity Improves Data Augmentation

1 code implementation22 May 2023 Emirhan Kurtulus, Zichao Li, Yann Dauphin, Ekin Dogus Cubuk

For example, even the simple flips-and-crops augmentation requires training for more than 5 epochs to improve performance, whereas RandAugment requires more than 90 epochs.

Data Augmentation
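
As a rough illustration of the idea in the title (controlling representation similarity across augmented views), the sketch below adds a term tying the features of two augmentations to the usual supervised loss; the MSE tie, the single weight, and a `model` returning (logits, features) are assumptions, not the paper's exact recipe.

    import torch
    import torch.nn.functional as F

    def tied_augment_loss(model, x_view1, x_view2, labels, weight=1.0):
        """Supervised loss on two augmented views plus a term encouraging
        their intermediate representations to stay similar."""
        logits1, feats1 = model(x_view1)
        logits2, feats2 = model(x_view2)
        supervised = F.cross_entropy(logits1, labels) + F.cross_entropy(logits2, labels)
        tie = F.mse_loss(feats1, feats2)  # representation-similarity term
        return supervised + weight * tie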

No One Representation to Rule Them All: Overlapping Features of Training Methods

no code implementations ICLR 2022 Raphael Gontijo-Lopes, Yann Dauphin, Ekin D. Cubuk

Despite being able to capture a range of features of the data, high accuracy models trained with supervision tend to make similar predictions.

Contrastive Learning

Auxiliary Task Update Decomposition: The Good, The Bad and The Neutral

1 code implementation ICLR 2021 Lucio M. Dery, Yann Dauphin, David Grangier

In this case, careful consideration is needed to select tasks and model parameterizations such that updates from the auxiliary tasks actually help the primary task.

Image Classification
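
A minimal sketch of one way to split an auxiliary-task update relative to the primary-task gradient into helpful, harmful, and neutral components; the paper's actual decomposition may differ, and this is only meant to make the "good/bad/neutral" framing concrete (gradients are assumed already flattened to 1-D vectors).

    import torch

    def decompose_aux_grad(g_aux, g_primary, eps=1e-12):
        """Split g_aux into a component aligned with g_primary ('good'),
        a component anti-aligned with it ('bad'), and the orthogonal
        remainder ('neutral')."""
        coef = torch.dot(g_aux, g_primary) / (g_primary.norm() ** 2 + eps)
        parallel = coef * g_primary
        neutral = g_aux - parallel
        good = parallel if coef > 0 else torch.zeros_like(parallel)
        bad = parallel if coef < 0 else torch.zeros_like(parallel)
        return good, bad, neutral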

Deconstructing the Regularization of BatchNorm

no code implementations ICLR 2021 Yann Dauphin, Ekin Dogus Cubuk

Surprisingly, this simple mechanism matches the $0.8\%$ improvement of the more complex Dropout regularization for the state-of-the-art EfficientNet-B8 model on ImageNet.

Temperature check: theory and practice for training models with softmax-cross-entropy losses

no code implementations14 Oct 2020 Atish Agarwala, Jeffrey Pennington, Yann Dauphin, Sam Schoenholz

In this work we develop a theory of early learning for models trained with softmax-cross-entropy loss and show that the learning dynamics depend crucially on the inverse-temperature $\beta$ as well as the magnitude of the logits at initialization, $||\beta{\bf z}||_{2}$.

Sentiment Analysis
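
The quantity at stake is simply softmax-cross-entropy computed on temperature-scaled logits, so both $\beta$ and the logit scale at initialization enter the loss together; a minimal sketch (assuming a PyTorch setting, not code from the paper):

    import torch
    import torch.nn.functional as F

    def tempered_cross_entropy(logits, targets, beta=1.0):
        """Softmax-cross-entropy with an explicit inverse temperature beta:
        the loss is evaluated on beta * z."""
        return F.cross_entropy(beta * logits, targets)

    # Toy check: larger beta sharpens the softmax and changes the loss scale.
    z = torch.randn(8, 10)
    y = torch.randint(0, 10, (8,))
    print(tempered_cross_entropy(z, y, beta=0.5), tempered_cross_entropy(z, y, beta=4.0))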

Gradient Flow in Sparse Neural Networks and How Lottery Tickets Win

1 code implementation7 Oct 2020 Utku Evci, Yani A. Ioannou, Cem Keskin, Yann Dauphin

Sparse Neural Networks (NNs) can match the generalization of dense NNs using a fraction of the compute/storage for inference, and also have the potential to enable efficient training.

Robust and On-the-fly Dataset Denoising for Image Classification

no code implementations ECCV 2020 Jiaming Song, Lunjia Hu, Michael Auli, Yann Dauphin, Tengyu Ma

We address this problem by reasoning counterfactually about the loss distribution of examples with uniform random labels had they been trained with the real examples, and use this information to remove noisy examples from the training set.

Classification counterfactual +4
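
A hedged sketch of the counterfactual idea: estimate the loss distribution produced by uniformly random labels and flag training examples whose loss is consistent with that regime; the percentile threshold here is an illustrative choice, not the paper's criterion.

    import numpy as np

    def flag_noisy_examples(losses, random_label_losses, percentile=5.0):
        """Flag examples whose loss falls within the bulk of the loss
        distribution observed for uniformly random labels, treating them
        as likely mislabeled and removing them from training."""
        threshold = np.percentile(random_label_losses, percentile)
        return np.asarray(losses) >= threshold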

What Do Compressed Deep Neural Networks Forget?

2 code implementations13 Nov 2019 Sara Hooker, Aaron Courville, Gregory Clark, Yann Dauphin, Andrea Frome

However, this measure of performance conceals significant differences in how different classes and images are impacted by model compression techniques.

Fairness Interpretability Techniques for Deep Learning +4
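
One simple way to surface the differences the snippet refers to is to compare per-class accuracy of the dense and compressed models instead of aggregate top-1; the helper below is an illustrative sketch, not the paper's exact metric.

    import numpy as np

    def per_class_accuracy_gap(labels, preds_dense, preds_compressed, num_classes):
        """Per-class accuracy of the dense model minus the compressed model;
        large positive entries mark classes disproportionately hurt by
        compression."""
        labels = np.asarray(labels)
        gaps = []
        for c in range(num_classes):
            mask = labels == c
            if mask.sum() == 0:
                gaps.append(0.0)
                continue
            acc_dense = np.mean(np.asarray(preds_dense)[mask] == c)
            acc_comp = np.mean(np.asarray(preds_compressed)[mask] == c)
            gaps.append(acc_dense - acc_comp)
        return np.array(gaps)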

Selective Brain Damage: Measuring the Disparate Impact of Model Pruning

no code implementations25 Sep 2019 Sara Hooker, Yann Dauphin, Aaron Courville, Andrea Frome

Neural network pruning techniques have demonstrated it is possible to remove the majority of weights in a network with surprisingly little degradation to top-1 test set accuracy.

Network Pruning

Better Generalization with On-the-fly Dataset Denoising

no code implementations ICLR 2019 Jiaming Song, Tengyu Ma, Michael Auli, Yann Dauphin

Memorization in over-parameterized neural networks can severely hurt generalization in the presence of mislabeled examples.

Denoising Memorization

Strategies for Structuring Story Generation

no code implementations ACL 2019 Angela Fan, Mike Lewis, Yann Dauphin

Writers generally rely on plans or sketches to write long stories, but most current language models generate word by word from left to right.

Story Generation

Hierarchical Neural Story Generation

7 code implementations ACL 2018 Angela Fan, Mike Lewis, Yann Dauphin

We explore story generation: creative systems that can build coherent and fluent passages of text about a topic.

Story Generation

Deal or No Deal? End-to-End Learning of Negotiation Dialogues

no code implementations EMNLP 2017 Mike Lewis, Denis Yarats, Yann Dauphin, Devi Parikh, Dhruv Batra

Much of human dialogue occurs in semi-cooperative settings, where agents with different goals attempt to agree on common decisions.

Empirical Analysis of the Hessian of Over-Parametrized Neural Networks

no code implementations ICLR 2018 Levent Sagun, Utku Evci, V. Ugur Guney, Yann Dauphin, Leon Bottou

In particular, we present a case that links the two observations: small and large batch gradient descent appear to converge to different basins of attraction but we show that they are in fact connected through their flat region and so belong to the same basin.
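
Empirical analyses of this kind typically probe the spectrum through Hessian-vector products computed by double backpropagation rather than by forming the Hessian; the sketch below shows that standard tool (assuming PyTorch and a flattened direction vector), not the paper's experimental code.

    import torch

    def hvp(loss, params, vec):
        """Hessian-vector product via double backprop: differentiate the
        dot product of the (flattened) gradient with vec."""
        grads = torch.autograd.grad(loss, params, create_graph=True)
        flat_grad = torch.cat([g.reshape(-1) for g in grads])
        grad_vec = torch.dot(flat_grad, vec)
        hvps = torch.autograd.grad(grad_vec, params, retain_graph=True)
        return torch.cat([h.reshape(-1) for h in hvps])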

Tackling Over-pruning in Variational Autoencoders

no code implementations9 Jun 2017 Serena Yeung, Anitha Kannan, Yann Dauphin, Li Fei-Fei

The so-called epitomes of this model are groups of mutually exclusive latent factors that compete to explain the data.

Parseval Networks: Improving Robustness to Adversarial Examples

1 code implementation ICML 2017 Moustapha Cisse, Piotr Bojanowski, Edouard Grave, Yann Dauphin, Nicolas Usunier

We introduce Parseval networks, a form of deep neural networks in which the Lipschitz constant of linear, convolutional and aggregation layers is constrained to be smaller than 1.
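
The constraint is roughly maintained by a "Parseval tightness" step that keeps weight matrices close to orthonormal after each gradient update, which bounds the layer's Lipschitz constant near 1; a minimal sketch (the beta value is illustrative):

    import torch

    def parseval_retraction(W, beta=0.0003):
        """One retraction step pulling a weight matrix toward the manifold
        of (row-)orthonormal matrices, applied after a gradient update."""
        with torch.no_grad():
            W.copy_((1 + beta) * W - beta * W @ W.t() @ W)
        return W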

Identifying and attacking the saddle point problem in high-dimensional non-convex optimization

4 code implementations NeurIPS 2014 Yann Dauphin, Razvan Pascanu, Caglar Gulcehre, Kyunghyun Cho, Surya Ganguli, Yoshua Bengio

Gradient descent or quasi-Newton methods are almost ubiquitously used to perform such minimizations, and it is often thought that a main source of difficulty for these local methods to find the global minimum is the proliferation of local minima with much higher error than the global minimum.
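
The remedy this line of work motivates is a saddle-free Newton-style update that rescales the gradient by the inverse of |H|, the Hessian with its eigenvalues replaced by their absolute values, so negative-curvature directions are descended rather than attracting the iterate; the dense eigendecomposition below is for illustration only, since a practical version would approximate it in a low-dimensional subspace.

    import numpy as np

    def saddle_free_newton_step(grad, hessian, damping=1e-3):
        """Compute -|H|^{-1} grad using an explicit eigendecomposition."""
        eigvals, eigvecs = np.linalg.eigh(hessian)
        abs_inv = 1.0 / (np.abs(eigvals) + damping)
        return -(eigvecs * abs_inv) @ (eigvecs.T @ grad)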

Stochastic Ratio Matching of RBMs for Sparse High-Dimensional Inputs

no code implementations NeurIPS 2013 Yann Dauphin, Yoshua Bengio

Sparse high-dimensional data vectors are common in many application domains where a very large number of rarely non-zero features can be devised.

Text Classification +1

Better Mixing via Deep Representations

no code implementations18 Jul 2012 Yoshua Bengio, Grégoire Mesnil, Yann Dauphin, Salah Rifai

It has previously been hypothesized, and supported with some experimental evidence, that deeper representations, when well trained, tend to do a better job at disentangling the underlying factors of variation.
