Search Results for author: David Grangier

Found 53 papers, 24 papers with code

Efficient softmax approximation for GPUs

12 code implementations • ICML 2017 • Edouard Grave, Armand Joulin, Moustapha Cissé, David Grangier, Hervé Jégou

We propose an approximate strategy to efficiently train neural network based language models over very large vocabularies.
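
This approach is what PyTorch ships as its built-in adaptive softmax. A minimal usage sketch (the cutoff values below are illustrative, not the paper's exact configuration):

```python
import torch
import torch.nn as nn

# PyTorch's built-in adaptive softmax implements the approach of this paper;
# the cutoffs below are illustrative frequency-based vocabulary clusters.
hidden_dim, vocab_size = 512, 100_000
adaptive = nn.AdaptiveLogSoftmaxWithLoss(
    in_features=hidden_dim,
    n_classes=vocab_size,
    cutoffs=[2_000, 10_000, 50_000],  # frequent head cluster + three rarer tails
)

hidden = torch.randn(32, hidden_dim)           # decoder states for 32 positions
targets = torch.randint(0, vocab_size, (32,))  # gold next-token ids
print(adaptive(hidden, targets).loss)          # average negative log-likelihood
```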

Toward Better Storylines with Sentence-Level Language Models

1 code implementation • ACL 2020 • Daphne Ippolito, David Grangier, Douglas Eck, Chris Callison-Burch

We propose a sentence-level language model which selects the next sentence in a story from a finite set of fluent alternatives.

Language Modelling • Sentence +2
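
The selection step can be pictured as scoring a context embedding against each candidate's sentence embedding; the sketch below assumes pre-computed embeddings and is not the paper's actual model:

```python
import torch

# Hedged sketch: choose the next sentence from a finite set of fluent alternatives
# by scoring candidate embeddings against a context embedding (hypothetical setup).
def pick_next_sentence(context_emb, candidate_embs):
    # context_emb: (D,); candidate_embs: (K, D), one row per alternative
    scores = candidate_embs @ context_emb
    return int(torch.argmax(scores))

context = torch.randn(256)
candidates = torch.randn(10, 256)  # 10 fluent alternatives
print(pick_next_sentence(context, candidates))
```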

Contrastive Learning of General-Purpose Audio Representations

2 code implementations • 21 Oct 2020 • Aaqib Saeed, David Grangier, Neil Zeghidour

We introduce COLA, a self-supervised pre-training approach for learning a general-purpose representation of audio.

CoLA • Contrastive Learning +2
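
COLA treats two segments of the same audio clip as positives and other clips in the batch as negatives. A minimal sketch of such a contrastive objective (bilinear similarity with in-batch negatives; simplified, not the released code):

```python
import torch
import torch.nn.functional as F

# Hedged sketch of a COLA-style objective: segments from the same clip are
# positives, other clips in the batch serve as negatives (simplified).
def contrastive_loss(anchors, positives, bilinear_w):
    # anchors, positives: (B, D) embeddings of two segments per clip
    sim = anchors @ bilinear_w @ positives.t()  # (B, B) similarity matrix
    labels = torch.arange(anchors.size(0))      # diagonal entries are positives
    return F.cross_entropy(sim, labels)

B, D = 16, 128
loss = contrastive_loss(torch.randn(B, D), torch.randn(B, D), torch.randn(D, D))
```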

Scaling Neural Machine Translation

5 code implementations • WS 2018 • Myle Ott, Sergey Edunov, David Grangier, Michael Auli

Sequence to sequence learning models still require several days to reach state of the art performance on large benchmark datasets using a single machine.

Machine Translation • Question Answering +1

Understanding Back-Translation at Scale

3 code implementations • EMNLP 2018 • Sergey Edunov, Myle Ott, Michael Auli, David Grangier

An effective method to improve neural machine translation with monolingual data is to augment the parallel training corpus with back-translations of target language sentences.

Ranked #2 on Machine Translation on WMT2014 English-German (using extra training data)

Machine Translation • Translation
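
In outline, back-translation runs a target-to-source model over monolingual target text and pairs each sentence with its synthetic source. A hedged sketch of the augmentation step (the backward model here is a stand-in, not the paper's pipeline):

```python
# Hedged sketch of back-translation augmentation; `backward_translate` stands in
# for any trained target->source model (e.g., with sampling or noised beam search).
def augment_with_back_translation(parallel_pairs, monolingual_target, backward_translate):
    synthetic = [(backward_translate(tgt), tgt) for tgt in monolingual_target]
    return parallel_pairs + synthetic  # train the forward model on the union

# Example with a dummy backward model:
pairs = [("ein Haus", "a house")]
mono = ["a large house", "the garden"]
print(augment_with_back_translation(pairs, mono, lambda t: f"<synthetic-de for: {t}>"))
```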

fairseq: A Fast, Extensible Toolkit for Sequence Modeling

6 code implementations • NAACL 2019 • Myle Ott, Sergey Edunov, Alexei Baevski, Angela Fan, Sam Gross, Nathan Ng, David Grangier, Michael Auli

fairseq is an open-source sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling, and other text generation tasks.

Language Modelling • Text Generation +1
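
fairseq exposes released models through a torch.hub interface; a usage sketch along the lines of the project README (treat the exact model identifier and options as assumptions to verify against the repository):

```python
import torch

# Sketch based on fairseq's torch.hub interface; the identifier follows the
# project README for the released WMT'19 checkpoints (verify before relying on it).
en2de = torch.hub.load(
    'pytorch/fairseq',
    'transformer.wmt19.en-de.single_model',
    tokenizer='moses',
    bpe='fastbpe',
)
print(en2de.translate('Machine learning is great!'))
```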

Classical Structured Prediction Losses for Sequence to Sequence Learning

1 code implementation • NAACL 2018 • Sergey Edunov, Myle Ott, Michael Auli, David Grangier, Marc'Aurelio Ranzato

There has been much recent work on training neural attention models at the sequence-level using either reinforcement learning-style methods or by optimizing the beam.

Abstractive Text Summarization • Machine Translation +3

Modeling Human Motion with Quaternion-based Neural Networks

1 code implementation • 21 Jan 2019 • Dario Pavllo, Christoph Feichtenhofer, Michael Auli, David Grangier

Previous work on predicting or generating 3D human pose sequences regresses either joint rotations or joint positions.

Experts, Errors, and Context: A Large-Scale Study of Human Evaluation for Machine Translation

3 code implementations • 29 Apr 2021 • Markus Freitag, George Foster, David Grangier, Viresh Ratnakar, Qijun Tan, Wolfgang Macherey

Human evaluation of modern high-quality machine translation systems is a difficult problem, and there is increasing evidence that inadequate evaluation procedures can lead to erroneous conclusions.

Machine Translation • Translation

Efficient Content-Based Sparse Attention with Routing Transformers

2 code implementations • 12 Mar 2020 • Aurko Roy, Mohammad Saffar, Ashish Vaswani, David Grangier

This work builds upon two lines of research: it combines the modeling flexibility of prior work on content-based sparse attention with the efficiency gains from approaches based on local, temporal sparse attention.

Ranked #5 on Image Generation on ImageNet 64x64 (Bits per dim metric)

Image Generation • Language Modelling

ELI5: Long Form Question Answering

3 code implementations • ACL 2019 • Angela Fan, Yacine Jernite, Ethan Perez, David Grangier, Jason Weston, Michael Auli

We introduce the first large-scale corpus for long-form question answering, a task requiring elaborate and in-depth answers to open-ended questions.

Language Modelling • Long Form Question Answering +2

Learning strides in convolutional neural networks

1 code implementation • ICLR 2022 • Rachid Riad, Olivier Teboul, David Grangier, Neil Zeghidour

In particular, we show that introducing our layer into a ResNet-18 architecture allows keeping consistent high performance on CIFAR10, CIFAR100 and ImageNet even when training starts from poor random stride configurations.

Image Classification

Strategies for Training Large Vocabulary Neural Language Models

2 code implementations • ACL 2016 • Welin Chen, David Grangier, Michael Auli

Training neural network language models over large vocabularies is still computationally very costly compared to count-based models such as Kneser-Ney.

Machine Translation • speech-recognition +2

Analyzing Uncertainty in Neural Machine Translation

1 code implementation • ICML 2018 • Myle Ott, Michael Auli, David Grangier, Marc'Aurelio Ranzato

We propose tools and metrics to assess how uncertainty in the data is captured by the model distribution and how it affects search strategies that generate translations.

Machine Translation • Sentence +2

BLEU might be Guilty but References are not Innocent

2 code implementations • EMNLP 2020 • Markus Freitag, David Grangier, Isaac Caswell

The quality of automatic metrics for machine translation has been increasingly called into question, especially for high-quality systems.

Machine Translation • Translation

Human-Paraphrased References Improve Neural Machine Translation

1 code implementation • WMT (EMNLP) 2020 • Markus Freitag, George Foster, David Grangier, Colin Cherry

When used in place of original references, the paraphrased versions produce metric scores that correlate better with human judgment.

Machine Translation • NMT +1

Auxiliary Task Update Decomposition: The Good, The Bad and The Neutral

1 code implementation • ICLR 2021 • Lucio M. Dery, Yann Dauphin, David Grangier

In this case, careful consideration is needed to select tasks and model parameterizations such that updates from the auxiliary tasks actually help the primary task.

Image Classification

Controllable Abstractive Summarization

no code implementations • WS 2018 • Angela Fan, David Grangier, Michael Auli

Current models for document summarization disregard user preferences such as the desired length, style, the entities that the user might be interested in, or how much of the document the user has already read.

Abstractive Text Summarization • Document Summarization

Iterative Refinement for Machine Translation

no code implementations • 20 Oct 2016 • Roman Novak, Michael Auli, David Grangier

Existing machine translation decoding algorithms generate translations in a strictly monotonic fashion and never revisit previous decisions.

Machine Translation • Sentence +1

Vocabulary Selection Strategies for Neural Machine Translation

no code implementations • 1 Oct 2016 • Gurvan L'Hostis, David Grangier, Michael Auli

Classical translation models constrain the space of possible outputs by selecting a subset of translation rules based on the input sentence.

Machine Translation • Sentence +1

Predicting distributions with Linearizing Belief Networks

no code implementations • 17 Nov 2015 • Yann N. Dauphin, David Grangier

Contrary to a classical neural network, a belief network can predict more than the expected value of the output $Y$ given the input $X$.

Facial expression generation • Image Denoising +1

Label Embedding Trees for Large Multi-Class Tasks

no code implementations • NeurIPS 2010 • Samy Bengio, Jason Weston, David Grangier

Multi-class classification becomes challenging at test time when the number of classes is very large and testing against every possible class can become computationally infeasible.

General Classification • Multi-class Classification

Feature Set Embedding for Incomplete Data

no code implementations • NeurIPS 2010 • David Grangier, Iain Melvin

Our proposal maps (feature, value) pairs into an embedding space and then non-linearly combines the set of embedded vectors.

Classification • General Classification
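
The abstract names the core operation directly: embed each observed (feature, value) pair, then combine the set. A minimal sketch of that mapping (dimensions and the pooling choice are assumptions):

```python
import torch
import torch.nn as nn

# Hedged sketch of feature-set embedding for incomplete data: embed each observed
# (feature, value) pair, then combine the set (mean pooling here is an assumption).
class FeatureSetEmbedding(nn.Module):
    def __init__(self, num_features, dim=64):
        super().__init__()
        self.feature_emb = nn.Embedding(num_features, dim)
        self.value_proj = nn.Linear(1, dim)
        self.combine = nn.Sequential(nn.Linear(dim, dim), nn.Tanh())

    def forward(self, feature_ids, values):
        # feature_ids: (N,) indices of observed features; values: (N,) their values
        pairs = self.feature_emb(feature_ids) + self.value_proj(values.unsqueeze(-1))
        return self.combine(pairs).mean(dim=0)  # one vector per example, any N

model = FeatureSetEmbedding(num_features=100)
rep = model(torch.tensor([3, 17, 42]), torch.tensor([0.5, -1.2, 3.3]))
```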

Polynomial Semantic Indexing

no code implementations • NeurIPS 2009 • Bing Bai, Jason Weston, David Grangier, Ronan Collobert, Kunihiko Sadamasa, Yanjun Qi, Corinna Cortes, Mehryar Mohri

We present a class of nonlinear (polynomial) models that are discriminatively trained to directly map from the word content in a query-document or document-document pair to a ranking score.

Retrieval

Unsupervised Paraphrasing without Translation

no code implementations • ACL 2019 • Aurko Roy, David Grangier

We compare with MT-based approaches on paraphrase identification, generation, and training augmentation.

Machine Translation • Paraphrase Identification +1

Tagged Back-Translation

no code implementations • WS 2019 • Isaac Caswell, Ciprian Chelba, David Grangier

Recent work in Neural Machine Translation (NMT) has shown significant quality gains from noised-beam decoding during back-translation, a method to generate synthetic parallel data.

Machine Translation • NMT +1

Translationese as a Language in "Multilingual" NMT

no code implementations • 10 Nov 2019 • Parker Riley, Isaac Caswell, Markus Freitag, David Grangier

Machine translation has an undesirable propensity to produce "translationese" artifacts, which can lead to higher BLEU scores while being liked less by human raters.

Machine Translation • NMT +3

Wavesplit: End-to-End Speech Separation by Speaker Clustering

no code implementations • 20 Feb 2020 • Neil Zeghidour, David Grangier

Wavesplit infers a set of source representations via clustering, which addresses the fundamental permutation problem of separation.

Clustering • Data Augmentation +1

DIVE: End-to-end Speech Diarization via Iterative Speaker Embedding

no code implementations • 28 May 2021 • Neil Zeghidour, Olivier Teboul, David Grangier

Our neural algorithm presents the diarization task as an iterative process: it repeatedly builds a representation for each speaker before predicting the voice activity of each speaker conditioned on the extracted representations.

speaker-diarization • Speaker Diarization
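
The abstract outlines the loop: alternately refine one embedding per speaker, then predict each speaker's voice activity conditioned on those embeddings. A schematic sketch of such a loop (all module choices and shapes are assumptions, not the paper's architecture):

```python
import torch
import torch.nn as nn

# Schematic sketch of an iterative diarization loop in the spirit of the abstract:
# refine per-speaker embeddings, then predict per-speaker voice activity.
# Module internals are placeholders, not the paper's architecture.
class IterativeDiarizer(nn.Module):
    def __init__(self, dim=128, num_speakers=2, steps=3):
        super().__init__()
        self.steps = steps
        self.speakers = nn.Parameter(torch.randn(num_speakers, dim))  # initial embeddings
        self.refine = nn.GRUCell(dim, dim)   # updates each speaker embedding
        self.vad = nn.Bilinear(dim, dim, 1)  # frame-vs-speaker activity score

    def forward(self, frames):               # frames: (T, dim) audio features
        spk = self.speakers
        for _ in range(self.steps):          # iterative speaker refinement
            context = frames.mean(dim=0, keepdim=True).expand_as(spk)
            spk = self.refine(context, spk)
        T, S = frames.size(0), spk.size(0)
        # voice activity of each speaker at each frame, conditioned on embeddings
        act = self.vad(frames.repeat_interleave(S, 0), spk.repeat(T, 1))
        return torch.sigmoid(act.view(T, S))

probs = IterativeDiarizer()(torch.randn(200, 128))  # (frames, speakers) activity
```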

On the Complementarity of Data Selection and Fine Tuning for Domain Adaptation

no code implementations • 15 Sep 2021 • Dan Iter, David Grangier

Domain adaptation of neural networks commonly relies on three training phases: pretraining, selected data training and then fine tuning.

Domain Generalization • Language Modelling +2

High Quality Rather than High Model Probability: Minimum Bayes Risk Decoding with Neural Metrics

no code implementations • 17 Nov 2021 • Markus Freitag, David Grangier, Qijun Tan, Bowen Liang

In Neural Machine Translation, it is typically assumed that the sentence with the highest estimated probability should also be the translation with the highest quality as measured by humans.

Machine Translation • Sentence +2
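
Minimum Bayes Risk decoding makes the alternative concrete: rather than taking the highest-probability candidate, it selects the one with the highest expected utility over the other candidates. A minimal sketch with a toy utility (the paper uses neural metrics such as BLEURT instead):

```python
# Hedged sketch of Minimum Bayes Risk decoding: score each candidate by its average
# utility against the other candidates (acting as pseudo-references), pick the best.
def mbr_decode(candidates, utility):
    def expected_utility(hyp):
        others = [c for c in candidates if c is not hyp]
        return sum(utility(hyp, ref) for ref in others) / len(others)
    return max(candidates, key=expected_utility)

# Toy utility: word overlap (a stand-in for the neural metrics used in the paper).
def overlap(h, r):
    hw, rw = set(h.split()), set(r.split())
    return len(hw & rw) / max(len(hw | rw), 1)

print(mbr_decode(["the cat sat", "a cat sat down", "the cat sat down"], overlap))
```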

A Natural Diet: Towards Improving Naturalness of Machine Translation Output

no code implementations • Findings (ACL) 2022 • Markus Freitag, David Vilar, David Grangier, Colin Cherry, George Foster

In this work we propose a method for training MT systems to achieve a more natural style, i.e., mirroring the style of text originally written in the target language.

Machine Translation • Sentence +1

High-Resource Methodological Bias in Low-Resource Investigations

no code implementations • 14 Nov 2022 • Maartje ter Hoeve, David Grangier, Natalie Schluter

The central bottleneck for low-resource NLP is typically regarded to be the quantity of accessible data, overlooking the contribution of data quality.

Machine Translation • POS +3

Transfer Learning for Structured Pruning under Limited Task Data

no code implementations • 10 Nov 2023 • Lucio Dery, David Grangier, Awni Hannun

We propose a framework which combines structured pruning with transfer learning to reduce the need for task-specific data.

Transfer Learning

Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling

no code implementations • 29 Jan 2024 • Pratyush Maini, Skyler Seto, He Bai, David Grangier, Yizhe Zhang, Navdeep Jaitly

Large language models are trained on massive scrapes of the web, which are often unstructured, noisy, and poorly phrased.

Language Modelling

Specialized Language Models with Cheap Inference from Limited Domain Data

no code implementations • 2 Feb 2024 • David Grangier, Angelos Katharopoulos, Pierre Ablin, Awni Hannun

Large language models have emerged as a versatile tool but are challenging to apply to tasks lacking large inference budgets and large in-domain training sets.

Ranked #1 on Language Modelling on The Pile (Test perplexity metric)

Language Modelling
