Search Results for author: Minh-Thang Luong

Found 32 papers, 16 papers with code

STraTA: Self-Training with Task Augmentation for Better Few-shot Learning

1 code implementation 13 Sep 2021 Tu Vu, Minh-Thang Luong, Quoc V. Le, Grady Simon, Mohit Iyyer

Despite their recent successes in tackling many NLP tasks, large-scale pre-trained language models do not perform as well in few-shot settings where only a handful of training examples are available.

Few-Shot Learning Few-Shot NLI

Towards Domain-Agnostic Contrastive Learning

no code implementations 9 Nov 2020 Vikas Verma, Minh-Thang Luong, Kenji Kawaguchi, Hieu Pham, Quoc V. Le

Despite recent success, most contrastive self-supervised learning methods are domain-specific, relying heavily on data augmentation techniques that require knowledge about a particular domain, such as image cropping and rotation.

Contrastive Learning Data Augmentation +3

ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators

13 code implementations ICLR 2020 Kevin Clark, Minh-Thang Luong, Quoc V. Le, Christopher D. Manning

Then, instead of training a model that predicts the original identities of the corrupted tokens, we train a discriminative model that predicts whether each token in the corrupted input was replaced by a generator sample or not.

Language Modelling Natural Language Understanding +2
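The replaced-token-detection objective described above can be sketched as follows. This is a minimal illustration: uniform random tokens stand in for ELECTRA's small masked-LM generator, and `corrupt` / `discriminator_loss` are hypothetical names, not the paper's code.

```python
import numpy as np

rng = np.random.default_rng(0)

def corrupt(tokens, vocab_size=100, mask_prob=0.15):
    """Corrupt a token sequence by swapping a random subset for
    'generator samples' (here: random tokens standing in for the
    small masked-LM generator)."""
    tokens = np.asarray(tokens)
    swap = rng.random(tokens.shape) < mask_prob
    samples = rng.integers(0, vocab_size, size=tokens.shape)
    corrupted = np.where(swap, samples, tokens)
    # a sample that happens to equal the original token counts as "original"
    labels = (corrupted != tokens).astype(int)  # 1 = replaced, 0 = original
    return corrupted, labels

def discriminator_loss(logits, labels):
    """Binary cross-entropy over EVERY position, not just the ~15%
    corrupted ones -- the source of ELECTRA's sample efficiency."""
    probs = 1.0 / (1.0 + np.exp(-logits))
    return -np.mean(labels * np.log(probs + 1e-9)
                    + (1 - labels) * np.log(1 - probs + 1e-9))

corrupted, labels = corrupt(list(range(20)))
loss = discriminator_loss(np.zeros(20), labels)
```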

Meta Pseudo Labels

7 code implementations CVPR 2021 Hieu Pham, Zihang Dai, Qizhe Xie, Minh-Thang Luong, Quoc V. Le

We present Meta Pseudo Labels, a semi-supervised learning method that achieves a new state-of-the-art top-1 accuracy of 90.2% on ImageNet, which is 1.6% better than the existing state-of-the-art.

Meta-Learning Semi-Supervised Image Classification

Towards a Human-like Open-Domain Chatbot

2 code implementations 27 Jan 2020 Daniel Adiwardana, Minh-Thang Luong, David R. So, Jamie Hall, Noah Fiedel, Romal Thoppilan, Zi Yang, Apoorv Kulshreshtha, Gaurav Nemade, Yifeng Lu, Quoc V. Le

We present Meena, a multi-turn open-domain chatbot trained end-to-end on data mined and filtered from public domain social media conversations.


A Hybrid Morpheme-Word Representation for Machine Translation of Morphologically Rich Languages

no code implementations 19 Nov 2019 Minh-Thang Luong, Preslav Nakov, Min-Yen Kan

We propose a language-independent approach for improving statistical machine translation for morphologically rich languages using a hybrid morpheme-word representation where the basic unit of translation is the morpheme, but word boundaries are respected at all stages of the translation process.

Machine Translation
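The hybrid representation above can be sketched as a reversible encoding: the translation unit is the morpheme, but word boundaries stay recoverable. The toy segmenter and the '+' joiner convention are illustrative assumptions, not the paper's exact scheme.

```python
def to_morpheme_units(sentence, segment):
    """Split each word into morpheme units; non-final morphemes get a
    '+' joiner so word boundaries are preserved through translation."""
    units = []
    for word in sentence.split():
        morphemes = segment(word)
        for i, m in enumerate(morphemes):
            units.append(m + "+" if i < len(morphemes) - 1 else m)
    return units

def to_words(units):
    """Invert the representation: glue units until one without '+'."""
    words, current = [], ""
    for u in units:
        if u.endswith("+"):
            current += u[:-1]
        else:
            words.append(current + u)
            current = ""
    return words

# toy segmenter: split off a final 's' as a plural morpheme
segment = lambda w: [w[:-1], "s"] if w.endswith("s") and len(w) > 2 else [w]
units = to_morpheme_units("the cats sleep", segment)
```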

Self-training with Noisy Student improves ImageNet classification

10 code implementations CVPR 2020 Qizhe Xie, Minh-Thang Luong, Eduard Hovy, Quoc V. Le

During the learning of the student, we inject noise such as dropout, stochastic depth, and data augmentation via RandAugment to the student so that the student generalizes better than the teacher.

Ranked #7 on Image Classification on ImageNet ReaL (using extra training data)

Classification Data Augmentation +2
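The teacher/student split described above can be sketched in a few lines: the teacher pseudo-labels unlabeled images without noise, while the student trains on noised inputs. Dropout stands in here for the full set of noise sources (the paper also uses stochastic depth and RandAugment); function names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def pseudo_label(teacher_logits):
    """The teacher labels unlabeled images cleanly (no noise at inference)."""
    return teacher_logits.argmax(axis=1)

def dropout(x, rate=0.5):
    """One of the noise sources injected only into the student, so the
    student must generalize beyond the teacher's clean predictions."""
    keep = rng.random(x.shape) >= rate
    return np.where(keep, x / (1.0 - rate), 0.0)

teacher_logits = rng.normal(size=(8, 10))        # toy logits, 8 unlabeled images
targets = pseudo_label(teacher_logits)           # hard pseudo-labels
noisy_input = dropout(rng.normal(size=(8, 32)))  # the student sees noised input
```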

Findings of the Third Workshop on Neural Generation and Translation

no code implementations WS 2019 Hiroaki Hayashi, Yusuke Oda, Alexandra Birch, Ioannis Konstas, Andrew Finch, Minh-Thang Luong, Graham Neubig, Katsuhito Sudoh

This document describes the findings of the Third Workshop on Neural Generation and Translation, held in concert with the annual conference of the Empirical Methods in Natural Language Processing (EMNLP 2019).

Document-level Machine Translation

Selfie: Self-supervised Pretraining for Image Embedding

1 code implementation 7 Jun 2019 Trieu H. Trinh, Minh-Thang Luong, Quoc V. Le

Notably, on ImageNet 224 x 224 with 60 examples per class (5%), our method improves the mean accuracy of ResNet-50 from 35.6% to 46.7%, an improvement of 11.1 points in absolute accuracy.

Language Modelling

Unsupervised Data Augmentation for Consistency Training

16 code implementations NeurIPS 2020 Qizhe Xie, Zihang Dai, Eduard Hovy, Minh-Thang Luong, Quoc V. Le

In this work, we present a new perspective on how to effectively noise unlabeled examples and argue that the quality of noising, specifically those produced by advanced data augmentation methods, plays a crucial role in semi-supervised learning.

Image Augmentation Semi-Supervised Image Classification +2
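The consistency-training idea above can be sketched as follows: the model's prediction on an augmented unlabeled example is pulled toward its prediction on the clean example. The toy classifier and Gaussian-noise augmentation are stand-ins for the advanced augmentations (e.g. RandAugment) the paper advocates.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def consistency_loss(predict, x_unlabeled, augment):
    """KL(p_clean || p_aug) averaged over the unlabeled batch; the clean
    prediction is the fixed target (gradients would flow only through
    the augmented branch in a real training loop)."""
    p_clean = predict(x_unlabeled)
    p_aug = predict(augment(x_unlabeled))
    return np.mean(np.sum(p_clean * np.log(p_clean / p_aug), axis=-1))

W = rng.normal(size=(16, 4))
predict = lambda x: softmax(x @ W)                       # toy classifier
augment = lambda x: x + 0.1 * rng.normal(size=x.shape)   # stand-in augmentation
x = rng.normal(size=(32, 16))
loss = consistency_loss(predict, x, augment)
identity_loss = consistency_loss(predict, x, lambda x: x)
```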

Semi-Supervised Sequence Modeling with Cross-View Training

2 code implementations EMNLP 2018 Kevin Clark, Minh-Thang Luong, Christopher D. Manning, Quoc V. Le

We therefore propose Cross-View Training (CVT), a semi-supervised learning algorithm that improves the representations of a Bi-LSTM sentence encoder using a mix of labeled and unlabeled data.

CCG Supertagging Dependency Parsing +5
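The unlabeled-data part of CVT can be sketched as a matching objective: auxiliary prediction modules, each seeing a restricted view of the input (e.g. only the forward LSTM states), are trained to agree with the full-view primary module. The KL form and toy tensors below are a minimal sketch, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def cvt_unlabeled_loss(primary_probs, auxiliary_probs_list):
    """Each restricted-view module is pulled toward the primary module's
    prediction, which is treated as a fixed target on unlabeled data."""
    losses = []
    for q in auxiliary_probs_list:
        # KL(primary || auxiliary), summed over labels, averaged over batch
        losses.append(np.mean(np.sum(primary_probs * np.log(primary_probs / q),
                                     axis=-1)))
    return float(np.mean(losses))

primary = softmax(rng.normal(size=(8, 5)))   # full-view prediction
aux_fwd = softmax(rng.normal(size=(8, 5)))   # forward-only view (toy)
aux_bwd = softmax(rng.normal(size=(8, 5)))   # backward-only view (toy)
loss = cvt_unlabeled_loss(primary, [aux_fwd, aux_bwd])
```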

Latent Topic Conversational Models

no code implementations ICLR 2018 Tsung-Hsien Wen, Minh-Thang Luong

In this paper, we propose Latent Topic Conversational Model (LTCM) which augments seq2seq with a neural latent topic component to better guide response generation and make training easier.

Latent Variable Models

Findings of the Second Workshop on Neural Machine Translation and Generation

no code implementations WS 2018 Alexandra Birch, Andrew Finch, Minh-Thang Luong, Graham Neubig, Yusuke Oda

This document describes the findings of the Second Workshop on Neural Machine Translation and Generation, held in concert with the annual conference of the Association for Computational Linguistics (ACL 2018).

Data Augmentation Domain Adaptation +1


no code implementations ICLR 2018 Minh-Thang Luong, David Dohan, Adams Wei Yu, Quoc V. Le, Barret Zoph, Vijay Vasudevan

Neural architecture search (NAS), the task of finding neural architectures automatically, has recently emerged as a promising approach for unveiling better models over human-designed ones.

Language Modelling Neural Architecture Search +1

On the Effective Use of Pretraining for Natural Language Inference

no code implementations 5 Oct 2017 Ignacio Cases, Minh-Thang Luong, Christopher Potts

Neural networks have excelled at many NLP tasks, but there remain open questions about the performance of pretrained distributed word representations and their interaction with weight initialization and other hyperparameters.

Natural Language Inference

Efficient Attention using a Fixed-Size Memory Representation

no code implementations EMNLP 2017 Denny Britz, Melody Y. Guan, Minh-Thang Luong

The standard content-based attention mechanism typically used in sequence-to-sequence models is computationally expensive as it requires the comparison of large encoder and decoder states at each time step.

Online and Linear-Time Attention by Enforcing Monotonic Alignments

2 code implementations ICML 2017 Colin Raffel, Minh-Thang Luong, Peter J. Liu, Ron J. Weiss, Douglas Eck

Recurrent neural network models with an attention mechanism have proven to be extremely effective on a wide variety of sequence-to-sequence problems.

Machine Translation Sentence Summarization +1

Massive Exploration of Neural Machine Translation Architectures

11 code implementations EMNLP 2017 Denny Britz, Anna Goldie, Minh-Thang Luong, Quoc Le

Neural Machine Translation (NMT) has shown remarkable progress over the past few years with production systems now being deployed to end-users.

Machine Translation

Compression of Neural Machine Translation Models via Pruning

no code implementations CONLL 2016 Abigail See, Minh-Thang Luong, Christopher D. Manning

Neural Machine Translation (NMT), like many other deep learning domains, typically suffers from over-parameterization, resulting in large storage sizes.

Machine Translation

Achieving Open Vocabulary Neural Machine Translation with Hybrid Word-Character Models

3 code implementations ACL 2016 Minh-Thang Luong, Christopher D. Manning

We build hybrid systems that translate mostly at the word level and consult the character components for rare words.

Machine Translation
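The word/character split described above can be sketched as a fallback rule: translate at the word level when possible, and consult a character-level component only for rare words. The lookup table and `char_model` below are toy stand-ins for the trained models.

```python
def hybrid_translate(source_words, word_table, char_model):
    """Translate mostly at the word level; fall back to a character-level
    component for words outside the word vocabulary."""
    out = []
    for w in source_words:
        if w in word_table:
            out.append(word_table[w])   # fast word-level path
        else:
            out.append(char_model(w))   # rare word: consult char component
    return out

word_table = {"the": "le", "cat": "chat"}
char_model = lambda w: w.upper()        # toy: a real model reads characters
result = hybrid_translate(["the", "cat", "Luong"], word_table, char_model)
```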

Multi-task Sequence to Sequence Learning

no code implementations 19 Nov 2015 Minh-Thang Luong, Quoc V. Le, Ilya Sutskever, Oriol Vinyals, Lukasz Kaiser

This paper examines three multi-task learning (MTL) settings for sequence to sequence models: (a) the one-to-many setting - where the encoder is shared between several tasks such as machine translation and syntactic parsing, (b) the many-to-one setting - useful when only the decoder can be shared, as in the case of translation and image caption generation, and (c) the many-to-many setting - where multiple encoders and decoders are shared, which is the case with unsupervised objectives and translation.

Machine Translation Multi-Task Learning
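The one-to-many setting above (a shared encoder feeding one decoder per task) can be sketched structurally. The class and the string-based toy "tasks" are illustrative, not the paper's models.

```python
class OneToManySeq2Seq:
    """One-to-many MTL: a single shared encoder feeds one decoder per task
    (e.g. translation and syntactic parsing)."""

    def __init__(self, encoder, decoders):
        self.encoder = encoder    # parameters shared across all tasks
        self.decoders = decoders  # task name -> task-specific decoder

    def run(self, task, source):
        hidden = self.encoder(source)       # one encoding, reused by every task
        return self.decoders[task](hidden)

model = OneToManySeq2Seq(
    encoder=lambda s: s.lower(),
    decoders={"translate": lambda h: h[::-1],  # toy 'translation'
              "parse": lambda h: h.split()},   # toy 'parsing'
)
translation = model.run("translate", "Hello World")
parse = model.run("parse", "Hello World")
```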

Effective Approaches to Attention-based Neural Machine Translation

43 code implementations EMNLP 2015 Minh-Thang Luong, Hieu Pham, Christopher D. Manning

Our ensemble model using different attention architectures has established a new state-of-the-art result in the WMT'15 English to German translation task with 25.9 BLEU points, an improvement of 1.0 BLEU points over the existing best system backed by NMT and an n-gram reranker.

Ranked #1 on Machine Translation on 20NEWS (Accuracy metric)

Machine Translation
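One of the paper's attention variants, global attention with the "general" score score(h_t, h_s) = h_t^T W_a h_s, can be sketched as follows; the toy dimensions and random weights are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def global_attention(h_t, H_s, W_a):
    """Global attention: score every source state against the current
    decoder state, softmax-normalize, and return the weighted context."""
    scores = H_s @ (W_a @ h_t)      # one 'general' score per source position
    scores = scores - scores.max()  # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    context = weights @ H_s         # attention-weighted source summary
    return weights, context

S, d = 6, 8
H_s = rng.normal(size=(S, d))  # encoder hidden states
h_t = rng.normal(size=d)       # current decoder state
W_a = rng.normal(size=(d, d))
weights, context = global_attention(h_t, H_s, W_a)
```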

A Hierarchical Neural Autoencoder for Paragraphs and Documents

6 code implementations IJCNLP 2015 Jiwei Li, Minh-Thang Luong, Dan Jurafsky

Natural language generation of coherent long texts like paragraphs or longer documents is a challenging problem for recurrent networks models.

Text Generation

When Are Tree Structures Necessary for Deep Learning of Representations?

no code implementations EMNLP 2015 Jiwei Li, Minh-Thang Luong, Dan Jurafsky, Eduard Hovy

Recursive neural models, which use syntactic parse trees to recursively generate representations bottom-up, are a popular architecture.

Discourse Parsing Relation Extraction +1

Addressing the Rare Word Problem in Neural Machine Translation

5 code implementations IJCNLP 2015 Minh-Thang Luong, Ilya Sutskever, Quoc V. Le, Oriol Vinyals, Wojciech Zaremba

Our experiments on the WMT14 English to French translation task show that this method provides a substantial improvement of up to 2.8 BLEU points over an equivalent NMT system that does not use this technique.

Machine Translation Word Alignment
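The technique referenced above resolves rare words in a post-processing step: each `<unk>` in the output is replaced via its aligned source word. This is a hedged sketch of that idea; in the paper the alignments come from the trained NMT system, while here they are given directly.

```python
def replace_unks(target_tokens, alignment, source_tokens, dictionary):
    """Replace each <unk> output token using the aligned source word:
    dictionary translation if available, otherwise a direct copy."""
    out = []
    for i, tok in enumerate(target_tokens):
        if tok == "<unk>":
            src = source_tokens[alignment[i]]
            out.append(dictionary.get(src, src))  # translate or copy
        else:
            out.append(tok)
    return out

result = replace_unks(
    target_tokens=["le", "<unk>", "dort"],
    alignment={1: 1},  # target position 1 aligns to source position 1
    source_tokens=["the", "cat", "sleeps"],
    dictionary={"cat": "chat"},
)
```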

Parsing entire discourses as very long strings: Capturing topic continuity in grounded language learning

no code implementations TACL 2013 Minh-Thang Luong, Michael C. Frank, Mark Johnson

Grounded language learning, the task of mapping from natural language to a representation of meaning, has attracted more and more interest in recent years.

Grounded language learning
