1 code implementation • 11 Oct 2022 • Long Phan, Tai Dang, Hieu Tran, Trieu H. Trinh, Vy Phan, Lam D. Chau, Minh-Thang Luong
Biomedical data and benchmarks are highly valuable yet scarce in low-resource languages other than English, such as Vietnamese.
2 code implementations • 11 Oct 2022 • Chinh Ngo, Trieu H. Trinh, Long Phan, Hieu Tran, Tai Dang, Hieu Nguyen, Minh Nguyen, Minh-Thang Luong
We introduce MTet, the largest publicly available parallel corpus for English-Vietnamese translation.
Ranked #1 on Machine Translation on IWSLT2015 English-Vietnamese (using extra training data)
1 code implementation • Blog 2022 • Chinh Ngo, Hieu Tran, Long Phan, Trieu H. Trinh, Hieu Nguyen, Minh Nguyen, Minh-Thang Luong
We are excited to introduce a new, larger, and higher-quality machine translation dataset, MTet, which stands for Multi-domain Translation for English and VieTnamese.
no code implementations • 19 Nov 2021 • Hieu Pham, Zihang Dai, Golnaz Ghiasi, Kenji Kawaguchi, Hanxiao Liu, Adams Wei Yu, Jiahui Yu, Yi-Ting Chen, Minh-Thang Luong, Yonghui Wu, Mingxing Tan, Quoc V. Le
Second, while increasing the dataset size and the model size has been the de facto method to improve the performance of deep learning models like BASIC, the effect of a large contrastive batch size on such contrastive-trained image-text models is not well understood.
no code implementations • Findings (EMNLP) 2021 • Sneha Kudugunta, Yanping Huang, Ankur Bapna, Maxim Krikun, Dmitry Lepikhin, Minh-Thang Luong, Orhan Firat
On WMT, our task-MoE with 32 experts (533M parameters) outperforms the best performing token-level MoE model (token-MoE) by +1.0 BLEU on average across 30 language pairs.
1 code implementation • EMNLP 2021 • Tu Vu, Minh-Thang Luong, Quoc V. Le, Grady Simon, Mohit Iyyer
Despite their recent successes in tackling many NLP tasks, large-scale pre-trained language models do not perform as well in few-shot settings where only a handful of training examples are available.
Ranked #1 on Few-Shot NLI on SNLI (8 training examples per class)
1 code implementation • EMNLP 2020 • Kevin Clark, Minh-Thang Luong, Quoc V. Le, Christopher D. Manning
We introduce Electric, an energy-based cloze model for representation learning over text.
no code implementations • 9 Nov 2020 • Vikas Verma, Minh-Thang Luong, Kenji Kawaguchi, Hieu Pham, Quoc V. Le
Despite recent success, most contrastive self-supervised learning methods are domain-specific, relying heavily on data augmentation techniques that require knowledge about a particular domain, such as image cropping and rotation.
9 code implementations • CVPR 2021 • Hieu Pham, Zihang Dai, Qizhe Xie, Minh-Thang Luong, Quoc V. Le
We present Meta Pseudo Labels, a semi-supervised learning method that achieves a new state-of-the-art top-1 accuracy of 90.2% on ImageNet, which is 1.6% better than the existing state-of-the-art.
17 code implementations • ICLR 2020 • Kevin Clark, Minh-Thang Luong, Quoc V. Le, Christopher D. Manning
Then, instead of training a model that predicts the original identities of the corrupted tokens, we train a discriminative model that predicts whether each token in the corrupted input was replaced by a generator sample or not.
Ranked #7 on Question Answering on Quora Question Pairs
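The ELECTRA entry above describes the replaced-token-detection objective: corrupt some input tokens with samples from a small generator, then train a discriminator to label each token as original or replaced. A minimal data-construction sketch, with the generator simplified to a uniform sampler over a hypothetical vocabulary (a real generator is a small masked language model):

```python
import random

def corrupt_and_label(tokens, vocab, mask_prob=0.15, seed=0):
    """Build ELECTRA-style training pairs: a corrupted token sequence
    plus a binary label per position (1 = replaced, 0 = original).
    Uniform sampling stands in for the learned generator here."""
    rng = random.Random(seed)
    corrupted, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            repl = rng.choice(vocab)
            corrupted.append(repl)
            # A sample that happens to equal the original counts as "original",
            # matching the paper's labeling of generator samples.
            labels.append(1 if repl != tok else 0)
        else:
            corrupted.append(tok)
            labels.append(0)
    return corrupted, labels
```

The discriminator then receives `corrupted` and is trained against `labels` over every position, which is what makes the objective more sample-efficient than masked language modeling.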
2 code implementations • 27 Jan 2020 • Daniel Adiwardana, Minh-Thang Luong, David R. So, Jamie Hall, Noah Fiedel, Romal Thoppilan, Zi Yang, Apoorv Kulshreshtha, Gaurav Nemade, Yifeng Lu, Quoc V. Le
We present Meena, a multi-turn open-domain chatbot trained end-to-end on data mined and filtered from public domain social media conversations.
no code implementations • 19 Nov 2019 • Minh-Thang Luong, Preslav Nakov, Min-Yen Kan
We propose a language-independent approach for improving statistical machine translation for morphologically rich languages using a hybrid morpheme-word representation where the basic unit of translation is the morpheme, but word boundaries are respected at all stages of the translation process.
12 code implementations • CVPR 2020 • Qizhe Xie, Minh-Thang Luong, Eduard Hovy, Quoc V. Le
During the learning of the student, we inject noise such as dropout, stochastic depth, and data augmentation via RandAugment into the student so that the student generalizes better than the teacher.
Ranked #16 on Image Classification on ImageNet ReaL (using extra training data)
no code implementations • WS 2019 • Hiroaki Hayashi, Yusuke Oda, Alexandra Birch, Ioannis Konstas, Andrew Finch, Minh-Thang Luong, Graham Neubig, Katsuhito Sudoh
This document describes the findings of the Third Workshop on Neural Generation and Translation, held in concert with the annual conference of the Empirical Methods in Natural Language Processing (EMNLP 2019).
1 code implementation • ACL 2019 • Kevin Clark, Minh-Thang Luong, Urvashi Khandelwal, Christopher D. Manning, Quoc V. Le
It can be challenging to train multi-task neural networks that outperform or even match their single-task counterparts.
1 code implementation • 7 Jun 2019 • Trieu H. Trinh, Minh-Thang Luong, Quoc V. Le
Notably, on ImageNet 224 x 224 with 60 examples per class (5%), our method improves the mean accuracy of ResNet-50 from 35.6% to 46.7%, an improvement of 11.1 points in absolute accuracy.
20 code implementations • NeurIPS 2020 • Qizhe Xie, Zihang Dai, Eduard Hovy, Minh-Thang Luong, Quoc V. Le
In this work, we present a new perspective on how to effectively noise unlabeled examples and argue that the quality of noising, specifically those produced by advanced data augmentation methods, plays a crucial role in semi-supervised learning.
Ranked #1 on Sentiment Analysis on Amazon Review Full
2 code implementations • EMNLP 2018 • Kevin Clark, Minh-Thang Luong, Christopher D. Manning, Quoc V. Le
We therefore propose Cross-View Training (CVT), a semi-supervised learning algorithm that improves the representations of a Bi-LSTM sentence encoder using a mix of labeled and unlabeled data.
Ranked #3 on CCG Supertagging on CCGbank
no code implementations • ICLR 2018 • Tsung-Hsien Wen, Minh-Thang Luong
In this paper, we propose Latent Topic Conversational Model (LTCM) which augments seq2seq with a neural latent topic component to better guide response generation and make training easier.
no code implementations • WS 2018 • Alexandra Birch, Andrew Finch, Minh-Thang Luong, Graham Neubig, Yusuke Oda
This document describes the findings of the Second Workshop on Neural Machine Translation and Generation, held in concert with the annual conference of the Association for Computational Linguistics (ACL 2018).
15 code implementations • ICLR 2018 • Adams Wei Yu, David Dohan, Minh-Thang Luong, Rui Zhao, Kai Chen, Mohammad Norouzi, Quoc V. Le
On the SQuAD dataset, our model is 3x to 13x faster in training and 4x to 9x faster in inference, while achieving equivalent accuracy to recurrent models.
Ranked #27 on Question Answering on SQuAD1.1 dev
1 code implementation • ICML 2018 • Trieu H. Trinh, Andrew M. Dai, Minh-Thang Luong, Quoc V. Le
Despite recent advances in training recurrent neural networks (RNNs), capturing long-term dependencies in sequences remains a fundamental challenge.
Ranked #13 on Sequential Image Classification on Sequential CIFAR-10
no code implementations • ICLR 2018 • Minh-Thang Luong, David Dohan, Adams Wei Yu, Quoc V. Le, Barret Zoph, Vijay Vasudevan
Neural architecture search (NAS), the task of finding neural architectures automatically, has recently emerged as a promising approach for unveiling better models over human-designed ones.
no code implementations • 5 Oct 2017 • Ignacio Cases, Minh-Thang Luong, Christopher Potts
Neural networks have excelled at many NLP tasks, but there remain open questions about the performance of pretrained distributed word representations and their interaction with weight initialization and other hyperparameters.
no code implementations • EMNLP 2017 • Denny Britz, Melody Y. Guan, Minh-Thang Luong
The standard content-based attention mechanism typically used in sequence-to-sequence models is computationally expensive as it requires the comparison of large encoder and decoder states at each time step.
2 code implementations • ICML 2017 • Colin Raffel, Minh-Thang Luong, Peter J. Liu, Ron J. Weiss, Douglas Eck
Recurrent neural network models with an attention mechanism have proven to be extremely effective on a wide variety of sequence-to-sequence problems.
Ranked #20 on Speech Recognition on TIMIT
12 code implementations • EMNLP 2017 • Denny Britz, Anna Goldie, Minh-Thang Luong, Quoc Le
Neural Machine Translation (NMT) has shown remarkable progress over the past few years with production systems now being deployed to end-users.
1 code implementation • CoNLL 2016 • Abigail See, Minh-Thang Luong, Christopher D. Manning
Neural Machine Translation (NMT), like many other deep learning domains, typically suffers from over-parameterization, resulting in large storage sizes.
3 code implementations • ACL 2016 • Minh-Thang Luong, Christopher D. Manning
We build hybrid systems that translate mostly at the word level and consult the character components for rare words.
1 code implementation • IWSLT 2015 • Minh-Thang Luong, Christopher D. Manning
Neural Machine Translation (NMT), though recently developed, has shown promising results for various language pairs.
Ranked #10 on Machine Translation on IWSLT2015 English-Vietnamese
no code implementations • 19 Nov 2015 • Minh-Thang Luong, Quoc V. Le, Ilya Sutskever, Oriol Vinyals, Lukasz Kaiser
This paper examines three multi-task learning (MTL) settings for sequence to sequence models: (a) the one-to-many setting - where the encoder is shared between several tasks such as machine translation and syntactic parsing, (b) the many-to-one setting - useful when only the decoder can be shared, as in the case of translation and image caption generation, and (c) the many-to-many setting - where multiple encoders and decoders are shared, which is the case with unsupervised objectives and translation.
47 code implementations • EMNLP 2015 • Minh-Thang Luong, Hieu Pham, Christopher D. Manning
Our ensemble model using different attention architectures has established a new state-of-the-art result in the WMT'15 English to German translation task with 25.9 BLEU points, an improvement of 1.0 BLEU points over the existing best system backed by NMT and an n-gram reranker.
Ranked #1 on Machine Translation on 20NEWS (Accuracy metric)
6 code implementations • IJCNLP 2015 • Jiwei Li, Minh-Thang Luong, Dan Jurafsky
Natural language generation of coherent long texts like paragraphs or longer documents is a challenging problem for recurrent network models.
no code implementations • EMNLP 2015 • Jiwei Li, Minh-Thang Luong, Dan Jurafsky, Eduard Hovy
Recursive neural models, which use syntactic parse trees to recursively generate representations bottom-up, are a popular architecture.
5 code implementations • IJCNLP 2015 • Minh-Thang Luong, Ilya Sutskever, Quoc V. Le, Oriol Vinyals, Wojciech Zaremba
Our experiments on the WMT14 English to French translation task show that this method provides a substantial improvement of up to 2.8 BLEU points over an equivalent NMT system that does not use this technique.
Ranked #40 on Machine Translation on WMT2014 English-French
no code implementations • TACL 2013 • Minh-Thang Luong, Michael C. Frank, Mark Johnson
Grounded language learning, the task of mapping from natural language to a representation of meaning, has attracted more and more interest in recent years.