1 code implementation • 11 Oct 2022 • Long Phan, Tai Dang, Hieu Tran, Trieu H. Trinh, Vy Phan, Lam D. Chau, Minh-Thang Luong
Biomedical data and benchmarks are highly valuable yet scarce in low-resource languages other than English, such as Vietnamese.
2 code implementations • 11 Oct 2022 • Chinh Ngo, Trieu H. Trinh, Long Phan, Hieu Tran, Tai Dang, Hieu Nguyen, Minh Nguyen, Minh-Thang Luong
We introduce MTet, the largest publicly available parallel corpus for English-Vietnamese translation.
Ranked #1 on Machine Translation on IWSLT2015 English-Vietnamese (using extra training data)
1 code implementation • Blog 2022 • Chinh Ngo, Hieu Tran, Long Phan, Trieu H. Trinh, Hieu Nguyen, Minh Nguyen, Minh-Thang Luong
We are excited to introduce a new, larger, and higher-quality machine translation dataset, MTet, which stands for Multi-domain Translation for English and VieTnamese.
no code implementations • 19 Nov 2021 • Hieu Pham, Zihang Dai, Golnaz Ghiasi, Kenji Kawaguchi, Hanxiao Liu, Adams Wei Yu, Jiahui Yu, Yi-Ting Chen, Minh-Thang Luong, Yonghui Wu, Mingxing Tan, Quoc V. Le
Second, while increasing the dataset size and the model size has been the de facto method to improve the performance of deep learning models like BASIC, the effect of a large contrastive batch size on such contrastive-trained image-text models is not well understood.
no code implementations • Findings (EMNLP) 2021 • Sneha Kudugunta, Yanping Huang, Ankur Bapna, Maxim Krikun, Dmitry Lepikhin, Minh-Thang Luong, Orhan Firat
On WMT, our task-MoE with 32 experts (533M parameters) outperforms the best performing token-level MoE model (token-MoE) by +1.0 BLEU on average across 30 language pairs.
1 code implementation • EMNLP 2021 • Tu Vu, Minh-Thang Luong, Quoc V. Le, Grady Simon, Mohit Iyyer
Despite their recent successes in tackling many NLP tasks, large-scale pre-trained language models do not perform as well in few-shot settings where only a handful of training examples are available.
Ranked #1 on Few-Shot NLI on SNLI (8 training examples per class)
1 code implementation • EMNLP 2020 • Kevin Clark, Minh-Thang Luong, Quoc V. Le, Christopher D. Manning
We introduce Electric, an energy-based cloze model for representation learning over text.
no code implementations • 9 Nov 2020 • Vikas Verma, Minh-Thang Luong, Kenji Kawaguchi, Hieu Pham, Quoc V. Le
Despite recent success, most contrastive self-supervised learning methods are domain-specific, relying heavily on data augmentation techniques that require knowledge about a particular domain, such as image cropping and rotation.
9 code implementations • CVPR 2021 • Hieu Pham, Zihang Dai, Qizhe Xie, Minh-Thang Luong, Quoc V. Le
We present Meta Pseudo Labels, a semi-supervised learning method that achieves a new state-of-the-art top-1 accuracy of 90.2% on ImageNet, which is 1.6% better than the existing state-of-the-art.
17 code implementations • ICLR 2020 • Kevin Clark, Minh-Thang Luong, Quoc V. Le, Christopher D. Manning
Then, instead of training a model that predicts the original identities of the corrupted tokens, we train a discriminative model that predicts whether each token in the corrupted input was replaced by a generator sample or not.
Ranked #7 on Question Answering on Quora Question Pairs
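The ELECTRA entry above describes the replaced-token-detection objective: corrupt some input tokens with samples from a small generator, then train a discriminator to label each token as original or replaced. A minimal data-construction sketch, with the generator simplified to a uniform sampler over a hypothetical vocabulary (a real generator is a small masked language model):

```python
import random

def corrupt_and_label(tokens, vocab, mask_prob=0.15, seed=0):
    """Build ELECTRA-style training pairs: a corrupted token sequence
    plus a binary label per position (1 = replaced, 0 = original).
    Uniform sampling stands in for the learned generator here."""
    rng = random.Random(seed)
    corrupted, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            repl = rng.choice(vocab)
            corrupted.append(repl)
            # A sample that happens to equal the original counts as "original",
            # matching the paper's labeling of generator samples.
            labels.append(1 if repl != tok else 0)
        else:
            corrupted.append(tok)
            labels.append(0)
    return corrupted, labels
```

The discriminator then receives `corrupted` and is trained against `labels` over every position, which is what makes the objective more sample-efficient than masked language modeling.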
2 code implementations • 27 Jan 2020 • Daniel Adiwardana, Minh-Thang Luong, David R. So, Jamie Hall, Noah Fiedel, Romal Thoppilan, Zi Yang, Apoorv Kulshreshtha, Gaurav Nemade, Yifeng Lu, Quoc V. Le
We present Meena, a multi-turn open-domain chatbot trained end-to-end on data mined and filtered from public domain social media conversations.
no code implementations • 19 Nov 2019 • Minh-Thang Luong, Preslav Nakov, Min-Yen Kan
We propose a language-independent approach for improving statistical machine translation for morphologically rich languages using a hybrid morpheme-word representation where the basic unit of translation is the morpheme, but word boundaries are respected at all stages of the translation process.
12 code implementations • CVPR 2020 • Qizhe Xie, Minh-Thang Luong, Eduard Hovy, Quoc V. Le
During the learning of the student, we inject noise such as dropout, stochastic depth, and data augmentation via RandAugment into the student so that the student generalizes better than the teacher.
Ranked #16 on Image Classification on ImageNet ReaL (using extra training data)
no code implementations • WS 2019 • Hiroaki Hayashi, Yusuke Oda, Alexandra Birch, Ioannis Konstas, Andrew Finch, Minh-Thang Luong, Graham Neubig, Katsuhito Sudoh
This document describes the findings of the Third Workshop on Neural Generation and Translation, held in concert with the annual conference of the Empirical Methods in Natural Language Processing (EMNLP 2019).
1 code implementation • ACL 2019 • Kevin Clark, Minh-Thang Luong, Urvashi Khandelwal, Christopher D. Manning, Quoc V. Le
It can be challenging to train multi-task neural networks that outperform or even match their single-task counterparts.
1 code implementation • 7 Jun 2019 • Trieu H. Trinh, Minh-Thang Luong, Quoc V. Le
Notably, on ImageNet 224 x 224 with 60 examples per class (5%), our method improves the mean accuracy of ResNet-50 from 35.6% to 46.7%, an improvement of 11.1 points in absolute accuracy.
20 code implementations • NeurIPS 2020 • Qizhe Xie, Zihang Dai, Eduard Hovy, Minh-Thang Luong, Quoc V. Le
In this work, we present a new perspective on how to effectively noise unlabeled examples and argue that the quality of noising, specifically those produced by advanced data augmentation methods, plays a crucial role in semi-supervised learning.
Ranked #1 on Sentiment Analysis on Amazon Review Full
2 code implementations • EMNLP 2018 • Kevin Clark, Minh-Thang Luong, Christopher D. Manning, Quoc V. Le
We therefore propose Cross-View Training (CVT), a semi-supervised learning algorithm that improves the representations of a Bi-LSTM sentence encoder using a mix of labeled and unlabeled data.
Ranked #3 on CCG Supertagging on CCGbank
no code implementations • ICLR 2018 • Tsung-Hsien Wen, Minh-Thang Luong
In this paper, we propose Latent Topic Conversational Model (LTCM) which augments seq2seq with a neural latent topic component to better guide response generation and make training easier.
no code implementations • WS 2018 • Alexandra Birch, Andrew Finch, Minh-Thang Luong, Graham Neubig, Yusuke Oda
This document describes the findings of the Second Workshop on Neural Machine Translation and Generation, held in concert with the annual conference of the Association for Computational Linguistics (ACL 2018).
15 code implementations • ICLR 2018 • Adams Wei Yu, David Dohan, Minh-Thang Luong, Rui Zhao, Kai Chen, Mohammad Norouzi, Quoc V. Le
On the SQuAD dataset, our model is 3x to 13x faster in training and 4x to 9x faster in inference, while achieving equivalent accuracy to recurrent models.
Ranked #27 on Question Answering on SQuAD1.1 dev
1 code implementation • ICML 2018 • Trieu H. Trinh, Andrew M. Dai, Minh-Thang Luong, Quoc V. Le
Despite recent advances in training recurrent neural networks (RNNs), capturing long-term dependencies in sequences remains a fundamental challenge.
Ranked #13 on Sequential Image Classification on Sequential CIFAR-10
no code implementations • ICLR 2018 • Minh-Thang Luong, David Dohan, Adams Wei Yu, Quoc V. Le, Barret Zoph, Vijay Vasudevan
Neural architecture search (NAS), the task of finding neural architectures automatically, has recently emerged as a promising approach for unveiling better models over human-designed ones.
no code implementations • 5 Oct 2017 • Ignacio Cases, Minh-Thang Luong, Christopher Potts
Neural networks have excelled at many NLP tasks, but there remain open questions about the performance of pretrained distributed word representations and their interaction with weight initialization and other hyperparameters.
no code implementations • EMNLP 2017 • Denny Britz, Melody Y. Guan, Minh-Thang Luong
The standard content-based attention mechanism typically used in sequence-to-sequence models is computationally expensive as it requires the comparison of large encoder and decoder states at each time step.
2 code implementations • ICML 2017 • Colin Raffel, Minh-Thang Luong, Peter J. Liu, Ron J. Weiss, Douglas Eck
Recurrent neural network models with an attention mechanism have proven to be extremely effective on a wide variety of sequence-to-sequence problems.
Ranked #20 on Speech Recognition on TIMIT
12 code implementations • EMNLP 2017 • Denny Britz, Anna Goldie, Minh-Thang Luong, Quoc Le
Neural Machine Translation (NMT) has shown remarkable progress over the past few years with production systems now being deployed to end-users.
1 code implementation • CoNLL 2016 • Abigail See, Minh-Thang Luong, Christopher D. Manning
Neural Machine Translation (NMT), like many other deep learning domains, typically suffers from over-parameterization, resulting in large storage sizes.
3 code implementations • ACL 2016 • Minh-Thang Luong, Christopher D. Manning
We build hybrid systems that translate mostly at the word level and consult the character components for rare words.
1 code implementation • IWSLT 2015 • Minh-Thang Luong, Christopher D. Manning
Neural Machine Translation (NMT), though recently developed, has shown promising results for various language pairs.
Ranked #10 on Machine Translation on IWSLT2015 English-Vietnamese
no code implementations • 19 Nov 2015 • Minh-Thang Luong, Quoc V. Le, Ilya Sutskever, Oriol Vinyals, Lukasz Kaiser
This paper examines three multi-task learning (MTL) settings for sequence to sequence models: (a) the one-to-many setting - where the encoder is shared between several tasks such as machine translation and syntactic parsing, (b) the many-to-one setting - useful when only the decoder can be shared, as in the case of translation and image caption generation, and (c) the many-to-many setting - where multiple encoders and decoders are shared, which is the case with unsupervised objectives and translation.
47 code implementations • EMNLP 2015 • Minh-Thang Luong, Hieu Pham, Christopher D. Manning
Our ensemble model using different attention architectures has established a new state-of-the-art result in the WMT'15 English to German translation task with 25.9 BLEU points, an improvement of 1.0 BLEU points over the existing best system backed by NMT and an n-gram reranker.
Ranked #1 on Machine Translation on 20NEWS (Accuracy metric)
6 code implementations • IJCNLP 2015 • Jiwei Li, Minh-Thang Luong, Dan Jurafsky
Natural language generation of coherent long texts like paragraphs or longer documents is a challenging problem for recurrent network models.
no code implementations • EMNLP 2015 • Jiwei Li, Minh-Thang Luong, Dan Jurafsky, Eduard Hovy
Recursive neural models, which use syntactic parse trees to recursively generate representations bottom-up, are a popular architecture.
5 code implementations • IJCNLP 2015 • Minh-Thang Luong, Ilya Sutskever, Quoc V. Le, Oriol Vinyals, Wojciech Zaremba
Our experiments on the WMT14 English to French translation task show that this method provides a substantial improvement of up to 2.8 BLEU points over an equivalent NMT system that does not use this technique.
Ranked #40 on Machine Translation on WMT2014 English-French
no code implementations • TACL 2013 • Minh-Thang Luong, Michael C. Frank, Mark Johnson
Grounded language learning, the task of mapping from natural language to a representation of meaning, has attracted more and more interest in recent years.