Search Results for author: Nguyen Luong Tran

Found 3 papers, 3 papers with code

A High-Quality and Large-Scale Dataset for English-Vietnamese Speech Translation

1 code implementation • 8 Aug 2022 • Linh The Nguyen, Nguyen Luong Tran, Long Doan, Manh Luong, Dat Quoc Nguyen

In this paper, we introduce a high-quality and large-scale benchmark dataset for English-Vietnamese speech translation with 508 audio hours, consisting of 331K triplets of (sentence-length audio, English source transcript sentence, Vietnamese target subtitle sentence).

Sentence Translation

PhoMT: A High-Quality and Large-Scale Benchmark Dataset for Vietnamese-English Machine Translation

1 code implementation • EMNLP 2021 • Long Doan, Linh The Nguyen, Nguyen Luong Tran, Thai Hoang, Dat Quoc Nguyen

We introduce a high-quality and large-scale Vietnamese-English parallel dataset of 3.02M sentence pairs, which is 2.9M pairs larger than the benchmark Vietnamese-English machine translation corpus IWSLT15.

Denoising, Machine Translation (+2 more)

BARTpho: Pre-trained Sequence-to-Sequence Models for Vietnamese

2 code implementations • 20 Sep 2021 • Nguyen Luong Tran, Duong Minh Le, Dat Quoc Nguyen

We present BARTpho with two versions, BARTpho-syllable and BARTpho-word, which are the first public large-scale monolingual sequence-to-sequence models pre-trained for Vietnamese.

Ranked #3 on Abstractive Text Summarization on vietnews

Abstractive Text Summarization, Denoising (+1 more)
