Better Translation for Vietnamese
We collect data from open sources on the Internet, and classify them into different categories, each labeled with a specific language style. In total, there are 3.3 million pairs of English and Vietnamese texts, ranging from single sentences to paragraphs. A model trained with our dataset outperforms Google Translate on a selected set of diverse text sources. On IWSLT'15 we achieved a BLEU score of 37.84.
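The abstract says each category carries a language-style label but does not say how that label reaches the model. One common way to use such labels, shown here purely as an assumption and not as the authors' confirmed method, is to prepend a style token to the source sentence. The `Pair` class, the `tag_source` helper, the `<style>` token format, and the two example pairs are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Pair:
    """One English-Vietnamese pair plus its style label (hypothetical schema)."""
    en: str
    vi: str
    style: str  # assumed label names, e.g. "news" or "dialogue"


def tag_source(pair: Pair) -> Tuple[str, str]:
    """Prepend a style token to the English side; the target is left unchanged."""
    return f"<{pair.style}> {pair.en}", pair.vi


# Made-up illustration data, not taken from the dataset.
pairs: List[Pair] = [
    Pair("Good morning.", "Chào buổi sáng.", "dialogue"),
    Pair("The market closed higher.", "Thị trường đóng cửa tăng điểm.", "news"),
]

tagged = [tag_source(p) for p in pairs]
for src, tgt in tagged:
    print(src, "=>", tgt)
```

With this kind of tagging, the same token could be supplied at inference time to request a particular style, assuming the model was trained on tagged sources.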
Results from the Paper
Ranked #2 on Machine Translation on IWSLT2015 English-Vietnamese (using extra training data)
Task | Dataset | Model | Metric Name | Metric Value | Global Rank | Uses Extra Training Data
---|---|---|---|---|---|---
Machine Translation | IWSLT2015 English-Vietnamese | Tall Transformer with Style-Augmented Training | BLEU | 37.8 | # 2 | Yes