Better Translation for Vietnamese

20 Apr 2021 · Chinh Ngo, Trieu Trinh

We collect data from open sources on the Internet and classify it into categories, each labeled with a specific language style. In total, the dataset contains 3.3 million English-Vietnamese text pairs, ranging from single sentences to paragraphs. A model trained on our dataset outperforms Google Translate on a selected set of diverse text sources. On IWSLT'15, we achieve a BLEU score of 37.84.
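As a minimal sketch of how style-labeled pairs like those described above might be represented, the following Python snippet defines a record type and two helpers. Everything here is an assumption for illustration: the `StyledPair` schema, the style names ("dialogue", "news"), the example sentences, and the idea of prepending a style token to the source text are hypothetical and are not confirmed as the authors' data format or training method.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class StyledPair:
    """One English-Vietnamese pair with a style label (hypothetical schema)."""
    english: str
    vietnamese: str
    style: str  # hypothetical label, e.g. "dialogue" or "news"


def tag_source(pair: StyledPair) -> str:
    """Prepend a style token to the English source.

    Assumption: tagging the source side is one common way to expose a
    style label to a translation model; the paper page does not say
    whether the authors do this.
    """
    return f"<{pair.style}> {pair.english}"


def style_counts(pairs) -> Counter:
    """Count how many pairs carry each style label."""
    return Counter(p.style for p in pairs)


# Made-up example pairs, for illustration only.
pairs = [
    StyledPair("Good morning.", "Chào buổi sáng.", "dialogue"),
    StyledPair("The meeting starts at nine.", "Cuộc họp bắt đầu lúc chín giờ.", "news"),
    StyledPair("Thank you.", "Cảm ơn bạn.", "dialogue"),
]

tagged = [tag_source(p) for p in pairs]
counts = style_counts(pairs)
```

Keeping the style as a separate field, rather than baking it into the text, leaves the choice of how (or whether) to surface it to the model open until training time.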

Results from the Paper

Task: Machine Translation
Dataset: IWSLT2015 English-Vietnamese
Model: Tall Transformer with Style-Augmented Training
Metric: BLEU
Value: 37.8
Global rank: #2
