WS 2018 • Catarina Cruz Silva, Chao-Hong Liu, Alberto Poncelas, Andy Way
Data selection is the process of choosing a subset of parallel data for training machine translation (MT) systems, so that 1) fewer resources are needed for training, 2) the trained models can perform better than those trained on the whole corpus, and/or 3) the trained models are better tailored to specific domains.