Previous research has used linguistic features to show that translations exhibit traces of source language interference and that phylogenetic trees between languages can be reconstructed from the results of translations into the same language.
We show how we can eliminate the need for parallel validation data by combining the self-supervised loss with an unsupervised loss.
Cross-lingual natural language processing relies on translation, either by humans or machines, at different levels, from translating training data to translating test sets.
We describe the EdinSaar submission to the shared task of Multilingual Low-Resource Translation for North Germanic Languages at the Sixth Conference on Machine Translation (WMT2021).
Traditional hand-crafted, linguistically informed features have often been used to distinguish between translated and original, non-translated texts.
Recent studies use a combination of lexical and syntactic features to show that footprints of the source language remain visible in translations, to the extent that it is possible to predict the original source language from the translation.
Some translationese features tend to appear in simultaneous interpreting with higher frequency than in human text translation, but the reasons for this are unclear.
It is assumed that multimodal machine translation systems are better than text-only systems at translating phrases that have a direct correspondence in the image.
This paper describes the Machine Translation (MT) system(s) for the Indic Languages Multilingual Task of the 2018 edition of the WAT Shared Task.
In this paper, we investigate the effectiveness of training a multimodal neural machine translation (MNMT) system with image features for a low-resource language pair, Hindi and English, using synthetic data.
In this paper, we analyse real-world samples of customer feedback from Microsoft Office customers in four languages, i.e., English, French, Spanish and Japanese, and arrive at a five-plus-one-class categorisation (comment, request, bug, complaint, meaningless and undetermined) for meaning classification.
We present a system for identifying Verbal Multi-Word Expressions (VMWEs) in running text.