no code implementations • LREC 2022 • Prajit Dhar, Arianna Bisazza, Gertjan van Noord
We conduct our evaluation on four typologically diverse target MRLs, and find that PT-Inflect surpasses NMT systems trained only on parallel data.
no code implementations • WMT (EMNLP) 2020 • Lukas Edman, Antonio Toral, Gertjan van Noord
This paper describes the methods behind the systems submitted by the University of Groningen for the WMT 2020 Unsupervised Machine Translation task for German–Upper Sorbian.
no code implementations • WMT (EMNLP) 2021 • Lukas Edman, Ahmet Üstün, Antonio Toral, Gertjan van Noord
This paper describes the methods behind the systems submitted by the University of Groningen for the WMT 2021 Unsupervised Machine Translation task for German–Lower Sorbian (DE–DSB): a high-resource language to a low-resource one.
no code implementations • CL (ACL) 2022 • Ahmet Üstün, Arianna Bisazza, Gosse Bouma, Gertjan van Noord
To address this, we propose a novel language adaptation approach by introducing contextual language adapters to a multilingual parser.
no code implementations • ACL (WAT) 2021 • Prajit Dhar, Arianna Bisazza, Gertjan van Noord
Dravidian languages, such as Kannada and Tamil, are notoriously difficult to translate by state-of-the-art neural models.
no code implementations • WMT (EMNLP) 2020 • Prajit Dhar, Arianna Bisazza, Gertjan van Noord
This paper describes our submission for the English-Tamil news translation task of WMT-2020.
1 code implementation • EAMT 2020 • Lukas Edman, Antonio Toral, Gertjan van Noord
Unsupervised Machine Translation has been advancing our ability to translate without parallel data, but state-of-the-art methods assume an abundance of monolingual data.
no code implementations • GWC 2018 • Dieke Oele, Gertjan van Noord
The results of our experiments show that lexically extending the number of words in the gloss and context, although it works well for other implementations of Lesk, harms our method.
1 code implementation • 28 Feb 2023 • Lukas Edman, Gabriele Sarti, Antonio Toral, Gertjan van Noord, Arianna Bisazza
Pretrained character-level and byte-level language models have been shown to be competitive with popular subword models across a range of Natural Language Processing (NLP) tasks.
1 code implementation • 2 Dec 2022 • Lukas Edman, Antonio Toral, Gertjan van Noord
This new downsampling method not only outperforms existing downsampling methods, showing that downsampling characters can be done without sacrificing quality, but also leads to promising performance compared to subword models for translation.
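The general idea of character downsampling can be illustrated with a minimal sketch (this is not the paper's exact method): fixed-size groups of character embeddings are mean-pooled into a shorter sequence, shrinking the input length a subword model would otherwise avoid. The embedding function below is a made-up stand-in.

```python
# Toy illustration of character downsampling (not the paper's method):
# mean-pool every `rate` consecutive character vectors into one.
from typing import List

def embed_chars(text: str, dim: int = 4) -> List[List[float]]:
    # Hypothetical embedding: derive a small vector from each character's
    # code point, purely for demonstration.
    return [[(ord(c) >> i) % 7 / 7.0 for i in range(dim)] for c in text]

def downsample(embs: List[List[float]], rate: int = 4) -> List[List[float]]:
    pooled = []
    for start in range(0, len(embs), rate):
        group = embs[start:start + rate]
        # Average each dimension across the group of character vectors.
        pooled.append([sum(v[i] for v in group) / len(group)
                       for i in range(len(group[0]))])
    return pooled

embs = embed_chars("downsampling")
short = downsample(embs, rate=4)
print(len(embs), len(short))  # 12 character vectors pooled into 3
```

With a rate of 4, the sequence a downstream encoder sees is a quarter of the character length, which is the efficiency motivation behind downsampling.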
1 code implementation • 27 May 2022 • Lukas Edman, Antonio Toral, Gertjan van Noord
Character-based representations have important advantages over subword-based ones for morphologically rich languages.
1 code implementation • 24 May 2022 • Ahmet Üstün, Arianna Bisazza, Gosse Bouma, Gertjan van Noord, Sebastian Ruder
Massively multilingual models are promising for transfer learning across tasks and languages.
no code implementations • ICON 2021 • Lukas Edman, Antonio Toral, Gertjan van Noord
This paper investigates very low resource language model pretraining, when fewer than 100,000 sentences are available.
1 code implementation • 24 Sep 2021 • Lukas Edman, Ahmet Üstün, Antonio Toral, Gertjan van Noord
Lastly, we experiment with the order in which offline and online back-translation are used to train an unsupervised system, finding that using online back-translation first works better for DE→DSB by 2.76 BLEU.
no code implementations • LREC 2020 • António Branco, Nicoletta Calzolari, Piek Vossen, Gertjan van Noord, Dieter van Uytvanck, João Silva, Luís Gomes, André Moreira, Willem Elbers
In this paper, we introduce a new type of shared task, collaborative rather than competitive, designed to support and foster the reproduction of research results.
1 code implementation • EMNLP 2020 • Ahmet Üstün, Arianna Bisazza, Gosse Bouma, Gertjan van Noord
The resulting parser, UDapter, outperforms strong monolingual and multilingual baselines on the majority of both high-resource and low-resource (zero-shot) languages, showing the success of the proposed adaptation approach.
2 code implementations • 19 Dec 2019 • Wietse de Vries, Andreas van Cranenburgh, Arianna Bisazza, Tommaso Caselli, Gertjan van Noord, Malvina Nissim
The transformer-based pre-trained language model BERT has helped to improve state-of-the-art performance on many natural language processing (NLP) tasks.
Ranked #3 on Sentiment Analysis on DBRD
no code implementations • RANLP 2019 • Ahmet Üstün, Gosse Bouma, Gertjan van Noord
Cross-lingual word embedding models learn a shared vector space for two or more languages so that words with similar meaning are represented by similar vectors regardless of their language.
no code implementations • WS 2019 • Ahmet Üstün, Rob van der Goot, Gosse Bouma, Gertjan van Noord
This paper describes our submission to SIGMORPHON 2019 Task 2: Morphological analysis and lemmatization in context.
no code implementations • CL 2018 • Martijn Wieling, Josine Rawee, Gertjan van Noord
For a selection of ten papers, we attempted to reproduce the results using the provided data and code.
1 code implementation • EMNLP 2018 • Rob van der Goot, Gertjan van Noord
Recently introduced neural network parsers allow for new approaches to circumvent data sparsity issues by modeling character level information and by exploiting raw data in a semi-supervised setting.
2 code implementations • 10 Oct 2017 • Rob van der Goot, Gertjan van Noord
We show that MoNoise beats the state-of-the-art on different normalization benchmarks for English and Dutch, all of which define the normalization task slightly differently.
Ranked #1 on Lexical Normalization on LexNorm
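For readers unfamiliar with the task: lexical normalization maps noisy (e.g. social-media) tokens to canonical forms. The sketch below is a bare lookup-table baseline, far simpler than MoNoise's candidate generation and ranking; the table entries are invented for illustration.

```python
# Toy lexical normalization baseline (not MoNoise): map known noisy
# tokens to canonical forms, pass unknown tokens through unchanged.
NORMALIZATION_TABLE = {  # hypothetical entries for illustration
    "u": "you",
    "gr8": "great",
    "pls": "please",
}

def normalize(tokens):
    return [NORMALIZATION_TABLE.get(t.lower(), t) for t in tokens]

print(normalize("u look gr8 pls reply".split()))
# ['you', 'look', 'great', 'please', 'reply']
```

A real system replaces the fixed table with learned candidate generators (spelling correction, embeddings, a dictionary) and a ranker, since benchmarks differ on which tokens require normalization.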
no code implementations • WS 2017 • Artur Kulmizev, Bo Blankers, Johannes Bjerva, Malvina Nissim, Gertjan van Noord, Barbara Plank, Martijn Wieling
In this paper, we explore the performance of a linear SVM trained on language independent character features for the NLI Shared Task 2017.
no code implementations • ACL 2017 • Rob van der Goot, Gertjan van Noord
This work explores different approaches of using normalization for parser adaptation.
no code implementations • WS 2016 • Rosa Gaudio, Gorka Labaka, Eneko Agirre, Petya Osenova, Kiril Simov, Martin Popel, Dieke Oele, Gertjan van Noord, Luís Gomes, João António Rodrigues, Steven Neale, João Silva, Andreia Querido, Nuno Rendeiro, António Branco
1 code implementation • NAACL 2016 • Simon Šuster, Ivan Titov, Gertjan van Noord
We present an approach to learning multi-sense word embeddings relying both on monolingual and bilingual information.
1 code implementation • 31 Aug 2015 • Simon Šuster, Gertjan van Noord, Ivan Titov
Word representations induced from models with discrete latent variables (e.g., HMMs) have been shown to be beneficial in many NLP applications.
no code implementations • LREC 2014 • Angelina Ivanova, Gertjan van Noord
In the second experiment, the model is tested on its ability to score the parse tree of the correct sentence higher than the constituency tree of the original version of the sentence, which contains a grammatical error.