no code implementations • LREC (BUCC) 2022 • Silvia Severini, Viktor Hangya, Masoud Jalili Sabet, Alexander Fraser, Hinrich Schütze
The two approaches we find most effective are: 1) using identical words as seed lexicons (which unsupervised approaches incorrectly assume are not available for orthographically distinct language pairs) and 2) combining such lexicons with pairs extracted by matching romanized versions of words with an edit distance threshold.
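The two seed-lexicon strategies named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `toy_romanize` character table, and the default threshold of 1 are all assumptions made for illustration, and a real setup would substitute a proper romanization tool for the toy table.

```python
from typing import Callable, Iterable, Set, Tuple

Pair = Tuple[str, str]


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def identical_word_pairs(src_vocab: Iterable[str], tgt_vocab: Iterable[str]) -> Set[Pair]:
    """Strategy 1: strings that occur in both vocabularies (numbers, names, loans)."""
    return {(w, w) for w in set(src_vocab) & set(tgt_vocab)}


def romanized_pairs(src_vocab: Iterable[str], tgt_vocab: Iterable[str],
                    romanize: Callable[[str], str], max_distance: int = 1) -> Set[Pair]:
    """Strategy 2: pair words whose romanized forms are within an edit distance threshold."""
    tgt_rom = [(w, romanize(w)) for w in tgt_vocab]
    pairs = set()
    for s in src_vocab:
        s_rom = romanize(s)
        for t, t_rom in tgt_rom:
            if edit_distance(s_rom, t_rom) <= max_distance:
                pairs.add((s, t))
    return pairs


def build_seed_lexicon(src_vocab, tgt_vocab, romanize, max_distance: int = 1) -> Set[Pair]:
    """Union of both strategies, used here as the combined seed lexicon."""
    src_vocab, tgt_vocab = list(src_vocab), list(tgt_vocab)
    return (identical_word_pairs(src_vocab, tgt_vocab)
            | romanized_pairs(src_vocab, tgt_vocab, romanize, max_distance))


# Hypothetical toy romanizer: a tiny Cyrillic-to-Latin table, for the demo only.
_TOY_TABLE = {"л": "l", "о": "o", "н": "n", "д": "d", "б": "b",
              "а": "a", "к": "k", "м": "m", "с": "s", "в": "v"}


def toy_romanize(word: str) -> str:
    return "".join(_TOY_TABLE.get(ch, ch) for ch in word.lower())
```

With the vocabularies `["london", "bank", "moscow", "2022"]` and `["лондон", "банк", "москва", "2022"]`, the sketch pairs `2022` through identity and `london`/`bank` through romanization, while `moscow` and `москва` stay unpaired at threshold 1 because their romanized forms differ by three edits. The quadratic comparison loop is only practical for small vocabularies.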
no code implementations • 21 Nov 2023 • Viktor Hangya, Silvia Severini, Radoslav Ralev, Alexander Fraser, Hinrich Schütze
In this paper, we propose to build multilingual word embeddings (MWEs) via a novel language chain-based approach that incorporates intermediate related languages to bridge the gap between the distant source and target.
1 code implementation • 20 May 2023 • Ayyoob Imani, Peiqin Lin, Amir Hossein Kargaran, Silvia Severini, Masoud Jalili Sabet, Nora Kassner, Chunlan Ma, Helmut Schmid, André F. T. Martins, François Yvon, Hinrich Schütze
The NLP community has mainly focused on scaling Large Language Models (LLMs) vertically, i.e., making them better for about 100 languages.
1 code implementation • 18 Oct 2022 • Ayyoob Imani, Silvia Severini, Masoud Jalili Sabet, François Yvon, Hinrich Schütze
An established method for training a POS tagger in such a scenario is to create a labeled training set by transferring from high-resource languages.
1 code implementation • 12 Oct 2022 • Abdullatif Köksal, Silvia Severini, Hinrich Schütze
Word alignments are essential for a variety of NLP tasks.
no code implementations • 31 May 2022 • Silvia Severini, Viktor Hangya, Masoud Jalili Sabet, Alexander Fraser, Hinrich Schütze
The two approaches we find most effective are: 1) using identical words as seed lexicons (which unsupervised approaches incorrectly assume are not available for orthographically distinct language pairs) and 2) combining such lexicons with pairs extracted by matching romanized versions of words with an edit distance threshold.
no code implementations • LREC 2022 • Silvia Severini, Ayyoob Imani, Philipp Dufter, Hinrich Schütze
Prior work on extracting MNE datasets from parallel corpora required resources such as large monolingual corpora or word aligners that are unavailable or perform poorly for underresourced languages.
1 code implementation • 6 Apr 2021 • Ahmed Elnaggar, Wei Ding, Llion Jones, Tom Gibbs, Tamas Feher, Christoph Angerer, Silvia Severini, Florian Matthes, Burkhard Rost
Simultaneously, the transformer model, especially its combination with transfer learning, has been proven to be a powerful technique for natural language processing tasks.
no code implementations • COLING 2020 • Silvia Severini, Viktor Hangya, Alexander Fraser, Hinrich Schütze
In this paper, we enrich BWE-based BDI with transliteration information by using Bilingual Orthography Embeddings (BOEs).
no code implementations • LREC 2020 • Silvia Severini, Viktor Hangya, Alexander Fraser, Hinrich Schütze
We participate in both the open and closed tracks of the shared task and we show improved results of our method compared to simple vector similarity based approaches.