no code implementations • LREC (BUCC) 2022 • Silvia Severini, Viktor Hangya, Masoud Jalili Sabet, Alexander Fraser, Hinrich Schütze
The two approaches we find most effective are: 1) using identical words as seed lexicons (which unsupervised approaches incorrectly assume are not available for orthographically distinct language pairs) and 2) combining such lexicons with pairs extracted by matching romanized versions of words with an edit distance threshold.
1 code implementation • 20 May 2023 • Ayyoob ImaniGooghari, Peiqin Lin, Amir Hossein Kargaran, Silvia Severini, Masoud Jalili Sabet, Nora Kassner, Chunlan Ma, Helmut Schmid, André F. T. Martins, François Yvon, Hinrich Schütze
The NLP community has mainly focused on scaling Large Language Models (LLMs) vertically, i. e., making them better for about 100 languages.
1 code implementation • 18 Oct 2022 • Ayyoob Imani, Silvia Severini, Masoud Jalili Sabet, François Yvon, Hinrich Schütze
An established method for training a POS tagger in such a scenario is to create a labeled training set by transferring from high-resource languages.
no code implementations • 31 May 2022 • Silvia Severini, Viktor Hangya, Masoud Jalili Sabet, Alexander Fraser, Hinrich Schütze
The two approaches we find most effective are: 1) using identical words as seed lexicons (which unsupervised approaches incorrectly assume are not available for orthographically distinct language pairs) and 2) combining such lexicons with pairs extracted by matching romanized versions of words with an edit distance threshold.
1 code implementation • ACL 2022 • Leonie Weissweiler, Valentin Hofmann, Masoud Jalili Sabet, Hinrich Schütze
We introduce CaMEL (Case Marker Extraction without Labels), a novel and challenging task in computational morphology that is especially relevant for low-resource languages.
no code implementations • Findings (ACL) 2022 • Ayyoob Imani, Lütfi Kerem Şenel, Masoud Jalili Sabet, François Yvon, Hinrich Schütze
First, we create a multiparallel word alignment graph, joining all bilingual word alignment pairs in one graph.
1 code implementation • EMNLP 2021 • Ayyoob Imani, Masoud Jalili Sabet, Lütfi Kerem Şenel, Philipp Dufter, François Yvon, Hinrich Schütze
With the advent of end-to-end deep learning approaches in machine translation, interest in word alignments initially decreased; however, they have again become a focus of research more recently.
no code implementations • ACL 2021 • Ayyoob Imani, Masoud Jalili Sabet, Philipp Dufter, Michael Cysouw, Hinrich Schütze
With more than 7000 languages worldwide, multilingual natural language processing (NLP) is essential both from an academic and commercial perspective.
no code implementations • 21 Dec 2020 • Ehsaneddin Asgari, Masoud Jalili Sabet, Philipp Dufter, Christopher Ringlstetter, Hinrich Schütze
This method's hypothesis is that the aggregation of different granularities of text for certain language pairs can help word-level alignment.
3 code implementations • Findings of the Association for Computational Linguistics 2020 • Masoud Jalili Sabet, Philipp Dufter, François Yvon, Hinrich Schütze
We find that alignments created from embeddings are superior for four and comparable for two language pairs compared to those produced by traditional statistical aligners, even with abundant parallel data; e. g., contextualized embeddings achieve a word alignment F1 for English-German that is 5 percentage points higher than eflomal, a high-quality statistical aligner, trained on 100k parallel sentences.
no code implementations • ECIR 2019 • Erfan Ghadery, Sajad Movahedi, Masoud Jalili Sabet, Heshaam Faili, Azadeh Shakery
For a given sentence, our proposed method performs ACD based on two hypotheses: First, a category should be assigned to a sentence if there is a high semantic similarity between the sentence and a set of representative words of that category.
Aspect-Based Sentiment Analysis (ABSA)
Aspect Category Detection
+4
no code implementations • 31 Oct 2018 • Nina Poerner, Masoud Jalili Sabet, Benjamin Roth, Hinrich Schütze
Count-based word alignment methods, such as the IBM models or fast-align, struggle on very small parallel corpora.
no code implementations • COLING 2016 • Masoud Jalili Sabet, Heshaam Faili, Gholamreza Haffari
We address the problem of inducing word alignment for language pairs by developing an unsupervised model with the capability of getting applied to other generative alignment models.
no code implementations • COLING 2016 • Javid Dadashkarimi, Masoud Jalili Sabet, Azadeh Shakery
To this end, first we build a query-generated training data using pseudo-relevant documents to the query and all translation candidates.