Search Results for author: Masoud Jalili Sabet

Found 16 papers, 6 papers with code

SimAlign: High Quality Word Alignments without Parallel Training Data using Static and Contextualized Embeddings

3 code implementations Findings of the Association for Computational Linguistics 2020 Masoud Jalili Sabet, Philipp Dufter, François Yvon, Hinrich Schütze

We find that alignments created from embeddings are superior for four and comparable for two language pairs compared to those produced by traditional statistical aligners, even with abundant parallel data; e.g., contextualized embeddings achieve a word alignment F1 for English-German that is 5 percentage points higher than eflomal, a high-quality statistical aligner, trained on 100k parallel sentences.

Machine Translation Multilingual Word Embeddings +3

Graph Algorithms for Multiparallel Word Alignment

1 code implementation EMNLP 2021 Ayyoob Imani, Masoud Jalili Sabet, Lütfi Kerem Şenel, Philipp Dufter, François Yvon, Hinrich Schütze

With the advent of end-to-end deep learning approaches in machine translation, interest in word alignments initially decreased; however, they have again become a focus of research more recently.

Link Prediction Machine Translation +3

CaMEL: Case Marker Extraction without Labels

1 code implementation ACL 2022 Leonie Weissweiler, Valentin Hofmann, Masoud Jalili Sabet, Hinrich Schütze

We introduce CaMEL (Case Marker Extraction without Labels), a novel and challenging task in computational morphology that is especially relevant for low-resource languages.

Graph-Based Multilingual Label Propagation for Low-Resource Part-of-Speech Tagging

1 code implementation 18 Oct 2022 Ayyoob Imani, Silvia Severini, Masoud Jalili Sabet, François Yvon, Hinrich Schütze

An established method for training a POS tagger in such a scenario is to create a labeled training set by transferring from high-resource languages.

Part-Of-Speech Tagging POS +3

Improving Word Alignment of Rare Words with Word Embeddings

no code implementations COLING 2016 Masoud Jalili Sabet, Heshaam Faili, Gholamreza Haffari

We address the problem of inducing word alignment for language pairs by developing an unsupervised model with the capability of getting applied to other generative alignment models.

Machine Translation Sentence +2

Subword Sampling for Low Resource Word Alignment

no code implementations 21 Dec 2020 Ehsaneddin Asgari, Masoud Jalili Sabet, Philipp Dufter, Christopher Ringlstetter, Hinrich Schütze

This method's hypothesis is that the aggregation of different granularities of text for certain language pairs can help word-level alignment.

Bayesian Optimization Machine Translation +1

ParCourE: A Parallel Corpus Explorer for a Massively Multilingual Corpus

no code implementations ACL 2021 Ayyoob Imani, Masoud Jalili Sabet, Philipp Dufter, Michael Cysouw, Hinrich Schütze

With more than 7000 languages worldwide, multilingual natural language processing (NLP) is essential both from an academic and commercial perspective.

Multilingual NLP Transfer Learning

Don't Forget Cheap Training Signals Before Building Unsupervised Bilingual Word Embeddings

no code implementations 31 May 2022 Silvia Severini, Viktor Hangya, Masoud Jalili Sabet, Alexander Fraser, Hinrich Schütze

The two approaches we find most effective are: 1) using identical words as seed lexicons (which unsupervised approaches incorrectly assume are not available for orthographically distinct language pairs) and 2) combining such lexicons with pairs extracted by matching romanized versions of words with an edit distance threshold.

Cross-Lingual Transfer Word Embeddings

Don’t Forget Cheap Training Signals Before Building Unsupervised Bilingual Word Embeddings

no code implementations LREC (BUCC) 2022 Silvia Severini, Viktor Hangya, Masoud Jalili Sabet, Alexander Fraser, Hinrich Schütze

The two approaches we find most effective are: 1) using identical words as seed lexicons (which unsupervised approaches incorrectly assume are not available for orthographically distinct language pairs) and 2) combining such lexicons with pairs extracted by matching romanized versions of words with an edit distance threshold.

Cross-Lingual Transfer Word Embeddings

LICD: A Language-Independent Approach for Aspect Category Detection

no code implementations ECIR 2019 Erfan Ghadery, Sajad Movahedi, Masoud Jalili Sabet, Heshaam Faili, Azadeh Shakery

For a given sentence, our proposed method performs ACD based on two hypotheses: First, a category should be assigned to a sentence if there is a high semantic similarity between the sentence and a set of representative words of that category.

Aspect-Based Sentiment Analysis Aspect-Based Sentiment Analysis (ABSA) +6
