Search Results for author: Serge Sharoff

Found 37 papers, 3 papers with code

Applying Natural Annotation and Curriculum Learning to Named Entity Recognition for Under-Resourced Languages

1 code implementation COLING 2022 Valeriy Lobov, Alexandra Ivoylova, Serge Sharoff

In this study we test the possibility of (1) using natural annotation to build synthetic training sets from resources not initially designed for the target downstream task and (2) employing curriculum learning methods to select the most suitable examples from synthetic training sets.

Cross-Lingual Transfer Machine Translation +3

BERTology for Machine Translation: What BERT Knows about Linguistic Difficulties for Translation

no code implementations LREC 2022 Yuqian Dai, Marc de Kamps, Serge Sharoff

Pre-trained transformer-based models, such as BERT, have shown excellent performance in most natural language processing benchmark tests, but we still lack a good understanding of the linguistic knowledge of BERT in Neural Machine Translation (NMT).

Machine Translation NMT +1

BERT Goes Off-Topic: Investigating the Domain Transfer Challenge using Genre Classification

1 code implementation 27 Nov 2023 Dmitri Roussinov, Serge Sharoff

While performance of many text classification tasks has been recently improved due to Pre-trained Language Models (PLMs), in this paper we show that they still suffer from a performance gap when the underlying distribution of topics changes.

Genre classification Sentiment Analysis +3

Beyond Images: An Integrative Multi-modal Approach to Chest X-Ray Report Generation

no code implementations 18 Nov 2023 Nurbanu Aksoy, Serge Sharoff, Selcuk Baser, Nishant Ravikumar, Alejandro F Frangi

Image-to-text radiology report generation aims to automatically produce radiology reports that describe the findings in medical images.

Semantic Similarity Semantic Textual Similarity

GATology for Linguistics: What Syntactic Dependencies It Knows

no code implementations 22 May 2023 Yuqian Dai, Serge Sharoff, Marc de Kamps

Moreover, GAT is more competitive in training speed and syntactic dependency prediction than MT-B, which may reveal a better incorporation of modeling explicit syntactic knowledge and the possibility of combining GAT and BERT in the MT tasks.

Graph Attention Machine Translation

Syntactic Knowledge via Graph Attention with BERT in Machine Translation

no code implementations 22 May 2023 Yuqian Dai, Serge Sharoff, Marc de Kamps

Although the Transformer model can effectively acquire context features via a self-attention mechanism, deeper syntactic knowledge is still not effectively modeled.

Graph Attention Machine Translation +2

Estimating Confidence of Predictions of Individual Classifiers and Their Ensembles for the Genre Classification Task

no code implementations 15 Jun 2022 Mikhail Lepekhin, Serge Sharoff

We can evaluate robustness via the confidence gap between the correctly classified texts and the misclassified ones on a labeled test corpus; higher gaps make it easier to be confident that our classifier made the right decision.

Genre classification text-classification +1

Towards Arabic Sentence Simplification via Classification and Generative Approaches

no code implementations 20 Apr 2022 Nouran Khallaf, Serge Sharoff

This paper presents an attempt to build a Modern Standard Arabic (MSA) sentence-level simplification system.

Classification Lexical Simplification +2

Experiments with adversarial attacks on text genres

no code implementations 5 Jul 2021 Mikhail Lepekhin, Serge Sharoff

Neural models based on pre-trained transformers, such as BERT or XLM-RoBERTa, demonstrate SOTA results in many NLP tasks, including non-topical classification, such as genre identification.

Automatic Difficulty Classification of Arabic Sentences

no code implementations EACL (WANLP) 2021 Nouran Khallaf, Serge Sharoff

In this paper, we present a Modern Standard Arabic (MSA) Sentence difficulty classifier, which predicts the difficulty of sentences for language learners using either the CEFR proficiency levels or the binary classification as simple or complex.

Binary Classification Classification +7

Overview of the Fourth BUCC Shared Task: Bilingual Dictionary Induction from Comparable Corpora

no code implementations LREC 2020 Reinhard Rapp, Pierre Zweigenbaum, Serge Sharoff

The shared task of the 13th Workshop on Building and Using Comparable Corpora was devoted to the induction of bilingual dictionaries from comparable rather than parallel corpora.

Recognizing Semantic Relations by Combining Transformers and Fully Connected Models

no code implementations LREC 2020 Dmitri Roussinov, Serge Sharoff, Nadezhda Puchnina

Current approaches to automatically telling whether a relation exists between two given concepts X and Y can be grouped into two types: 1) those modeling word-paths connecting X and Y in text and 2) those modeling distributional properties of X and Y separately, not necessarily in proximity to each other.

Language Modelling Relation

Know thy corpus! Robust methods for digital curation of Web corpora

1 code implementation LREC 2020 Serge Sharoff

This paper proposes a novel framework for digital curation of Web corpora in order to provide robust estimation of their parameters, such as their composition and the lexicon.

Genre classification Topic Models

Towards Functionally Similar Corpus Resources for Translation

no code implementations RANLP 2019 Maria Kunilovskaya, Serge Sharoff

We exploit a text-external approach, based on a set of Functional Text Dimensions to model text functions, so that each text can be represented as a vector in a multidimensional space of text functions.

Translation

Overview of the Second BUCC Shared Task: Spotting Parallel Sentences in Comparable Corpora

no code implementations WS 2017 Pierre Zweigenbaum, Serge Sharoff, Reinhard Rapp

We manually examined a small sample of the false negative sentence pairs for the most precise French-English runs and estimated the number of parallel sentence pairs not yet in the provided gold standard.

Machine Translation Sentence

Toward Pan-Slavic NLP: Some Experiments with Language Adaptation

no code implementations WS 2017 Serge Sharoff

In this talk I will discuss a general approach, which can be called Language Adaptation, similarly to Domain Adaptation.

Domain Adaptation Language Modelling +7

MoBiL: A Hybrid Feature Set for Automatic Human Translation Quality Assessment

no code implementations LREC 2016 Yu Yuan, Serge Sharoff, Bogdan Babych

We compare MoBiL with the QuEst baseline set by using them in classifiers trained with support vector machine and relevance vector machine learning algorithms on the same data set.

feature selection Language Modelling +1
