Search Results for author: Özlem Çetinoğlu

Found 11 papers, 3 papers with code

A Language-aware Approach to Code-switched Morphological Tagging

no code implementations NAACL (CALCS) 2021 Şaziye Betül Özateş, Özlem Çetinoğlu

Morphological tagging of code-switching (CS) data becomes especially challenging when the language pairs composing the CS data have different morphological representations.

Morphological Tagging

Anonymising the SAGT Speech Corpus and Treebank

no code implementations LREC 2022 Özlem Çetinoğlu, Antje Schweitzer

In this paper, we describe the anonymisation process of a Turkish-German code-switching corpus, namely SAGT, which consists of speech data and a treebank that is built on its transcripts.

Assessing Gender Bias in Wikipedia: Inequalities in Article Titles

1 code implementation ACL (GeBNLP) 2021 Agnieszka Falenska, Özlem Çetinoğlu

Potential gender biases existing in Wikipedia’s content can contribute to biased behaviors in a variety of downstream NLP systems.

Improving Code-Switching Dependency Parsing with Semi-Supervised Auxiliary Tasks

1 code implementation Findings (NAACL) 2022 Şaziye Özateş, Arzucan Özgür, Tunga Gungor, Özlem Çetinoğlu

Code-switching dependency parsing stands as a challenging task due to both the scarcity of necessary resources and the structural difficulties embedded in code-switched languages.

Dependency Parsing XLM-R

Resources for Turkish Natural Language Processing: A critical survey

no code implementations 11 Apr 2022 Çağrı Çöltekin, A. Seza Doğruöz, Özlem Çetinoğlu

This paper presents a comprehensive survey of corpora and lexical resources available for Turkish.


Treebanking User-Generated Content: a UD Based Overview of Guidelines, Corpora and Unified Recommendations

no code implementations 3 Nov 2020 Manuela Sanguinetti, Lauren Cassidy, Cristina Bosco, Özlem Çetinoğlu, Alessandra Teresa Cignarella, Teresa Lynn, Ines Rehbein, Josef Ruppenhofer, Djamé Seddah, Amir Zeldes

This article presents a discussion on the main linguistic phenomena which cause difficulties in the analysis of user-generated texts found on the web and in social media, and proposes a set of annotation guidelines for their treatment within the Universal Dependencies (UD) framework of syntactic analysis.

Lexical Normalization for Code-switched Data and its Effect on POS-tagging

no code implementations 1 Jun 2020 Rob van der Goot, Özlem Çetinoğlu

Lexical normalization, the translation of non-canonical data to standard language, has been shown to improve the performance of many natural language processing tasks on social media.

Language Identification Lexical Normalization +3

Subword-Level Language Identification for Intra-Word Code-Switching

no code implementations NAACL 2019 Manuel Mager, Özlem Çetinoğlu, Katharina Kann

Language identification for code-switching (CS), the phenomenon of alternating between two or more languages in conversations, has traditionally been approached under the assumption of a single language per token.

Language Identification

Challenges of Computational Processing of Code-Switching

no code implementations WS 2016 Özlem Çetinoğlu, Sarah Schulz, Ngoc Thang Vu

This paper addresses challenges of Natural Language Processing (NLP) on non-canonical multilingual data in which two or more languages are mixed.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +7
