Search Results for author: Fatih Beyhan

Found 7 papers, 1 paper with code

A Turkish Hate Speech Dataset and Detection System

no code implementations LREC 2022 Fatih Beyhan, Buse Çarık, İnanç Arın, Ayşecan Terzioğlu, Berrin Yanikoglu, Reyyan Yeniterzi

We present a machine learning system for automatic detection of hate speech in Turkish, along with a hate speech dataset consisting of tweets collected in two separate domains.

Binary Classification, Hate Speech Detection

SU-NLP at CASE 2021 Task 1: Protest News Detection for English

no code implementations ACL (CASE) 2021 Furkan Çelik, Tuğberk Dalkılıç, Fatih Beyhan, Reyyan Yeniterzi

This paper summarizes our group’s efforts in the multilingual protest news detection shared task, which is organized as a part of the Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE) Workshop.

Event Extraction

WordNet and Wikipedia Connection in Turkish WordNet KeNet

no code implementations GWLL (LREC) 2022 Merve Doğan, Ceren Oksal, Arife Betül Yenice, Fatih Beyhan, Reyyan Yeniterzi, Olcay Taner Yildiz

This paper presents a connection between WordNet and Wikipedia by linking synsets from the Turkish WordNet KeNet to Wikipedia, thereby providing a better machine-readable dictionary for building NLP models with rich data.

BASE TTS: Lessons from building a billion-parameter Text-to-Speech model on 100K hours of data

no code implementations 12 Feb 2024 Mateusz Łajszczak, Guillermo Cámbara, Yang Li, Fatih Beyhan, Arent van Korlaar, Fan Yang, Arnaud Joly, Álvaro Martín-Cortinas, Ammar Abbas, Adam Michalski, Alexis Moinet, Sri Karlapati, Ewa Muszyńska, Haohan Guo, Bartosz Putrycz, Soledad López Gambino, Kayeon Yoo, Elena Sokolova, Thomas Drugman

Echoing the widely-reported "emergent abilities" of large language models when trained on increasing volume of data, we show that BASE TTS variants built with 10K+ hours and 500M+ parameters begin to demonstrate natural prosody on textually complex sentences.

Decoder, Disentanglement +2

Extended Multilingual Protest News Detection -- Shared Task 1, CASE 2021 and 2022

no code implementations 21 Nov 2022 Ali Hürriyetoğlu, Osman Mutlu, Fırat Duruşan, Onur Uca, Alaeddin Selçuk Gürel, Benjamin Radford, Yaoyao Dai, Hansi Hettiarachchi, Niklas Stoehr, Tadashi Nomoto, Milena Slavcheva, Francielle Vargas, Aaqib Javid, Fatih Beyhan, Erdem Yörük

The CASE 2022 extension consists of expanding the test data with more data in previously available languages, namely, English, Hindi, Portuguese, and Spanish, and adding new test data in Mandarin, Turkish, and Urdu for Sub-task 1, document classification.

Document Classification, Event Detection +3

Event Coreference Resolution for Contentious Politics Events

1 code implementation 18 Mar 2022 Ali Hürriyetoğlu, Osman Mutlu, Fatih Beyhan, Fırat Duruşan, Ali Safaya, Reyyan Yeniterzi, Erdem Yörük

We propose a dataset for event coreference resolution, which is based on random samples drawn from multiple sources, languages, and countries.

coreference-resolution, Event Coreference Resolution
