no code implementations • EACL (WANLP) 2021 • Abdullah Alsaleh, Eric Atwell, Abdulrahman Altahhan
Bidirectional Encoder Representations from Transformers (BERT) has gained popularity in recent years, producing state-of-the-art performance across Natural Language Processing tasks.
no code implementations • ICON 2020 • Taghreed Tarmom, Eric Atwell, Mohammad Alsalka
In this paper we explore the use of Prediction by Partial Matching (PPM), a compression-based approach, to segment Hadith into its two main components (Isnad and Matan).
no code implementations • ICON 2021 • Menwa Alshammeri, Eric Atwell, Mohammad Alsalka
The Quran, as a significant religious text, bears important spiritual and linguistic values.
no code implementations • OSACT (LREC) 2022 • Abdullah Alsaleh, Saud Althabiti, Ibtisam Alshammari, Sarah Alnefaie, Sanaa Alowaidi, Alaa Alsaqer, Eric Atwell, Abdulrahman Altahhan, Mohammad Alsalka
Question answering is a specialized area in the field of NLP that aims to extract the answer to a user question from a given text.
no code implementations • ICON 2020 • Mashael AlAmr, Eric Atwell
This is a pilot study that aims to explore the potential of using WEKA in forensic authorship analysis.
no code implementations • LREC 2022 • Shatha Altammami, Eric Atwell
Transformer-based models showed near-perfect results on several downstream tasks.
no code implementations • 25 Jan 2024 • Saud Althabiti, Mohammad Ammar Alsalka, Eric Atwell
This paper introduces Ta'keed, an explainable Arabic automatic fact-checking system.
no code implementations • LREC 2020 • Shatha Altammami, Eric Atwell, Ammar Alsalka
This article describes the process of gathering and constructing a bilingual parallel corpus of Islamic Hadith, which is the set of narratives reporting different aspects of the Prophet Muhammad's life.
no code implementations • WS 2016 • Areej Alshutayri, Eric Atwell, Abdulrahman Alosaimy, James Dickins, Michael Ingleby, Janet Watson
This paper describes an Arabic dialect identification system which we developed for the Discriminating Similar Languages (DSL) 2016 shared task.
no code implementations • 24 May 2016 • Abdelaziz Lakhfif, Mohammed T. Laskri, Eric Atwell
In this paper, we present an ongoing effort in lexical semantic analysis and annotation of Modern Standard Arabic (MSA) text: a semi-automatic annotation tool concerned with the morphological, syntactic, and semantic levels of description.
no code implementations • LREC 2016 • Latifa Al-Sulaiti, Noorhan Abbas, Claire Brierley, Eric Atwell, Ayman Alghamdi
Inspired by the Oxford Children's Corpus, we have developed a prototype corpus of Arabic texts written and/or selected for children.
no code implementations • LREC 2016 • Ayman Alghamdi, Eric Atwell, Claire Brierley
This paper aims to implement what is referred to as the collocation of the Arabic keywords approach for extracting formulaic sequences (FSs) in the form of high-frequency but semantically regular formulas that are not restricted to any syntactic construction or semantic domain.
no code implementations • LREC 2014 • Claire Brierley, Majdi Sawalha, Eric Atwell
In this paper, we focus on the prosodic effect of qalqalah or "vibration" applied to a subset of Arabic consonants under certain constraints during correct Qur'anic recitation or tajwīd, using our Boundary-Annotated Qur'an dataset of 77430 words (Brierley et al 2012; Sawalha et al 2014).
no code implementations • 18 Feb 2014 • Samuel Danso, Eric Atwell, Owen Johnson
We report on a comparative study of the processes involved in Text Classification applied to classifying Cause of Death: feature value representation; machine learning classification algorithms; and feature reduction strategies in order to identify the suitable approaches applicable to the classification of Verbal Autopsy text.
no code implementations • LREC 2012 • Kais Dukes, Eric Atwell
We provide a description of the underlying software system that has been used to develop the corpus annotations.
no code implementations • LREC 2012 • Majdi Sawalha, Claire Brierley, Eric Atwell
We train and test two probabilistic taggers for Arabic phrase break prediction on a purpose-built, gold standard, boundary-annotated and PoS-tagged Qur'an corpus of 77430 words and 8230 sentences.
no code implementations • LREC 2012 • Abdul-Baquee Sharaf, Eric Atwell
This paper presents a large corpus created from the original Quranic text, where semantically similar or related verses are linked together.
no code implementations • LREC 2012 • Abdul-Baquee Sharaf, Eric Atwell
These antecedents are maintained as an ontological list of concepts, which have proved helpful for information retrieval tasks.
no code implementations • LREC 2012 • Claire Brierley, Majdi Sawalha, Eric Atwell
We take a novel approach to phrase break prediction for Arabic, deriving our prosodic annotation scheme from Tajwīd (recitation) mark-up in the Qur'an, which we then interpret as additional text-based data for computational analysis.