no code implementations • LREC 2016 • Ayman Alghamdi, Eric Atwell, Claire Brierley
This paper implements the "collocation of Arabic keywords" approach to extracting formulaic sequences (FSs): high-frequency but semantically regular formulas that are not restricted to any syntactic construction or semantic domain.
no code implementations • LREC 2016 • Latifa Al-Sulaiti, Noorhan Abbas, Claire Brierley, Eric Atwell, Ayman Alghamdi
Inspired by the Oxford Children's Corpus, we have developed a prototype corpus of Arabic texts written and/or selected for children.
no code implementations • LREC 2014 • Claire Brierley, Majdi Sawalha, Eric Atwell
In this paper, we focus on the prosodic effect of qalqalah or "vibration" applied to a subset of Arabic consonants under certain constraints during correct Qur'anic recitation or taǧwīd, using our Boundary-Annotated Qur'an dataset of 77430 words (Brierley et al 2012; Sawalha et al 2014).
no code implementations • LREC 2012 • Majdi Sawalha, Claire Brierley, Eric Atwell
We train and test two probabilistic taggers for Arabic phrase break prediction on a purpose-built, gold standard, boundary-annotated and PoS-tagged Qur'an corpus of 77430 words and 8230 sentences.
no code implementations • LREC 2012 • Claire Brierley, Majdi Sawalha, Eric Atwell
We take a novel approach to phrase break prediction for Arabic, deriving our prosodic annotation scheme from Tajwīd (recitation) mark-up in the Qur'an which we then interpret as additional text-based data for computational analysis.