1 code implementation • 2 May 2024 • Samee Arif, Sualeha Farid, Awais Athar, Agha Ali Raza
This paper introduces UQA, a novel dataset for question answering and text comprehension in Urdu, a low-resource language with over 70 million native speakers.
no code implementations • LREC 2020 • Namoos Hayat Qasmi, Haris Bin Zia, Awais Athar, Agha Ali Raza
Because Urdu is a low-resource language in terms of standard linguistic resources, recent text simplification approaches that rely on manually crafted simplified corpora or lexicons such as WordNet are not applicable to it.
1 code implementation • COLING 2018 • Haris Bin Zia, Agha Ali Raza, Awais Athar
State-of-the-art Natural Language Processing algorithms rely heavily on efficient word segmentation.
no code implementations • LREC 2018 • Haris Bin Zia, Agha Ali Raza, Awais Athar
The tool predicts the pronunciation of words using an LSTM-based model trained on a handcrafted expert lexicon of around 39,000 words and achieves an accuracy of 64% on internal evaluation.