Search Results for author: Hassan Sajjad

Found 74 papers, 17 papers with code

Implicit representations of event properties within contextual language models: Searching for “causativity neurons”

1 code implementation • IWCS (ACL) 2021 • Esther Seyffarth, Younes Samih, Laura Kallmeyer, Hassan Sajjad

This paper addresses the question to which extent neural contextual language models such as BERT implicitly represent complex semantic properties.

Sentence

Paper
Code

Findings of the WMT 2020 Shared Task on Machine Translation Robustness

no code implementations • WMT (EMNLP) 2020 • Lucia Specia, Zhenhao Li, Juan Pino, Vishrav Chaudhary, Francisco Guzmán, Graham Neubig, Nadir Durrani, Yonatan Belinkov, Philipp Koehn, Hassan Sajjad, Paul Michel, Xian Li

We report the findings of the second edition of the shared task on improving robustness in Machine Translation (MT).

Machine Translation Translation

Paper
Add Code

QCRI’s Machine Translation Systems for IWSLT’16

no code implementations • IWSLT 2016 • Nadir Durrani, Fahim Dalvi, Hassan Sajjad, Stephan Vogel

This paper describes QCRI’s machine translation systems for the IWSLT 2016 evaluation campaign.

Domain Adaptation Language Modelling +3

Paper
Add Code

Quantifying the Capabilities of LLMs across Scale and Precision

no code implementations • 6 May 2024 • Sher Badshah, Hassan Sajjad

Scale is often attributed as one of the factors that cause an increase in the performance of LLMs, resulting in models with billion and trillion parameters.

Hallucination Misinformation +2

Paper
Add Code

VISLA Benchmark: Evaluating Embedding Sensitivity to Semantic and Lexical Alterations

no code implementations • 25 Apr 2024 • Sri Harsha Dumpala, Aman Jaiswal, Chandramouli Sastry, Evangelos Milios, Sageev Oore, Hassan Sajjad

This paper introduces the VISLA (Variance and Invariance to Semantic and Lexical Alterations) benchmark, designed to evaluate the semantic and lexical understanding of language models.

Text Retrieval

Paper
Add Code

Latent Concept-based Explanation of NLP Models

no code implementations • 18 Apr 2024 • Xuemin Yu, Fahim Dalvi, Nadir Durrani, Hassan Sajjad

Therefore, given a word in context, the latent space derived from our training process reflects a specific facet of that word.

Paper
Add Code

Data-centric Prediction Explanation via Kernelized Stein Discrepancy

no code implementations • 22 Mar 2024 • Mahtab Sarvmaili, Hassan Sajjad, Ga Wu

Existing example-based prediction explanation methods often bridge test and training data points through the model's parameters or latent representations.

Paper
Add Code

Immunization against harmful fine-tuning attacks

no code implementations • 26 Feb 2024 • Domenic Rosati, Jan Wehner, Kai Williams, Łukasz Bartoszcze, Jan Batzner, Hassan Sajjad, Frank Rudzicz

Approaches to aligning large language models (LLMs) with human values has focused on correcting misalignment that emerges from pretraining.

Paper
Add Code

Long-form evaluation of model editing

1 code implementation • 14 Feb 2024 • Domenic Rosati, Robie Gonzales, Jinkun Chen, Xuemin Yu, Melis Erkan, Yahya Kayani, Satya Deepika Chavatapalli, Frank Rudzicz, Hassan Sajjad

Evaluations of model editing currently only use the `next few token' completions after a prompt.

Model Editing Text Generation

Paper
Code

Multilingual Nonce Dependency Treebanks: Understanding how LLMs represent and process syntactic structure

no code implementations • 13 Nov 2023 • David Arps, Laura Kallmeyer, Younes Samih, Hassan Sajjad

We replicate the findings of M\"uller-Eberstein et al. (2022) on nonce test data and show that the performance declines on both MLMs and ALMs wrt.

Paper
Add Code

NeuroX Library for Neuron Analysis of Deep NLP Models

1 code implementation • 26 May 2023 • Fahim Dalvi, Hassan Sajjad, Nadir Durrani

The Python toolkit is available at https://www. github. com/fdalvi/NeuroX.

Domain Adaptation

Paper
Code

NxPlain: Web-based Tool for Discovery of Latent Concepts

no code implementations • 6 Mar 2023 • Fahim Dalvi, Nadir Durrani, Hassan Sajjad, Tamim Jaban, Musab Husaini, Ummar Abbas

NxPlain discovers latent concepts learned in a deep NLP model, provides an interpretation of the knowledge learned in the model, and explains its predictions based on the used concepts.

Fairness Sentence

Paper
Add Code

ConceptX: A Framework for Latent Concept Analysis

no code implementations • 12 Nov 2022 • Firoj Alam, Fahim Dalvi, Nadir Durrani, Hassan Sajjad, Abdul Rafae Khan, Jia Xu

We use an unsupervised method to discover concepts learned in these models and enable a graphical interface for humans to generate explanations for the concepts.

Paper
Add Code

Impact of Adversarial Training on Robustness and Generalizability of Language Models

no code implementations • 10 Nov 2022 • Enes Altinisik, Hassan Sajjad, Husrev Taha Sencar, Safa Messaoud, Sanjay Chawla

Specifically, we study the effect of pre-training data augmentation as well as training time input perturbations vs. embedding space perturbations on the robustness and generalization of transformer-based language models.

Data Augmentation

Paper
Add Code

On the Transformation of Latent Space in Fine-Tuned NLP Models

no code implementations • 23 Oct 2022 • Nadir Durrani, Hassan Sajjad, Fahim Dalvi, Firoj Alam

We study the evolution of latent space in fine-tuned NLP models.

Paper
Add Code

Post-hoc analysis of Arabic transformer models

no code implementations • 18 Oct 2022 • Ahmed Abdelali, Nadir Durrani, Fahim Dalvi, Hassan Sajjad

Given the success of pre-trained language models, many transformer models trained on Arabic and its dialects have surfaced.

Morphological Tagging

Paper
Add Code

Discovering Salient Neurons in Deep NLP Models

no code implementations • 27 Jun 2022 • Nadir Durrani, Fahim Dalvi, Hassan Sajjad

Our data-driven, quantitative analysis illuminates interesting findings: (i) we found small subsets of neurons that can predict different linguistic tasks, ii) with neurons capturing basic lexical information (such as suffixation) localized in lower most layers, iii) while those learning complex concepts (such as syntactic role) predominantly in middle and higher layers, iii) that salient linguistic neurons are relocated from higher to lower layers during transfer learning, as the network preserve the higher layers for task specific information, iv) we found interesting differences across pre-trained models, with respect to how linguistic information is preserved within, and v) we found that concept exhibit similar neuron distribution across different languages in the multilingual transformer models.

Transfer Learning

Paper
Add Code

Analyzing Encoded Concepts in Transformer Language Models

1 code implementation • NAACL 2022 • Hassan Sajjad, Nadir Durrani, Fahim Dalvi, Firoj Alam, Abdul Rafae Khan, Jia Xu

We propose a novel framework ConceptX, to analyze how latent concepts are encoded in representations learned within pre-trained language models.

Clustering

Paper
Code

Discovering Latent Concepts Learned in BERT

no code implementations • ICLR 2022 • Fahim Dalvi, Abdul Rafae Khan, Firoj Alam, Nadir Durrani, Jia Xu, Hassan Sajjad

We address this limitation by discovering and analyzing latent concepts learned in neural network models in an unsupervised fashion and provide interpretations from the model's perspective.

Novel Concepts POS

Paper
Add Code

Probing for Constituency Structure in Neural Language Models

1 code implementation • 13 Apr 2022 • David Arps, Younes Samih, Laura Kallmeyer, Hassan Sajjad

We find that 4 pretrained transfomer LMs obtain high performance on our probing tasks even on manipulated data, suggesting that semantic and syntactic knowledge in their representations can be separated and that constituency information is in fact learned by the LM.

Paper
Code

Interpreting Arabic Transformer Models

no code implementations • 19 Jan 2022 • Ahmed Abdelali, Nadir Durrani, Fahim Dalvi, Hassan Sajjad

Arabic is a Semitic language which is widely spoken with many dialects.

Morphological Tagging POS +1

Paper
Add Code

Neuron-level Interpretation of Deep NLP Models: A Survey

no code implementations • 30 Aug 2021 • Hassan Sajjad, Nadir Durrani, Fahim Dalvi

The proliferation of deep neural networks in various domains has seen an increased need for interpretability of these models.

Domain Adaptation

Paper
Add Code

How transfer learning impacts linguistic knowledge in deep NLP models?

no code implementations • Findings (ACL) 2021 • Nadir Durrani, Hassan Sajjad, Fahim Dalvi

The pattern varies across architectures, with BERT retaining linguistic information relatively deeper in the network compared to RoBERTa and XLNet, where it is predominantly delegated to the lower layers.

Transfer Learning

Paper
Add Code

Fine-grained Interpretation and Causation Analysis in Deep NLP Models

no code implementations • NAACL 2021 • Hassan Sajjad, Narine Kokhlikyan, Fahim Dalvi, Nadir Durrani

This paper is a write-up for the tutorial on "Fine-grained Interpretation and Causation Analysis in Deep NLP Models" that we are presenting at NAACL 2021.

Domain Adaptation

Paper
Add Code

Effect of Post-processing on Contextualized Word Representations

no code implementations • COLING 2022 • Hassan Sajjad, Firoj Alam, Fahim Dalvi, Nadir Durrani

However, post-processing for contextualized embeddings is an under-studied problem.

Word Similarity

Paper
Add Code

Are We Ready for this Disaster? Towards Location Mention Recognition from Crisis Tweets

no code implementations • COLING 2020 • Reem Suwaileh, Muhammad Imran, Tamer Elsayed, Hassan Sajjad

For example, results show that, for training a location mention recognition model, Twitter-based data is preferred over general-purpose data; and crisis-related data is preferred over general-purpose Twitter data.

Management

Paper
Add Code

AraBench: Benchmarking Dialectal Arabic-English Machine Translation

no code implementations • COLING 2020 • Hassan Sajjad, Ahmed Abdelali, Nadir Durrani, Fahim Dalvi

The evaluation suite and the dialectal system are publicly available for research purposes.

Benchmarking Data Augmentation +2

Paper
Add Code

Analyzing Individual Neurons in Pre-trained Language Models

1 code implementation • EMNLP 2020 • Nadir Durrani, Hassan Sajjad, Fahim Dalvi, Yonatan Belinkov

We found small subsets of neurons to predict linguistic tasks, with lower level tasks (such as morphology) localized in fewer neurons, compared to higher level task of predicting syntax.

136

Paper
Code

Fighting the COVID-19 Infodemic in Social Media: A Holistic Perspective and a Call to Arms

1 code implementation • 15 Jul 2020 • Firoj Alam, Fahim Dalvi, Shaden Shaar, Nadir Durrani, Hamdy Mubarak, Alex Nikolov, Giovanni Da San Martino, Ahmed Abdelali, Hassan Sajjad, Kareem Darwish, Preslav Nakov

With the outbreak of the COVID-19 pandemic, people turned to social media to read and to share timely information including statistics, warnings, advice, and inspirational stories.

Misinformation

Paper
Code

Similarity Analysis of Contextual Word Representation Models

1 code implementation • ACL 2020 • John M. Wu, Yonatan Belinkov, Hassan Sajjad, Nadir Durrani, Fahim Dalvi, James Glass

We use existing and novel similarity measures that aim to gauge the level of localization of information in the deep models, and facilitate the investigation of which design factors affect model similarity, without requiring any external linguistic annotation.

Paper
Code

Fighting the COVID-19 Infodemic: Modeling the Perspective of Journalists, Fact-Checkers, Social Media Platforms, Policy Makers, and the Society

2 code implementations • Findings (EMNLP) 2021 • Firoj Alam, Shaden Shaar, Fahim Dalvi, Hassan Sajjad, Alex Nikolov, Hamdy Mubarak, Giovanni Da San Martino, Ahmed Abdelali, Nadir Durrani, Kareem Darwish, Abdulaziz Al-Homaid, Wajdi Zaghouani, Tommaso Caselli, Gijs Danoe, Friso Stolk, Britt Bruntink, Preslav Nakov

With the emergence of the COVID-19 pandemic, the political and the medical aspects of disinformation merged as the problem got elevated to a whole new level to become the first global infodemic.

16k

Paper
Code

CrisisBench: Benchmarking Crisis-related Social Media Datasets for Humanitarian Information Processing

no code implementations • 14 Apr 2020 • Firoj Alam, Hassan Sajjad, Muhammad Imran, Ferda Ofli

Time-critical analysis of social media streams is important for humanitarian organizations for planing rapid response during disasters.

Benchmarking General Classification +2

Paper
Add Code

Analyzing Redundancy in Pretrained Transformer Models

1 code implementation • EMNLP 2020 • Fahim Dalvi, Hassan Sajjad, Nadir Durrani, Yonatan Belinkov

Transformer-based deep NLP models are trained using hundreds of millions of parameters, limiting their applicability in computationally constrained environments.

Transfer Learning

Paper
Code

On the Effect of Dropping Layers of Pre-trained Transformer Models

4 code implementations • 8 Apr 2020 • Hassan Sajjad, Fahim Dalvi, Nadir Durrani, Preslav Nakov

Transformer-based NLP models are trained using hundreds of millions or even billions of parameters, limiting their applicability in computationally constrained environments.

Knowledge Distillation Sentence +1

Paper
Code

A Clustering Framework for Lexical Normalization of Roman Urdu

1 code implementation • 31 Mar 2020 • Abdul Rafae Khan, Asim Karim, Hassan Sajjad, Faisal Kamiran, Jia Xu

Roman Urdu is an informal form of the Urdu language written in Roman script, which is widely used in South Asia for online textual content.

Clustering Lexical Normalization

Paper
Code

Compressing Large-Scale Transformer-Based Models: A Case Study on BERT

no code implementations • 27 Feb 2020 • Prakhar Ganesh, Yao Chen, Xin Lou, Mohammad Ali Khan, Yin Yang, Hassan Sajjad, Preslav Nakov, Deming Chen, Marianne Winslett

Pre-trained Transformer-based models have achieved state-of-the-art performance for various Natural Language Processing (NLP) tasks.

Model Compression

Paper
Add Code

On the Linguistic Representational Power of Neural Machine Translation Models

no code implementations • CL 2020 • Yonatan Belinkov, Nadir Durrani, Fahim Dalvi, Hassan Sajjad, James Glass

(iii) Do the representations capture lexical semantics?

Machine Translation NMT +1

Paper
Add Code

A System for Diacritizing Four Varieties of Arabic

no code implementations • IJCNLP 2019 • Hamdy Mubarak, Ahmed Abdelali, Kareem Darwish, Mohamed Eldesouki, Younes Samih, Hassan Sajjad

Short vowels, aka diacritics, are more often omitted when writing different varieties of Arabic including Modern Standard Arabic (MSA), Classical Arabic (CA), and Dialectal Arabic (DA).

Feature Engineering

Paper
Add Code

Findings of the First Shared Task on Machine Translation Robustness

1 code implementation • WS 2019 • Xi-An Li, Paul Michel, Antonios Anastasopoulos, Yonatan Belinkov, Nadir Durrani, Orhan Firat, Philipp Koehn, Graham Neubig, Juan Pino, Hassan Sajjad

We share the findings of the first shared task on improving robustness of Machine Translation (MT).

Machine Translation Translation

463

Paper
Code

Highly Effective Arabic Diacritization using Sequence to Sequence Modeling

no code implementations • NAACL 2019 • Hamdy Mubarak, Ahmed Abdelali, Hassan Sajjad, Younes Samih, Kareem Darwish

Arabic text is typically written without short vowels (or diacritics).

Feature Engineering Machine Translation +1

Paper
Add Code

One Size Does Not Fit All: Comparing NMT Representations of Different Granularities

no code implementations • NAACL 2019 • Nadir Durrani, Fahim Dalvi, Hassan Sajjad, Yonatan Belinkov, Preslav Nakov

Recent work has shown that contextualized word representations derived from neural machine translation are a viable alternative to such from simple word predictions tasks.

Machine Translation NMT +1

Paper
Add Code

NeuroX: A Toolkit for Analyzing Individual Neurons in Neural Networks

2 code implementations • 21 Dec 2018 • Fahim Dalvi, Avery Nortonsmith, D. Anthony Bau, Yonatan Belinkov, Hassan Sajjad, Nadir Durrani, James Glass

We present a toolkit to facilitate the interpretation and understanding of neural network models.

Paper
Code

What Is One Grain of Sand in the Desert? Analyzing Individual Neurons in Deep NLP Models

1 code implementation • 21 Dec 2018 • Fahim Dalvi, Nadir Durrani, Hassan Sajjad, Yonatan Belinkov, Anthony Bau, James Glass

We further present a comprehensive analysis of neurons with the aim to address the following questions: i) how localized or distributed are different linguistic properties in the models?

Language Modelling Machine Translation +1

Paper
Code

Identifying and Controlling Important Neurons in Neural Machine Translation

no code implementations • ICLR 2019 • Anthony Bau, Yonatan Belinkov, Hassan Sajjad, Nadir Durrani, Fahim Dalvi, James Glass

Neural machine translation (NMT) models learn representations containing substantial linguistic information.

Machine Translation NMT +1

Paper
Add Code

Incremental Decoding and Training Methods for Simultaneous Translation in Neural Machine Translation

no code implementations • NAACL 2018 • Fahim Dalvi, Nadir Durrani, Hassan Sajjad, Stephan Vogel

We address the problem of simultaneous translation by modifying the Neural MT decoder to operate with dynamically built encoder and attention.

Decoder Machine Translation +1

Paper
Add Code

Evaluating Layers of Representation in Neural Machine Translation on Part-of-Speech and Semantic Tagging Tasks

1 code implementation • IJCNLP 2017 • Yonatan Belinkov, Lluís Màrquez, Hassan Sajjad, Nadir Durrani, Fahim Dalvi, James Glass

In this paper, we investigate the representations learned at different layers of NMT encoders.

Machine Translation NMT +3

Paper
Code

Understanding and Improving Morphological Learning in the Neural Machine Translation Decoder

no code implementations • IJCNLP 2017 • Fahim Dalvi, Nadir Durrani, Hassan Sajjad, Yonatan Belinkov, Stephan Vogel

End-to-end training makes the neural machine translation (NMT) architecture simpler, yet elegant compared to traditional statistical machine translation (SMT).

Decoder Machine Translation +3

Paper
Add Code

Challenging Language-Dependent Segmentation for Arabic: An Application to Machine Translation and Part-of-Speech Tagging

no code implementations • ACL 2017 • Hassan Sajjad, Fahim Dalvi, Nadir Durrani, Ahmed Abdelali, Yonatan Belinkov, Stephan Vogel

Word segmentation plays a pivotal role in improving any Arabic NLP application.

Machine Translation Part-Of-Speech Tagging +5

Paper
Add Code

Neural Machine Translation Training in a Multi-Domain Scenario

no code implementations • IWSLT 2017 • Hassan Sajjad, Nadir Durrani, Fahim Dalvi, Yonatan Belinkov, Stephan Vogel

Model stacking works best when training begins with the furthest out-of-domain data and the model is incrementally fine-tuned with the next furthest domain and so on.

Machine Translation Translation

Paper
Add Code

Statistical Models for Unsupervised, Semi-Supervised Supervised Transliteration Mining

no code implementations • CL 2017 • Hassan Sajjad, Helmut Schmid, Alex Fraser, er, Hinrich Sch{\"u}tze

After training, the unlabeled data is disambiguated based on the posterior probabilities of the two sub-models.

Transliteration

Paper
Add Code

What do Neural Machine Translation Models Learn about Morphology?

1 code implementation • ACL 2017 • Yonatan Belinkov, Nadir Durrani, Fahim Dalvi, Hassan Sajjad, James Glass

Neural machine translation (MT) models obtain state-of-the-art performance while maintaining a simple, end-to-end architecture.

Decoder Machine Translation +2

Paper
Code

QCRI Live Speech Translation System

no code implementations • EACL 2017 • Fahim Dalvi, Yifan Zhang, Sameer Khurana, Nadir Durrani, Hassan Sajjad, Ahmed Abdelali, Hamdy Mubarak, Ahmed Ali, Stephan Vogel

This paper presents QCRI{'}s Arabic-to-English live speech translation system.

Machine Translation Speech Recognition +1

Paper
Add Code

The SUMMA Platform Prototype

no code implementations • EACL 2017 • Renars Liepins, Ulrich Germann, Guntis Barzdins, Alex Birch, ra, Steve Renals, Susanne Weber, Peggy van der Kreeft, Herv{\'e} Bourlard, Jo{\~a}o Prieto, Ond{\v{r}}ej Klejch, Peter Bell, Alex Lazaridis, ros, Alfonso Mendes, Sebastian Riedel, Mariana S. C. Almeida, Pedro Balage, Shay B. Cohen, Tomasz Dwojak, Philip N. Garner, Andreas Giefer, Marcin Junczys-Dowmunt, Hina Imran, David Nogueira, Ahmed Ali, Mir, Sebasti{\~a}o a, Andrei Popescu-Belis, Lesly Miculicich Werlen, Nikos Papasarantopoulos, Abiola Obamuyide, Clive Jones, Fahim Dalvi, Andreas Vlachos, Yang Wang, Sibo Tong, Rico Sennrich, Nikolaos Pappas, Shashi Narayan, Marco Damonte, Nadir Durrani, Sameer Khurana, Ahmed Abdelali, Hassan Sajjad, Stephan Vogel, David Sheppey, Chris Hernon, Jeff Mitchell

We present the first prototype of the SUMMA Platform: an integrated platform for multilingual media monitoring.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +5

Paper
Add Code

QCRI Machine Translation Systems for IWSLT 16

no code implementations • 14 Jan 2017 • Nadir Durrani, Fahim Dalvi, Hassan Sajjad, Stephan Vogel

This paper describes QCRI's machine translation systems for the IWSLT 2016 evaluation campaign.

Domain Adaptation Language Modelling +3

Paper
Add Code

A Deep Fusion Model for Domain Adaptation in Phrase-based MT

no code implementations • COLING 2016 • Nadir Durrani, Hassan Sajjad, Shafiq Joty, Ahmed Abdelali

We present a novel fusion model for domain adaptation in Statistical Machine Translation.

Domain Adaptation Machine Translation +2

Paper
Add Code

QCRI @ DSL 2016: Spoken Arabic Dialect Identification Using Textual Features

no code implementations • WS 2016 • Mohamed Eldesouki, Fahim Dalvi, Hassan Sajjad, Kareem Darwish

We submitted four runs to the Arabic sub-task.

Dialect Identification Machine Translation +1

Paper
Add Code

Applications of Online Deep Learning for Crisis Response Using Social Media Information

no code implementations • 4 Oct 2016 • Dat Tien Nguyen, Shafiq Joty, Muhammad Imran, Hassan Sajjad, Prasenjit Mitra

During natural or man-made disasters, humanitarian response organizations look for useful information to support their decision-making processes.

Decision Making Disaster Response +3

Paper
Add Code

Rapid Classification of Crisis-Related Data on Social Networks using Convolutional Neural Networks

no code implementations • 12 Aug 2016 • Dat Tien Nguyen, Kamela Ali Al Mannai, Shafiq Joty, Hassan Sajjad, Muhammad Imran, Prasenjit Mitra

The current state-of-the-art classification methods require a significant amount of labeled data specific to a particular event for training plus a lot of feature engineering to achieve best results.

BIG-bench Machine Learning Classification +2

Paper
Add Code

Egyptian Arabic to English Statistical Machine Translation System for NIST OpenMT'2015

no code implementations • 18 Jun 2016 • Hassan Sajjad, Nadir Durrani, Francisco Guzman, Preslav Nakov, Ahmed Abdelali, Stephan Vogel, Wael Salloum, Ahmed El Kholy, Nizar Habash

The competition focused on informal dialectal Arabic, as used in SMS, chat, and speech.

Language Modelling Translation +1

Paper
Add Code

Eyes Don't Lie: Predicting Machine Translation Quality Using Eye Movement

no code implementations • NAACL 2016 • Hassan Sajjad, Francisco Guzm{\'a}n, Nadir Durrani, Ahmed Abdelali, Houda Bouamor, Irina Temnikova, Stephan Vogel

Machine Translation Translation

Paper
Add Code

How to Avoid Unwanted Pregnancies: Domain Adaptation using Neural Network Models

no code implementations • EMNLP 2015 • Shafiq Joty, Hassan Sajjad, Nadir Durrani, Kamla Al-Mannai, Ahmed Abdelali, Stephan Vogel

Domain Adaptation Language Modelling +3

Paper
Add Code

How do Humans Evaluate Machine Translation

no code implementations • WS 2015 • Francisco Guzm{\'a}n, Ahmed Abdelali, Irina Temnikova, Hassan Sajjad, Stephan Vogel

Machine Translation Translation

Paper
Add Code

An Unsupervised Method for Discovering Lexical Variations in Roman Urdu Informal Text

no code implementations • EMNLP 2015 • Abdul Rafae, Abdul Qayyum, Muhammad Moeenuddin, Asim Karim, Hassan Sajjad, Faisal Kamiran

Machine Translation Part-Of-Speech Tagging

Paper
Add Code

QCMUQ@QALB-2015 Shared Task: Combining Character level MT and Error-tolerant Finite-State Recognition for Arabic Spelling Correction

no code implementations • WS 2015 • Houda Bouamor, Hassan Sajjad, Nadir Durrani, Kemal Oflazer

Language Modelling Machine Translation +1

Paper
Add Code

Verifiably Effective Arabic Dialect Identification

no code implementations • EMNLP 2014 • Kareem Darwish, Hassan Sajjad, Hamdy Mubarak

Dialect Identification

Paper
Add Code

Unsupervised Word Segmentation Improves Dialectal Arabic to English Machine Translation

no code implementations • WS 2014 • Kamla Al-Mannai, Hassan Sajjad, Alaa Khader, Fahad Al Obaidli, Preslav Nakov, Stephan Vogel

Machine Translation Translation

Paper
Add Code

The AMARA Corpus: Building Parallel Language Resources for the Educational Domain

no code implementations • LREC 2014 • Ahmed Abdelali, Francisco Guzman, Hassan Sajjad, Stephan Vogel

This paper presents the AMARA corpus of on-line educational content: a new parallel corpus of educational video subtitles, multilingually aligned for 20 languages, i. e. 20 monolingual corpora and 190 parallel corpora.

Machine Translation Translation