Search Results for author: Mona Diab

Found 119 papers, 14 papers with code

Multitask Learning for Cross-Lingual Transfer of Broad-coverage Semantic Dependencies

no code implementations EMNLP 2020 Maryam Aminian, Mohammad Sadegh Rasooli, Mona Diab

We describe a method for developing broad-coverage semantic dependency parsers for languages for which no semantically annotated resource is available.

Cross-Lingual Transfer

Active Learning for Rumor Identification on Social Media

no code implementations Findings (EMNLP) 2021 Parsa Farinneya, Mohammad Mahdi Abdollah Pour, Sardar Hamidian, Mona Diab

We discuss the impact of multiple classifiers on a limited amount of annotated data followed by an interactive approach to gradually update the models by adding the least certain samples (LCS) from the pool of unlabeled data.

Active Learning Transfer Learning

Do Language Models Have Beliefs? Methods for Detecting, Updating, and Visualizing Model Beliefs

1 code implementation 26 Nov 2021 Peter Hase, Mona Diab, Asli Celikyilmaz, Xian Li, Zornitsa Kozareva, Veselin Stoyanov, Mohit Bansal, Srinivasan Iyer

In this paper, we discuss approaches to detecting when models have beliefs about the world, and we improve on methods for updating model beliefs to be more truthful, with a focus on methods based on learned optimizers or hypernetworks.

AnswerSumm: A Manually-Curated Dataset and Pipeline for Answer Summarization

1 code implementation 11 Nov 2021 Alexander R. Fabbri, Xiaojian Wu, Srini Iyer, Haoran Li, Mona Diab

Our pipeline gathers annotations for all subtasks involved in answer summarization, including the selection of answer sentences relevant to the question, grouping these sentences based on perspectives, summarizing each perspective, and producing an overall summary.

Community Question Answering Data Augmentation

Discrete Cosine Transform as Universal Sentence Encoder

no code implementations ACL 2021 Nada Almarwani, Mona Diab

Modern sentence encoders are used to generate dense vector representations that capture the underlying linguistic characteristics for a sequence of words, including phrases, sentences, or paragraphs.

Question Answering Sentiment Analysis +1

Multi-Perspective Abstractive Answer Summarization

no code implementations 17 Apr 2021 Alexander R. Fabbri, Xiaojian Wu, Srini Iyer, Mona Diab

A major obstacle for multi-perspective, abstractive answer summarization is the absence of a dataset to provide supervision for producing such summaries.

Community Question Answering

Predicting Directionality in Causal Relations in Text

1 code implementation 25 Mar 2021 Pedram Hosseini, David A. Broniatowski, Mona Diab

In this work, we test the performance of two bidirectional transformer-based language models, BERT and SpanBERT, on predicting directionality in causal pairs in the textual content.

White Paper: Challenges and Considerations for the Creation of a Large Labelled Repository of Online Videos with Questionable Content

no code implementations 25 Jan 2021 Thamar Solorio, Mahsa Shafaei, Christos Smailis, Mona Diab, Theodore Giannakopoulos, Heng Ji, Yang Liu, Rada Mihalcea, Smaranda Muresan, Ioannis Kakadiaris

This white paper presents a summary of the discussions regarding critical considerations to develop an extensive repository of online videos annotated with labels indicating questionable content.

Detecting Urgency Status of Crisis Tweets: A Transfer Learning Approach for Low Resource Languages

1 code implementation COLING 2020 Efsun Sarioglu Kayi, Linyong Nan, Bohan Qu, Mona Diab, Kathleen McKeown

We adopt cross-lingual embeddings constructed using different methods to extract features of the tweets, including a few state-of-the-art contextual embeddings such as BERT, RoBERTa and XLM-R. We train classifiers of different architectures on the extracted features.

Transfer Learning

Detecting Hallucinated Content in Conditional Neural Sequence Generation

1 code implementation Findings (ACL) 2021 Chunting Zhou, Graham Neubig, Jiatao Gu, Mona Diab, Paco Guzman, Luke Zettlemoyer, Marjan Ghazvininejad

Neural sequence models can generate highly fluent sentences, but recent studies have shown that they are also prone to hallucinating additional content not supported by the input.

Abstractive Text Summarization Machine Translation

A Multitask Learning Approach for Diacritic Restoration

no code implementations ACL 2020 Sawsan Alqahtani, Ajay Mishra, Mona Diab

Such diacritics are often omitted in written text, increasing the number of possible pronunciations and meanings for a word.

Multi-Task Learning Part-Of-Speech Tagging

Multitask Learning for Cross-Lingual Transfer of Semantic Dependencies

no code implementations 30 Apr 2020 Maryam Aminian, Mohammad Sadegh Rasooli, Mona Diab

We make use of supervised syntactic parsing as an auxiliary task in a multitask learning framework, and show that with different multitask learning settings, we consistently improve over the single-task baseline.

Cross-Lingual Transfer

DeSePtion: Dual Sequence Prediction and Adversarial Examples for Improved Fact-Checking

1 code implementation ACL 2020 Christopher Hidey, Tuhin Chakrabarty, Tariq Alhindi, Siddharth Varia, Kriste Krstovski, Mona Diab, Smaranda Muresan

The increased focus on misinformation has spurred development of data and systems for detecting the veracity of a claim as well as retrieving authoritative evidence.

Fact Checking Misinformation

Learning to Classify Intents and Slot Labels Given a Handful of Examples

no code implementations WS 2020 Jason Krone, Yi Zhang, Mona Diab

Prototypical networks achieve significant gains in IC performance on the ATIS and TOP datasets, while both prototypical networks and MAML outperform the baseline with respect to SF on all three datasets.

Few-Shot Learning Fine-tuning +3

Diversity, Density, and Homogeneity: Quantitative Characteristic Metrics for Text Collections

no code implementations LREC 2020 Yi-An Lai, Xuan Zhu, Yi Zhang, Mona Diab

Summarizing data samples by quantitative measures has a long history, with descriptive statistics being a case in point.

Text Classification

Efficient Convolutional Neural Networks for Diacritic Restoration

no code implementations IJCNLP 2019 Sawsan Alqahtani, Ajay Mishra, Mona Diab

Diacritic restoration has gained importance with the growing need for machines to understand written texts.

Homograph Disambiguation Through Selective Diacritic Restoration

no code implementations WS 2019 Sawsan Alqahtani, Hanan Aldarmaki, Mona Diab

Diacritic restoration could theoretically help disambiguate these words, but in practice, the increase in overall sparsity leads to performance degradation in NLP applications.

Machine Translation Part-Of-Speech Tagging +2

Multi-Domain Goal-Oriented Dialogues (MultiDoGO): Strategies toward Curating and Annotating Large Scale Dialogue Data

no code implementations IJCNLP 2019 Denis Peskov, Nancy Clarke, Jason Krone, Brigi Fodor, Yi Zhang, Adel Youssef, Mona Diab

With a total of over 81K dialogues harvested across six domains, MultiDoGO is over 8 times the size of MultiWOZ, the other largest comparable dialogue dataset currently available to the public.

Identifying Nuances in Fake News vs. Satire: Using Semantic and Linguistic Cues

1 code implementation WS 2019 Or Levi, Pedram Hosseini, Mona Diab, David A. Broniatowski

As avenues for future work, we consider studying additional linguistic features related to the humor aspect, and enriching the data with current news events, to help identify a political or social message.

Language Modelling Misinformation

WASA: A Web Application for Sequence Annotation

no code implementations LREC 2018 Fahad AlGhamdi, Mona Diab

Data annotation is an important and necessary task for all NLP applications.

CASA-NLU: Context-Aware Self-Attentive Natural Language Understanding for Task-Oriented Chatbots

no code implementations IJCNLP 2019 Arshit Gupta, Peng Zhang, Garima Lalwani, Mona Diab

In this work, we propose a context-aware self-attentive NLU (CASA-NLU) model that uses multiple signals, such as previous intents, slots, dialog acts and utterances over a variable context window, in addition to the current user utterance.

Dialogue Management Intent Classification +2

Efficient Sentence Embedding using Discrete Cosine Transform

1 code implementation IJCNLP 2019 Nada Almarwani, Hanan Aldarmaki, Mona Diab

Vector averaging remains one of the most popular sentence embedding methods in spite of its obvious disregard for syntactic structure.

Classification General Classification +1

Named Entity Recognition on Code-Switched Data: Overview of the CALCS 2018 Shared Task

no code implementations WS 2018 Gustavo Aguilar, Fahad AlGhamdi, Victor Soto, Mona Diab, Julia Hirschberg, Thamar Solorio

In the third shared task of the Computational Approaches to Linguistic Code-Switching (CALCS) workshop, we focus on Named Entity Recognition (NER) on code-switched social-media data.

Named Entity Recognition NER

Does Causal Coherence Predict Online Spread of Social Media?

1 code implementation International Conference on Social Computing, Behavioral-Cultural Modeling and Prediction and Behavior Representation in Modeling and Simulation 2019 Pedram Hosseini, Mona Diab, David A. Broniatowski

In this paper, we test the hypothesis that causal and semantic coherence are associated with online sharing of misinformative social media content using Coh-Metrix – a widely-used set of psycholinguistic measures.

Decision Making Misinformation

Leveraging Pretrained Word Embeddings for Part-of-Speech Tagging of Code Switching Data

no code implementations WS 2019 Fahad AlGhamdi, Mona Diab

In this paper, we address the problem of Part-of-Speech tagging (POS) in the context of linguistic code switching (CS).

Part-Of-Speech Tagging POS +1

Scalable Cross-Lingual Transfer of Neural Sentence Embeddings

no code implementations SEMEVAL 2019 Hanan Aldarmaki, Mona Diab

We develop and investigate several cross-lingual alignment approaches for neural sentence embedding models, such as the supervised inference classifier, InferSent, and sequential encoder-decoder models.

Cross-Lingual Transfer Sentence Embedding +1

Cross-Lingual Transfer of Semantic Roles: From Raw Text to Semantic Roles

no code implementations WS 2019 Maryam Aminian, Mohammad Sadegh Rasooli, Mona Diab

We describe a transfer method based on annotation projection to develop a dependency-based semantic role labeling system for languages for which no supervised linguistic information other than parallel data is available.

Cross-Lingual Transfer Semantic Role Labeling

Context-Aware Cross-Lingual Mapping

1 code implementation NAACL 2019 Hanan Aldarmaki, Mona Diab

Cross-lingual word vectors are typically obtained by fitting an orthogonal matrix that maps the entries of a bilingual dictionary from a source to a target vector space.

Document-level Sentence Embeddings +2

The ARIEL-CMU Systems for LoReHLT18

no code implementations 24 Feb 2019 Aditi Chaudhary, Siddharth Dalmia, Junjie Hu, Xinjian Li, Austin Matthews, Aldrian Obaja Muis, Naoki Otani, Shruti Rijhwani, Zaid Sheikh, Nidhi Vyas, Xinyi Wang, Jiateng Xie, Ruochen Xu, Chunting Zhou, Peter J. Jansen, Yiming Yang, Lori Levin, Florian Metze, Teruko Mitamura, David R. Mortensen, Graham Neubig, Eduard Hovy, Alan W. Black, Jaime Carbonell, Graham V. Horwood, Shabnam Tafreshi, Mona Diab, Efsun S. Kayi, Noura Farra, Kathleen McKeown

This paper describes the ARIEL-CMU submissions to the Low Resource Human Language Technologies (LoReHLT) 2018 evaluations for the tasks Machine Translation (MT), Entity Discovery and Linking (EDL), and detection of Situation Frames in Text and Speech (SF Text and Speech).

Machine Translation Translation

Team SWEEPer: Joint Sentence Extraction and Fact Checking with Pointer Networks

no code implementations WS 2018 Christopher Hidey, Mona Diab

We present experiments on the FEVER (Fact Extraction and VERification) task, a shared task that involves selecting sentences from Wikipedia and predicting whether a claim is supported by those sentences, refuted, or there is not enough information.

Fact Checking Information Retrieval +4

Predictive Linguistic Features of Schizophrenia

no code implementations SEMEVAL 2017 Efsun Sarioglu Kayi, Mona Diab, Luca Pauselli, Michael Compton, Glen Coppersmith

As such, we examine the writings of schizophrenia patients analyzing their syntax, semantics and pragmatics.

Evaluation of Unsupervised Compositional Representations

1 code implementation COLING 2018 Hanan Aldarmaki, Mona Diab

We evaluated various compositional models, from bag-of-words representations to compositional RNN-based models, on several extrinsic supervised and unsupervised evaluation benchmarks.

General Classification

Unsupervised Word Mapping Using Structural Similarities in Monolingual Embeddings

no code implementations TACL 2018 Hanan Aldarmaki, Mahesh Mohan, Mona Diab

We show empirically that the performance of bilingual correspondents learned using our proposed unsupervised method is comparable to that of using supervised bilingual correspondents from a seed dictionary.

Word Embeddings

Transferring Semantic Roles Using Translation and Syntactic Information

no code implementations IJCNLP 2017 Maryam Aminian, Mohammad Sadegh Rasooli, Mona Diab

Our paper addresses the problem of annotation projection for semantic role labeling for resource-poor languages using supervised annotations from a resource-rich language through parallel data.

Semantic Role Labeling Translation +1

GW_QA at SemEval-2017 Task 3: Question Answer Re-ranking on Arabic Fora

no code implementations SEMEVAL 2017 Nada Almarwani, Mona Diab

This paper describes our submission to SemEval-2017 Task 3 Subtask D, "Question Answer Ranking in Arabic Community Question Answering".

Answer Selection Community Question Answering +1

Arabic Textual Entailment with Word Embeddings

no code implementations WS 2017 Nada Almarwani, Mona Diab

Determining the textual entailment between texts is important in many NLP tasks, such as summarization, question answering, and information extraction and retrieval.

Machine Translation Natural Language Inference +2

A Layered Language Model based Hybrid Approach to Automatic Full Diacritization of Arabic

no code implementations WS 2017 Mohamed Al-Badrashiny, Abdelati Hawwari, Mona Diab

In this paper we present a system for automatic Arabic text diacritization using three levels of analysis granularity in a layered back off manner.

Arabic Text Diacritization Language Modelling +3

The Power of Language Music: Arabic Lemmatization through Patterns

no code implementations WS 2016 Mohammed Attia, Ayah Zirikly, Mona Diab

The interaction between roots and patterns in Arabic has intrigued lexicographers and morphologists for centuries.

Information Retrieval Lemmatization

The GW/LT3 VarDial 2016 Shared Task System for Dialects and Similar Languages Detection

no code implementations WS 2016 Ayah Zirikly, Bart Desmet, Mona Diab

This paper describes the GW/LT3 contribution to the 2016 VarDial shared task on the identification of similar languages (task 1) and Arabic dialects (task 2).

Feature Engineering

Processing Dialectal Arabic: Exploiting Variability and Similarity to Overcome Challenges and Discover Opportunities

no code implementations WS 2016 Mona Diab

We recently witnessed an exponential growth in dialectal Arabic usage in both textual data and speech recordings especially in social media.

Machine Translation

Automatic Verification and Augmentation of Multilingual Lexicons

no code implementations WS 2016 Maryam Aminian, Mohamed Al-Badrashiny, Mona Diab

We present an approach for automatic verification and augmentation of multilingual lexica.

Guidelines and Framework for a Large Scale Arabic Diacritized Corpus

no code implementations LREC 2016 Wajdi Zaghouani, Houda Bouamor, Abdelati Hawwari, Mona Diab, Ossama Obeid, Mahmoud Ghoneim, Sawsan Alqahtani, Kemal Oflazer

This paper presents the annotation guidelines developed as part of an effort to create a large scale manually diacritized corpus for various Arabic text genres.

SANA: A Large Scale Multi-Genre, Multi-Dialect Lexicon for Arabic Subjectivity and Sentiment Analysis

no code implementations LREC 2014 Muhammad Abdul-Mageed, Mona Diab

The computational treatment of subjectivity and sentiment in natural language is usually significantly improved by applying features exploiting lexical resources where entries are tagged with semantic orientation (e.g., positive, negative values).

Arabic Sentiment Analysis Machine Translation

MADAMIRA: A Fast, Comprehensive Tool for Morphological Analysis and Disambiguation of Arabic

no code implementations LREC 2014 Arfath Pasha, Mohamed Al-Badrashiny, Mona Diab, Ahmed El Kholy, Ramy Eskander, Nizar Habash, Manoj Pooleery, Owen Rambow, Ryan Roth

In this paper, we present MADAMIRA, a system for morphological analysis and disambiguation of Arabic that combines some of the best aspects of two previously commonly used systems for Arabic processing, MADA (Habash and Rambow, 2005; Habash et al., 2009; Habash et al., 2013) and AMIRA (Diab et al., 2007).

Chunking Lemmatization +6

LDC Arabic Treebanks and Associated Corpora: Data Divisions Manual

no code implementations 22 Sep 2013 Mona Diab, Nizar Habash, Owen Rambow, Ryan Roth

The Linguistic Data Consortium (LDC) has developed hundreds of data corpora for natural language processing (NLP) research.

Annotations for Power Relations on Email Threads

no code implementations LREC 2012 Vinodkumar Prabhakaran, Huzaifa Neralwala, Owen Rambow, Mona Diab

In this paper, we describe a multi-layer annotation scheme for social power relations that are recognizable from online written interactions.

AWATIF: A Multi-Genre Corpus for Modern Standard Arabic Subjectivity and Sentiment Analysis

no code implementations LREC 2012 Muhammad Abdul-Mageed, Mona Diab

We present AWATIF, a multi-genre corpus of Modern Standard Arabic (MSA) labeled for subjectivity and sentiment analysis (SSA) at the sentence level.

Opinion Mining Sentiment Analysis

Simplified guidelines for the creation of Large Scale Dialectal Arabic Annotations

no code implementations LREC 2012 Heba Elfardy, Mona Diab

In this paper, we present a simplified set of guidelines for detecting code switching in Arabic on the word/token level.

Speech Recognition
