Search Results for author: Saab Mansour

Found 34 papers, 12 papers with code

ODIST: Open World Classification via Distributionally Shifted Instances

no code implementations • Findings (EMNLP) 2021 • Lei Shu, Yassine Benajiba, Saab Mansour, Yi Zhang

In this work, we address the open-world classification problem with a method called ODIST, open world classification via distributionally shifted instances.

Classification Language Modelling

Paper
Add Code

FLAP: Flow-Adhering Planning with Constrained Decoding in LLMs

no code implementations • 9 Mar 2024 • Shamik Roy, Sailik Sengupta, Daniele Bonadiman, Saab Mansour, Arshit Gupta

To study this, we propose the problem of faithful planning in TODs that needs to resolve user intents by following predefined flows and preserving API dependencies.

Paper
Add Code

Can Your Model Tell a Negation from an Implicature? Unravelling Challenges With Intent Encoders

no code implementations • 7 Mar 2024 • Yuwei Zhang, Siffi Singh, Sailik Sengupta, Igor Shalyminov, Hang Su, Hwanjun Song, Saab Mansour

The triplet task gauges the model's understanding of two semantic concepts paramount in real-world conversational systems-- negation and implicature.

Clustering intent-classification +2

Paper
Add Code

Semi-Supervised Dialogue Abstractive Summarization via High-Quality Pseudolabel Selection

1 code implementation • 6 Mar 2024 • Jianfeng He, Hang Su, Jason Cai, Igor Shalyminov, Hwanjun Song, Saab Mansour

Semi-supervised dialogue summarization (SSDS) leverages model-generated summaries to reduce reliance on human-labeled data and improve the performance of summarization models.

Abstractive Text Summarization Natural Language Understanding

Paper
Code

MAGID: An Automated Pipeline for Generating Synthetic Multi-modal Datasets

no code implementations • 5 Mar 2024 • Hossein Aboutalebi, Hwanjun Song, Yusheng Xie, Arshit Gupta, Justin Sun, Hang Su, Igor Shalyminov, Nikolaos Pappas, Siffi Singh, Saab Mansour

Development of multimodal interactive systems is hindered by the lack of rich, multimodal (text, images) conversational data, which is needed in large quantities for LLMs.

Image-text matching Retrieval +1

Paper
Add Code

Eliciting Better Multilingual Structured Reasoning from LLMs through Code

no code implementations • 5 Mar 2024 • Bryan Li, Tamer Alkhouli, Daniele Bonadiman, Nikolaos Pappas, Saab Mansour

xSTREET exposes a gap in base LLM performance between English and non-English reasoning tasks.

Machine Translation

Paper
Add Code

TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization

1 code implementation • 20 Feb 2024 • Liyan Tang, Igor Shalyminov, Amy Wing-mei Wong, Jon Burnsky, Jake W. Vincent, Yu'an Yang, Siffi Singh, Song Feng, Hwanjun Song, Hang Su, Lijia Sun, Yi Zhang, Saab Mansour, Kathleen McKeown

We find that there are diverse errors and error distributions in model-generated summaries and that non-LLM based metrics can capture all error types better than LLM-based evaluators.

Hallucination News Summarization +2

Paper
Code

DeAL: Decoding-time Alignment for Large Language Models

no code implementations • 5 Feb 2024 • James Y. Huang, Sailik Sengupta, Daniele Bonadiman, Yi-An Lai, Arshit Gupta, Nikolaos Pappas, Saab Mansour, Katrin Kirchhoff, Dan Roth

Current work focuses on alignment at model training time, through techniques such as Reinforcement Learning with Human Feedback (RLHF).

Paper
Add Code

Enhancing Abstractiveness of Summarization Models through Calibrated Distillation

no code implementations • 20 Oct 2023 • Hwanjun Song, Igor Shalyminov, Hang Su, Siffi Singh, Kaisheng Yao, Saab Mansour

Our experiments show that DisCal outperforms prior methods in abstractive summarization distillation, producing highly abstractive and informative summaries.

Abstractive Text Summarization Informativeness +1

Paper
Add Code

User Simulation with Large Language Models for Evaluating Task-Oriented Dialogue

no code implementations • 23 Sep 2023 • Sam Davidson, Salvatore Romeo, Raphael Shu, James Gung, Arshit Gupta, Saab Mansour, Yi Zhang

One of the major impediments to the development of new task-oriented dialogue (TOD) systems is the need for human evaluation at multiple stages and iterations of the development process.

In-Context Learning User Simulation

Paper
Add Code

NatCS: Eliciting Natural Customer Support Dialogues

2 code implementations • 4 May 2023 • James Gung, Emily Moeng, Wesley Rose, Arshit Gupta, Yi Zhang, Saab Mansour

Existing task-oriented dialogue datasets, which were collected to benchmark dialogue systems mainly in written human-to-bot settings, are not representative of real customer support conversations and do not provide realistic benchmarks for systems that are applied to natural data.

Dialogue Act Classification

Paper
Code

Intent Induction from Conversations for Task-Oriented Dialogue Track at DSTC 11

2 code implementations • 25 Apr 2023 • James Gung, Raphael Shu, Emily Moeng, Wesley Rose, Salvatore Romeo, Yassine Benajiba, Arshit Gupta, Saab Mansour, Yi Zhang

With increasing demand for and adoption of virtual assistants, recent work has investigated ways to accelerate bot schema design through the automatic induction of intents or the induction of slots and dialogue states.

Paper
Code

Conversation Style Transfer using Few-Shot Learning

no code implementations • 16 Feb 2023 • Shamik Roy, Raphael Shu, Nikolaos Pappas, Elman Mansimov, Yi Zhang, Saab Mansour, Dan Roth

Conventional text style transfer approaches focus on sentence-level style transfer without considering contextual information, and the style is described with attributes (e. g., formality).

Few-Shot Learning In-Context Learning +5

Paper
Add Code

Dialog2API: Task-Oriented Dialogue with API Description and Example Programs

no code implementations • 20 Dec 2022 • Raphael Shu, Elman Mansimov, Tamer Alkhouli, Nikolaos Pappas, Salvatore Romeo, Arshit Gupta, Saab Mansour, Yi Zhang, Dan Roth

The conversational model interacts with the environment by generating and executing programs triggering a set of pre-defined APIs.

In-Context Learning Semantic Parsing +1

Paper
Add Code

Injecting Domain Knowledge in Language Models for Task-Oriented Dialogue Systems

1 code implementation • 15 Dec 2022 • Denis Emelin, Daniele Bonadiman, Sawsan Alqahtani, Yi Zhang, Saab Mansour

Pre-trained language models (PLM) have advanced the state-of-the-art across NLP applications, but lack domain-specific knowledge that does not naturally occur in pre-training data.

Knowledge Probing Response Generation +1

Paper
Code

DFEE: Interactive DataFlow Execution and Evaluation Kit

1 code implementation • 4 Dec 2022 • Han He, Song Feng, Daniele Bonadiman, Yi Zhang, Saab Mansour

DataFlow has been emerging as a new paradigm for building task-oriented chatbots due to its expressive semantic representations of the dialogue tasks.

Benchmarking Scheduling

Paper
Code

Parameter and Data Efficient Continual Pre-training for Robustness to Dialectal Variance in Arabic

no code implementations • 8 Nov 2022 • Soumajyoti Sarkar, Kaixiang Lin, Sailik Sengupta, Leonard Lausen, Sheng Zha, Saab Mansour

While prior research studies have tried to adapt these multilingual models for dialectal variants of Arabic, it still remains a challenging problem owing to the lack of sufficient monolingual dialectal data and parallel translation data of such dialectal variants.

Avg Language Modelling +1

Paper
Add Code

Robustification of Multilingual Language Models to Real-world Noise in Crosslingual Zero-shot Settings with Robust Contrastive Pretraining

1 code implementation • 10 Oct 2022 • Asa Cooper Stickland, Sailik Sengupta, Jason Krone, Saab Mansour, He He

To benchmark the performance of pretrained multilingual language models, we construct noisy datasets covering five languages and four NLP tasks and observe a clear gap in the performance between clean and noisy data in the zero-shot cross-lingual setting.

Data Augmentation Pretrained Multilingual Language Models +1

Paper
Code

Label Semantic Aware Pre-training for Few-shot Text Classification

1 code implementation • ACL 2022 • Aaron Mueller, Jason Krone, Salvatore Romeo, Saab Mansour, Elman Mansimov, Yi Zhang, Dan Roth

Label semantic aware systems have leveraged this information for improved text classification performance during fine-tuning and prediction.

Few-Shot Text Classification Sentence +2

Paper
Code

Using Optimal Transport as Alignment Objective for fine-tuning Multilingual Contextualized Embeddings

no code implementations • Findings (EMNLP) 2021 • Sawsan Alqahtani, Garima Lalwani, Yi Zhang, Salvatore Romeo, Saab Mansour

Recent studies have proposed different methods to improve multilingual word representations in contextualized settings including techniques that align between source and target embedding spaces.

Cross-Lingual Transfer Word Alignment

Paper
Add Code

Nearest Neighbour Few-Shot Learning for Cross-lingual Classification

1 code implementation • EMNLP 2021 • M Saiful Bari, Batool Haider, Saab Mansour

Even though large pre-trained multilingual models (e. g. mBERT, XLM-R) have led to significant performance gains on a wide range of cross-lingual NLP tasks, success on many downstream tasks still relies on the availability of sufficient annotated data.

Classification Few-Shot Learning +1

Paper
Code

Soft Layer Selection with Meta-Learning for Zero-Shot Cross-Lingual Transfer

no code implementations • ACL (MetaNLP) 2021 • Weijia Xu, Batool Haider, Jason Krone, Saab Mansour

Multilingual pre-trained contextual embedding models (Devlin et al., 2019) have achieved impressive performance on zero-shot cross-lingual transfer tasks.

Cross-Lingual Natural Language Inference Meta-Learning +1

Paper
Add Code

Knowledge-Driven Slot Constraints for Goal-Oriented Dialogue Systems

1 code implementation • NAACL 2021 • Piyawat Lertvittayakumjorn, Daniele Bonadiman, Saab Mansour

Practically, some combinations of slot values can be invalid according to external knowledge.

Benchmarking Goal-Oriented Dialogue Systems

Paper
Code

On the Robustness of Intent Classification and Slot Labeling in Goal-oriented Dialog Systems to Real-world Noise

1 code implementation • EMNLP (NLP4ConvAI) 2021 • Sailik Sengupta, Jason Krone, Saab Mansour

In this work, we investigate how robust IC/SL models are to noisy data.

Data Augmentation Goal-Oriented Dialog +3

Paper
Code

End-to-End Slot Alignment and Recognition for Cross-Lingual NLU

3 code implementations • EMNLP 2020 • Weijia Xu, Batool Haider, Saab Mansour

We introduce MultiATIS++, a new multilingual NLU corpus that extends the Multilingual ATIS corpus to nine languages across four language families, and evaluate our method using the corpus.

Cross-Lingual Transfer Goal-Oriented Dialog +6

Paper
Code

Spelling Correction of User Search Queries through Statistical Machine Translation

no code implementations • EMNLP 2015 • Sa{\v{s}}a Hasan, Carmen Heger, Saab Mansour

Machine Translation Spelling Correction +1

Paper
Add Code

Improved Sentence-Level Arabic Dialect Classification

no code implementations • WS 2014 • Christoph Tillmann, Saab Mansour, Yaser Al-Onaizan

Classification Feature Engineering +4

Paper
Add Code

Unsupervised Adaptation for Statistical Machine Translation

no code implementations • WS 2014 • Saab Mansour, Hermann Ney

Domain Adaptation Language Modelling +3

Paper
Add Code

The RWTH Aachen Machine Translation System for WMT 2013

no code implementations • WS 2013 • Stephan Peitz, Saab Mansour, Jan-Thorsten Peter, Christoph Schmidt, Joern Wuebker, Matthias Huck, Markus Freitag, Hermann Ney

Domain Adaptation Language Modelling +2

Paper
Add Code

Joint WMT 2013 Submission of the QUAERO Project

no code implementations • WS 2013 • Stephan Peitz, Saab Mansour, Matthias Huck, Markus Freitag, Hermann Ney, Eunah Cho, Teresa Herrmann, Mohammed Mediani, Jan Niehues, Alex Waibel, Alex Allauzen, er, Quoc Khanh Do, Bianka Buschbeck, W, Tonio macher

Language Modelling Machine Translation +1

Paper
Add Code

Phrase Training Based Adaptation for Statistical Machine Translation

no code implementations • NAACL 2013 • Saab Mansour, Hermann Ney

Domain Adaptation Machine Translation +1

Paper
Add Code

Jane 2: Open Source Phrase-based and Hierarchical Statistical Machine Translation

no code implementations • COLING 2012 • Joern Wuebker, Matthias Huck, Stephan Peitz, Malte Nuhn, Markus Freitag, Jan-Thorsten Peter, Saab Mansour, Hermann Ney

Domain Adaptation Machine Translation +1

Paper
Add Code

Arabic-Segmentation Combination Strategies for Statistical Machine Translation

no code implementations • LREC 2012 • Saab Mansour, Hermann Ney

Next, we try a different strategy, where we combine the different segmentation methods rather than the different segmentation schemes.

Machine Translation Segmentation +1

Paper
Add Code

A Holistic Approach to Bilingual Sentence Fragment Extraction from Comparable Corpora

no code implementations • LREC 2012 • Mahdi Khademian, Kaveh Taghipour, Saab Mansour, Shahram Khadivi

Achieving accurate translation, especially in multiple domain documents with statistical machine translation systems, requires more and more bilingual texts and this need becomes more critical when training such systems for language pairs with scarce training data.

Boundary Detection Machine Translation +2

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.