no code implementations • Findings (EMNLP) 2021 • Lei Shu, Yassine Benajiba, Saab Mansour, Yi Zhang
In this work, we address the open-world classification problem with a method called ODIST, open world classification via distributionally shifted instances.
no code implementations • 9 Mar 2024 • Shamik Roy, Sailik Sengupta, Daniele Bonadiman, Saab Mansour, Arshit Gupta
To study this, we propose the problem of faithful planning in TODs that needs to resolve user intents by following predefined flows and preserving API dependencies.
no code implementations • 7 Mar 2024 • Yuwei Zhang, Siffi Singh, Sailik Sengupta, Igor Shalyminov, Hang Su, Hwanjun Song, Saab Mansour
The triplet task gauges the model's understanding of two semantic concepts paramount in real-world conversational systems: negation and implicature.
1 code implementation • 6 Mar 2024 • Jianfeng He, Hang Su, Jason Cai, Igor Shalyminov, Hwanjun Song, Saab Mansour
Semi-supervised dialogue summarization (SSDS) leverages model-generated summaries to reduce reliance on human-labeled data and improve the performance of summarization models.
Abstractive Text Summarization Natural Language Understanding
no code implementations • 5 Mar 2024 • Hossein Aboutalebi, Hwanjun Song, Yusheng Xie, Arshit Gupta, Justin Sun, Hang Su, Igor Shalyminov, Nikolaos Pappas, Siffi Singh, Saab Mansour
Development of multimodal interactive systems is hindered by the lack of rich, multimodal (text, images) conversational data, which is needed in large quantities for LLMs.
no code implementations • 5 Mar 2024 • Bryan Li, Tamer Alkhouli, Daniele Bonadiman, Nikolaos Pappas, Saab Mansour
xSTREET exposes a gap in base LLM performance between English and non-English reasoning tasks.
1 code implementation • 20 Feb 2024 • Liyan Tang, Igor Shalyminov, Amy Wing-mei Wong, Jon Burnsky, Jake W. Vincent, Yu'an Yang, Siffi Singh, Song Feng, Hwanjun Song, Hang Su, Lijia Sun, Yi Zhang, Saab Mansour, Kathleen McKeown
We find that there are diverse errors and error distributions in model-generated summaries and that non-LLM based metrics can capture all error types better than LLM-based evaluators.
no code implementations • 5 Feb 2024 • James Y. Huang, Sailik Sengupta, Daniele Bonadiman, Yi-An Lai, Arshit Gupta, Nikolaos Pappas, Saab Mansour, Katrin Kirchhoff, Dan Roth
Current work focuses on alignment at model training time, through techniques such as Reinforcement Learning with Human Feedback (RLHF).
no code implementations • 20 Oct 2023 • Hwanjun Song, Igor Shalyminov, Hang Su, Siffi Singh, Kaisheng Yao, Saab Mansour
Our experiments show that DisCal outperforms prior methods in abstractive summarization distillation, producing highly abstractive and informative summaries.
no code implementations • 23 Sep 2023 • Sam Davidson, Salvatore Romeo, Raphael Shu, James Gung, Arshit Gupta, Saab Mansour, Yi Zhang
One of the major impediments to the development of new task-oriented dialogue (TOD) systems is the need for human evaluation at multiple stages and iterations of the development process.
2 code implementations • 4 May 2023 • James Gung, Emily Moeng, Wesley Rose, Arshit Gupta, Yi Zhang, Saab Mansour
Existing task-oriented dialogue datasets, which were collected to benchmark dialogue systems mainly in written human-to-bot settings, are not representative of real customer support conversations and do not provide realistic benchmarks for systems that are applied to natural data.
2 code implementations • 25 Apr 2023 • James Gung, Raphael Shu, Emily Moeng, Wesley Rose, Salvatore Romeo, Yassine Benajiba, Arshit Gupta, Saab Mansour, Yi Zhang
With increasing demand for and adoption of virtual assistants, recent work has investigated ways to accelerate bot schema design through the automatic induction of intents or the induction of slots and dialogue states.
no code implementations • 16 Feb 2023 • Shamik Roy, Raphael Shu, Nikolaos Pappas, Elman Mansimov, Yi Zhang, Saab Mansour, Dan Roth
Conventional text style transfer approaches focus on sentence-level style transfer without considering contextual information, and the style is described with attributes (e.g., formality).
no code implementations • 20 Dec 2022 • Raphael Shu, Elman Mansimov, Tamer Alkhouli, Nikolaos Pappas, Salvatore Romeo, Arshit Gupta, Saab Mansour, Yi Zhang, Dan Roth
The conversational model interacts with the environment by generating and executing programs triggering a set of pre-defined APIs.
1 code implementation • 15 Dec 2022 • Denis Emelin, Daniele Bonadiman, Sawsan Alqahtani, Yi Zhang, Saab Mansour
Pre-trained language models (PLMs) have advanced the state of the art across NLP applications, but lack domain-specific knowledge that does not naturally occur in pre-training data.
1 code implementation • 4 Dec 2022 • Han He, Song Feng, Daniele Bonadiman, Yi Zhang, Saab Mansour
DataFlow has been emerging as a new paradigm for building task-oriented chatbots due to its expressive semantic representations of the dialogue tasks.
no code implementations • 8 Nov 2022 • Soumajyoti Sarkar, Kaixiang Lin, Sailik Sengupta, Leonard Lausen, Sheng Zha, Saab Mansour
While prior research has tried to adapt these multilingual models for dialectal variants of Arabic, doing so remains challenging owing to the lack of sufficient monolingual dialectal data and parallel translation data for such dialectal variants.
1 code implementation • 10 Oct 2022 • Asa Cooper Stickland, Sailik Sengupta, Jason Krone, Saab Mansour, He He
To benchmark the performance of pretrained multilingual language models, we construct noisy datasets covering five languages and four NLP tasks and observe a clear gap in the performance between clean and noisy data in the zero-shot cross-lingual setting.
Data Augmentation Pretrained Multilingual Language Models
1 code implementation • ACL 2022 • Aaron Mueller, Jason Krone, Salvatore Romeo, Saab Mansour, Elman Mansimov, Yi Zhang, Dan Roth
Label semantic aware systems have leveraged this information for improved text classification performance during fine-tuning and prediction.
no code implementations • Findings (EMNLP) 2021 • Sawsan Alqahtani, Garima Lalwani, Yi Zhang, Salvatore Romeo, Saab Mansour
Recent studies have proposed different methods to improve multilingual word representations in contextualized settings including techniques that align between source and target embedding spaces.
1 code implementation • EMNLP 2021 • M Saiful Bari, Batool Haider, Saab Mansour
Even though large pre-trained multilingual models (e.g., mBERT, XLM-R) have led to significant performance gains on a wide range of cross-lingual NLP tasks, success on many downstream tasks still relies on the availability of sufficient annotated data.
no code implementations • ACL (MetaNLP) 2021 • Weijia Xu, Batool Haider, Jason Krone, Saab Mansour
Multilingual pre-trained contextual embedding models (Devlin et al., 2019) have achieved impressive performance on zero-shot cross-lingual transfer tasks.
1 code implementation • NAACL 2021 • Piyawat Lertvittayakumjorn, Daniele Bonadiman, Saab Mansour
Practically, some combinations of slot values can be invalid according to external knowledge.
1 code implementation • EMNLP (NLP4ConvAI) 2021 • Sailik Sengupta, Jason Krone, Saab Mansour
In this work, we investigate how robust IC/SL models are to noisy data.
3 code implementations • EMNLP 2020 • Weijia Xu, Batool Haider, Saab Mansour
We introduce MultiATIS++, a new multilingual NLU corpus that extends the Multilingual ATIS corpus to nine languages across four language families, and evaluate our method using the corpus.
no code implementations • LREC 2012 • Saab Mansour, Hermann Ney
Next, we try a different strategy, where we combine the different segmentation methods rather than the different segmentation schemes.
no code implementations • LREC 2012 • Mahdi Khademian, Kaveh Taghipour, Saab Mansour, Shahram Khadivi
Achieving accurate translation with statistical machine translation systems, especially for multi-domain documents, requires ever more bilingual text, and this need becomes more critical when training such systems for language pairs with scarce training data.