Search Results for author: Wael Hamza

Found 30 papers, 6 papers with code

Limitations of Knowledge Distillation for Zero-shot Transfer Learning

no code implementations • EMNLP (sustainlp) 2021 • Saleh Soltan, Haidar Khan, Wael Hamza

We demonstrate that in contradiction to the previous observation in the case of monolingual distillation, in multilingual settings, distillation during pretraining is more effective than distillation during fine-tuning for zero-shot transfer learning.

Knowledge Distillation Transfer Learning +1

Paper
Add Code

Controlled Data Generation via Insertion Operations for NLU

no code implementations • NAACL (ACL) 2022 • Manoj Kumar, Yuval Merhav, Haidar Khan, Rahul Gupta, Anna Rumshisky, Wael Hamza

Use of synthetic data is rapidly emerging as a realistic alternative to manually annotating live traffic for industry-scale model building.

intent-classification Intent Classification +4

Paper
Add Code

Attention Fusion: a light yet efficient late fusion mechanism for task adaptation in NLU

no code implementations • Findings (NAACL) 2022 • Jin Cao, Chandana Satya Prakash, Wael Hamza

However, given the trend of larger pre-trained models, fine-tuning these models for each downstream task is parameter-inefficient and computationally-expensive deeming this approach sub-optimal for adoption by NLU systems.

Language Modelling

Paper
Add Code

Towards ASR Robust Spoken Language Understanding Through In-Context Learning With Word Confusion Networks

no code implementations • 5 Jan 2024 • Kevin Everson, Yile Gu, Huck Yang, Prashanth Gurunath Shivakumar, Guan-Ting Lin, Jari Kolehmainen, Ivan Bulyko, Ankur Gandhe, Shalini Ghosh, Wael Hamza, Hung-Yi Lee, Ariya Rastrow, Andreas Stolcke

In the realm of spoken language understanding (SLU), numerous natural language understanding (NLU) methodologies have been adapted by supplying large language models (LLMs) with transcribed speech instead of conventional written text.

In-Context Learning intent-classification +6

Paper
Add Code

Recipes for Sequential Pre-training of Multilingual Encoder and Seq2Seq Models

no code implementations • 14 Jun 2023 • Saleh Soltan, Andy Rosenbaum, Tobias Falke, Qin Lu, Anna Rumshisky, Wael Hamza

(2) Conversely, using an encoder to warm-start seq2seq training, we show that by unfreezing the encoder partway through training, we can match task performance of a from-scratch seq2seq model.

Language Modelling Masked Language Modeling

Paper
Add Code

Scalable and Accurate Self-supervised Multimodal Representation Learning without Aligned Video and Text Data

no code implementations • 4 Apr 2023 • Vladislav Lialin, Stephen Rawls, David Chan, Shalini Ghosh, Anna Rumshisky, Wael Hamza

Currently popular video-text data mining approach via automatic speech recognition (ASR) used in HowTo100M provides low-quality captions that often do not refer to the video content.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +5

Paper
Add Code

Low-Resource Compositional Semantic Parsing with Concept Pretraining

no code implementations • 24 Jan 2023 • Subendhu Rongali, Mukund Sridhar, Haidar Khan, Konstantine Arkoudas, Wael Hamza, Andrew McCallum

In this work, we present an architecture to perform such domain adaptation automatically, with only a small amount of metadata about the new domain and without any new training data (zero-shot) or with very few examples (few-shot).

Domain Adaptation Semantic Parsing

Paper
Add Code

CLASP: Few-Shot Cross-Lingual Data Augmentation for Semantic Parsing

no code implementations • 13 Oct 2022 • Andy Rosenbaum, Saleh Soltan, Wael Hamza, Amir Saffari, Marco Damonte, Isabel Groves

A bottleneck to developing Semantic Parsing (SP) models is the need for a large volume of human-labeled training data.

Data Augmentation Semantic Parsing

Paper
Add Code

LINGUIST: Language Model Instruction Tuning to Generate Annotated Utterances for Intent Classification and Slot Tagging

no code implementations • COLING 2022 • Andy Rosenbaum, Saleh Soltan, Wael Hamza, Yannick Versley, Markus Boese

We present LINGUIST, a method for generating annotated data for Intent Classification and Slot Tagging (IC+ST), via fine-tuning AlexaTM 5B, a 5-billion-parameter multilingual sequence-to-sequence (seq2seq) model, on a flexible instruction prompt.

intent-classification Intent Classification +3

Paper
Add Code

AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2Seq Model

1 code implementation • 2 Aug 2022 • Saleh Soltan, Shankar Ananthakrishnan, Jack FitzGerald, Rahul Gupta, Wael Hamza, Haidar Khan, Charith Peris, Stephen Rawls, Andy Rosenbaum, Anna Rumshisky, Chandana Satya Prakash, Mukund Sridhar, Fabian Triefenbach, Apurv Verma, Gokhan Tur, Prem Natarajan

In this work, we demonstrate that multilingual large-scale sequence-to-sequence (seq2seq) models, pre-trained on a mixture of denoising and Causal Language Modeling (CLM) tasks, are more efficient few-shot learners than decoder-only models on various tasks.

Ranked #14 on Natural Language Inference on CommitmentBank

Causal Language Modeling Common Sense Reasoning +8

363

Paper
Code

Alexa Teacher Model: Pretraining and Distilling Multi-Billion-Parameter Encoders for Natural Language Understanding Systems

no code implementations • 15 Jun 2022 • Jack FitzGerald, Shankar Ananthakrishnan, Konstantine Arkoudas, Davide Bernardi, Abhishek Bhagia, Claudio Delli Bovi, Jin Cao, Rakesh Chada, Amit Chauhan, Luoxin Chen, Anurag Dwarakanath, Satyam Dwivedi, Turan Gojayev, Karthik Gopalakrishnan, Thomas Gueudre, Dilek Hakkani-Tur, Wael Hamza, Jonathan Hueser, Kevin Martin Jose, Haidar Khan, Beiye Liu, Jianhua Lu, Alessandro Manzotti, Pradeep Natarajan, Karolina Owczarzak, Gokmen Oz, Enrico Palumbo, Charith Peris, Chandana Satya Prakash, Stephen Rawls, Andy Rosenbaum, Anjali Shenoy, Saleh Soltan, Mukund Harakere Sridhar, Liz Tan, Fabian Triefenbach, Pan Wei, Haiyang Yu, Shuai Zheng, Gokhan Tur, Prem Natarajan

We present results from a large-scale experiment on pretraining encoders with non-embedding parameter counts ranging from 700M to 9. 3B, their subsequent distillation into smaller models ranging from 17M-170M parameters, and their application to the Natural Language Understanding (NLU) component of a virtual assistant system.

Cross-Lingual Natural Language Inference intent-classification +5

Paper
Add Code

Training Naturalized Semantic Parsers with Very Little Data

1 code implementation • 29 Apr 2022 • Subendhu Rongali, Konstantine Arkoudas, Melanie Rubino, Wael Hamza

Semantic parsing is an important NLP problem, particularly for voice assistants such as Alexa and Google Assistant.

Semantic Parsing

Paper
Code

Instilling Type Knowledge in Language Models via Multi-Task QA

1 code implementation • Findings (NAACL) 2022 • Shuyang Li, Mukund Sridhar, Chandana Satya Prakash, Jin Cao, Wael Hamza, Julian McAuley

Understanding human language often necessitates understanding entities and their place in a taxonomy of knowledge -- their types.

dialog state tracking Knowledge Graphs +1

Paper
Code

Contextual Domain Classification with Temporal Representations

no code implementations • NAACL 2021 • Tzu-Hsiang Lin, Yipeng Shi, Chentao Ye, Yang Fan, Weitong Ruan, Emre Barut, Wael Hamza, Chengwei Su

In commercial dialogue systems, the Spoken Language Understanding (SLU) component tends to have numerous domains thus context is needed to help resolve ambiguities.

Classification domain classification +1

Paper
Add Code

Zero-shot Generalization in Dialog State Tracking through Generative Question Answering

no code implementations • EACL 2021 • Shuyang Li, Jin Cao, Mukund Sridhar, Henghui Zhu, Shang-Wen Li, Wael Hamza, Julian McAuley

Dialog State Tracking (DST), an integral part of modern dialog systems, aims to track user preferences and constraints (slots) in task-oriented dialogs.

dialog state tracking Domain Adaptation +4

Paper
Add Code

Exploring Transfer Learning For End-to-End Spoken Language Understanding

no code implementations • 15 Dec 2020 • Subendhu Rongali, Beiye Liu, Liwei Cai, Konstantine Arkoudas, Chengwei Su, Wael Hamza

Since our model can process both speech and text input sequences and learn to predict a target sequence, it also allows us to do zero-shot E2E SLU by training on only text-hypothesis data (without any speech) from a new domain.

Ranked #3 on Spoken Language Understanding on Snips-SmartLights

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

Paper
Add Code

Delexicalized Paraphrase Generation

no code implementations • COLING 2020 • Boya Yu, Konstantine Arkoudas, Wael Hamza

We present a neural model for paraphrasing and train it to generate delexicalized sentences.

Data Augmentation intent-classification +6

Paper
Add Code

Multi-task Learning of Spoken Language Understanding by Integrating N-Best Hypotheses with Hierarchical Attention

no code implementations • COLING 2020 • Mingda Li, Xinyue Liu, Weitong Ruan, Luca Soldaini, Wael Hamza, Chengwei Su

The comparison shows that our model could recover the transcription by integrating the fragmented information among hypotheses and identifying the frequent error patterns of the ASR module, and even rewrite the query for a better understanding, which reveals the characteristic of multi-task learning of broadcasting knowledge.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +6

Paper
Add Code

Style Attuned Pre-training and Parameter Efficient Fine-tuning for Spoken Language Understanding

no code implementations • 9 Oct 2020 • Jin Cao, Jun Wang, Wael Hamza, Kelly Vanee, Shang-Wen Li

The light encoder architecture separates the shared pre-trained networks from the mappings of generally encoded knowledge to specific domains of SLU, allowing for the domain adaptation to be performed solely at the light encoder and thus increasing efficiency.

Domain Adaptation Language Modelling +1

Paper
Add Code

Don't Parse, Insert: Multilingual Semantic Parsing with Insertion Based Decoding

no code implementations • CONLL 2020 • Qile Zhu, Haidar Khan, Saleh Soltan, Stephen Rawls, Wael Hamza

For complex parsing tasks, the state-of-the-art method is based on autoregressive sequence to sequence models to generate the parse directly.

Natural Language Understanding Semantic Parsing +4

Paper
Add Code

Don't Parse, Generate! A Sequence to Sequence Architecture for Task-Oriented Semantic Parsing

no code implementations • 30 Jan 2020 • Subendhu Rongali, Luca Soldaini, Emilio Monti, Wael Hamza

Virtual assistants such as Amazon Alexa, Apple Siri, and Google Assistant often rely on a semantic parsing component to understand which action(s) to execute for an utterance spoken by its users.

Semantic Parsing slot-filling +1

Paper
Add Code

Improving Spoken Language Understanding By Exploiting ASR N-best Hypotheses

no code implementations • 11 Jan 2020 • Mingda Li, Weitong Ruan, Xinyue Liu, Luca Soldaini, Wael Hamza, Chengwei Su

The NLU module usually uses the first best interpretation of a given speech in downstream tasks such as domain and intent classification.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +5

Paper
Add Code

Neural Cross-Lingual Coreference Resolution and its Application to Entity Linking

no code implementations • ACL 2018 • Gourab Kundu, Avirup Sil, Radu Florian, Wael Hamza

We propose an entity-centric neural cross-lingual coreference model that builds on multi-lingual embeddings and language-independent features.

coreference-resolution Entity Linking

Paper
Add Code

Leveraging Context Information for Natural Question Generation

1 code implementation • NAACL 2018 • Linfeng Song, Zhiguo Wang, Wael Hamza, Yue Zhang, Daniel Gildea

The task of natural question generation is to generate a corresponding question given the input passage (fact) and answer.

Ranked #11 on Question Generation on SQuAD1.1

Position Question Generation +1

Paper
Code

Neural Cross-Lingual Entity Linking

no code implementations • 5 Dec 2017 • Avirup Sil, Gourab Kundu, Radu Florian, Wael Hamza

A major challenge in Entity Linking (EL) is making effective use of contextual information to disambiguate mentions to Wikipedia that might refer to different entities in different contexts.

Ranked #3 on Entity Disambiguation on TAC2010

Cross-Lingual Entity Linking Entity Disambiguation +3

Paper
Add Code

A Unified Query-based Generative Model for Question Generation and Question Answering

no code implementations • 4 Sep 2017 • Linfeng Song, Zhiguo Wang, Wael Hamza

In the QG task, a question is generated from the system given the passage and the target answer, whereas in the QA task, the answer is generated given the question and the passage.

Question Answering Question Generation +1

Paper
Add Code

$k$-Nearest Neighbor Augmented Neural Networks for Text Classification

no code implementations • 25 Aug 2017 • Zhiguo Wang, Wael Hamza, Linfeng Song

However, it lacks the capacity of utilizing instance-level information from individual instances in the training set.

General Classification text-classification +2

Paper
Add Code

Reinforcement Learning for Transition-Based Mention Detection

no code implementations • 13 Mar 2017 • Georgiana Dinu, Wael Hamza, Radu Florian

This paper describes an application of reinforcement learning to the mention detection task.

reinforcement-learning Reinforcement Learning (RL)

Paper
Add Code

Bilateral Multi-Perspective Matching for Natural Language Sentences

10 code implementations • 13 Feb 2017 • Zhiguo Wang, Wael Hamza, Radu Florian

Natural language sentence matching is a fundamental technology for a variety of tasks.

Ranked #17 on Paraphrase Identification on Quora Question Pairs (Accuracy metric)

Natural Language Inference Paraphrase Identification +1

530

Paper
Code

Multi-Perspective Context Matching for Machine Comprehension

1 code implementation • 13 Dec 2016 • Zhiguo Wang, Haitao Mi, Wael Hamza, Radu Florian

Based on this dataset, we propose a Multi-Perspective Context Matching (MPCM) model, which is an end-to-end system that directly predicts the answer beginning and ending points in a passage.

Ranked #3 on Open-Domain Question Answering on SQuAD1.1

Question Answering Reading Comprehension

216

Paper
Code

Cannot find the paper you are looking for? You can Submit a new open access paper.