Search Results for author: Wael Hamza

Found 29 papers, 6 papers with code

Limitations of Knowledge Distillation for Zero-shot Transfer Learning

no code implementations EMNLP (sustainlp) 2021 Saleh Soltan, Haidar Khan, Wael Hamza

We demonstrate that in contradiction to the previous observation in the case of monolingual distillation, in multilingual settings, distillation during pretraining is more effective than distillation during fine-tuning for zero-shot transfer learning.

Knowledge Distillation Transfer Learning +1

Controlled Data Generation via Insertion Operations for NLU

no code implementations NAACL (ACL) 2022 Manoj Kumar, Yuval Merhav, Haidar Khan, Rahul Gupta, Anna Rumshisky, Wael Hamza

Use of synthetic data is rapidly emerging as a realistic alternative to manually annotating live traffic for industry-scale model building.

Intent Classification +4

Attention Fusion: a light yet efficient late fusion mechanism for task adaptation in NLU

no code implementations Findings (NAACL) 2022 Jin Cao, Chandana Satya Prakash, Wael Hamza

However, given the trend of larger pre-trained models, fine-tuning these models for each downstream task is parameter-inefficient and computationally expensive, making this approach sub-optimal for adoption by NLU systems.

Language Modelling

Recipes for Sequential Pre-training of Multilingual Encoder and Seq2Seq Models

no code implementations 14 Jun 2023 Saleh Soltan, Andy Rosenbaum, Tobias Falke, Qin Lu, Anna Rumshisky, Wael Hamza

(2) Conversely, using an encoder to warm-start seq2seq training, we show that by unfreezing the encoder partway through training, we can match task performance of a from-scratch seq2seq model.

Language Modelling Masked Language Modeling

Scalable and Accurate Self-supervised Multimodal Representation Learning without Aligned Video and Text Data

no code implementations 4 Apr 2023 Vladislav Lialin, Stephen Rawls, David Chan, Shalini Ghosh, Anna Rumshisky, Wael Hamza

The currently popular approach to mining video-text data via automatic speech recognition (ASR), used in HowTo100M, provides low-quality captions that often do not refer to the video content.

Automatic Speech Recognition (ASR) +5

Low-Resource Compositional Semantic Parsing with Concept Pretraining

no code implementations 24 Jan 2023 Subendhu Rongali, Mukund Sridhar, Haidar Khan, Konstantine Arkoudas, Wael Hamza, Andrew McCallum

In this work, we present an architecture to perform such domain adaptation automatically, with only a small amount of metadata about the new domain and without any new training data (zero-shot) or with very few examples (few-shot).

Domain Adaptation Semantic Parsing

CLASP: Few-Shot Cross-Lingual Data Augmentation for Semantic Parsing

no code implementations 13 Oct 2022 Andy Rosenbaum, Saleh Soltan, Wael Hamza, Amir Saffari, Marco Damonte, Isabel Groves

A bottleneck to developing Semantic Parsing (SP) models is the need for a large volume of human-labeled training data.

Data Augmentation Semantic Parsing

LINGUIST: Language Model Instruction Tuning to Generate Annotated Utterances for Intent Classification and Slot Tagging

no code implementations COLING 2022 Andy Rosenbaum, Saleh Soltan, Wael Hamza, Yannick Versley, Markus Boese

We present LINGUIST, a method for generating annotated data for Intent Classification and Slot Tagging (IC+ST), via fine-tuning AlexaTM 5B, a 5-billion-parameter multilingual sequence-to-sequence (seq2seq) model, on a flexible instruction prompt.

Intent Classification +3

AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2Seq Model

1 code implementation 2 Aug 2022 Saleh Soltan, Shankar Ananthakrishnan, Jack FitzGerald, Rahul Gupta, Wael Hamza, Haidar Khan, Charith Peris, Stephen Rawls, Andy Rosenbaum, Anna Rumshisky, Chandana Satya Prakash, Mukund Sridhar, Fabian Triefenbach, Apurv Verma, Gokhan Tur, Prem Natarajan

In this work, we demonstrate that multilingual large-scale sequence-to-sequence (seq2seq) models, pre-trained on a mixture of denoising and Causal Language Modeling (CLM) tasks, are more efficient few-shot learners than decoder-only models on various tasks.

Causal Language Modeling Common Sense Reasoning +8

Training Naturalized Semantic Parsers with Very Little Data

1 code implementation 29 Apr 2022 Subendhu Rongali, Konstantine Arkoudas, Melanie Rubino, Wael Hamza

Semantic parsing is an important NLP problem, particularly for voice assistants such as Alexa and Google Assistant.

Semantic Parsing

Contextual Domain Classification with Temporal Representations

no code implementations NAACL 2021 Tzu-Hsiang Lin, Yipeng Shi, Chentao Ye, Yang Fan, Weitong Ruan, Emre Barut, Wael Hamza, Chengwei Su

In commercial dialogue systems, the Spoken Language Understanding (SLU) component tends to have numerous domains, so context is needed to help resolve ambiguities.

Classification Spoken Language Understanding

Exploring Transfer Learning For End-to-End Spoken Language Understanding

no code implementations 15 Dec 2020 Subendhu Rongali, Beiye Liu, Liwei Cai, Konstantine Arkoudas, Chengwei Su, Wael Hamza

Since our model can process both speech and text input sequences and learn to predict a target sequence, it also allows us to do zero-shot E2E SLU by training on only text-hypothesis data (without any speech) from a new domain.

Automatic Speech Recognition (ASR) +4

Multi-task Learning of Spoken Language Understanding by Integrating N-Best Hypotheses with Hierarchical Attention

no code implementations COLING 2020 Mingda Li, Xinyue Liu, Weitong Ruan, Luca Soldaini, Wael Hamza, Chengwei Su

The comparison shows that our model could recover the transcription by integrating the fragmented information among hypotheses and identifying the frequent error patterns of the ASR module, and even rewrite the query for a better understanding, which reveals the characteristic of multi-task learning of broadcasting knowledge.

Automatic Speech Recognition (ASR) +6

Style Attuned Pre-training and Parameter Efficient Fine-tuning for Spoken Language Understanding

no code implementations 9 Oct 2020 Jin Cao, Jun Wang, Wael Hamza, Kelly Vanee, Shang-Wen Li

The light encoder architecture separates the shared pre-trained networks from the mappings of generally encoded knowledge to specific domains of SLU, allowing for the domain adaptation to be performed solely at the light encoder and thus increasing efficiency.

Domain Adaptation Language Modelling +1

Don't Parse, Generate! A Sequence to Sequence Architecture for Task-Oriented Semantic Parsing

no code implementations 30 Jan 2020 Subendhu Rongali, Luca Soldaini, Emilio Monti, Wael Hamza

Virtual assistants such as Amazon Alexa, Apple Siri, and Google Assistant often rely on a semantic parsing component to understand which action(s) to execute for an utterance spoken by its users.

Semantic Parsing Slot Filling +1

Improving Spoken Language Understanding By Exploiting ASR N-best Hypotheses

no code implementations 11 Jan 2020 Mingda Li, Weitong Ruan, Xinyue Liu, Luca Soldaini, Wael Hamza, Chengwei Su

The NLU module usually uses the first best interpretation of a given speech in downstream tasks such as domain and intent classification.

Automatic Speech Recognition (ASR) +5

Neural Cross-Lingual Coreference Resolution and its Application to Entity Linking

no code implementations ACL 2018 Gourab Kundu, Avirup Sil, Radu Florian, Wael Hamza

We propose an entity-centric neural cross-lingual coreference model that builds on multi-lingual embeddings and language-independent features.

Coreference Resolution Entity Linking +1

Neural Cross-Lingual Entity Linking

no code implementations 5 Dec 2017 Avirup Sil, Gourab Kundu, Radu Florian, Wael Hamza

A major challenge in Entity Linking (EL) is making effective use of contextual information to disambiguate mentions to Wikipedia that might refer to different entities in different contexts.

Cross-Lingual Entity Linking Entity Disambiguation +3

A Unified Query-based Generative Model for Question Generation and Question Answering

no code implementations 4 Sep 2017 Linfeng Song, Zhiguo Wang, Wael Hamza

In the QG task, a question is generated from the system given the passage and the target answer, whereas in the QA task, the answer is generated given the question and the passage.

Question Answering Question Generation +1

$k$-Nearest Neighbor Augmented Neural Networks for Text Classification

no code implementations 25 Aug 2017 Zhiguo Wang, Wael Hamza, Linfeng Song

However, it lacks the capacity of utilizing instance-level information from individual instances in the training set.

General Classification Text Classification +2

Multi-Perspective Context Matching for Machine Comprehension

1 code implementation 13 Dec 2016 Zhiguo Wang, Haitao Mi, Wael Hamza, Radu Florian

Based on this dataset, we propose a Multi-Perspective Context Matching (MPCM) model, which is an end-to-end system that directly predicts the answer beginning and ending points in a passage.

Question Answering Reading Comprehension +1
