Search Results for author: Yashar Mehdad

Found 58 papers, 15 papers with code

Attention or Convolution: Transformer Encoders in Audio Language Models for Inference Efficiency

no code implementations 5 Nov 2023 Sungho Jeon, Ching-Feng Yeh, Hakan Inan, Wei-Ning Hsu, Rashi Rungta, Yashar Mehdad, Daniel Bikel

In this paper, we show that a simple self-supervised pre-trained audio model can achieve comparable inference efficiency to more complicated pre-trained models with speech transformer encoders.


Effective Long-Context Scaling of Foundation Models

1 code implementation 27 Sep 2023 Wenhan Xiong, Jingyu Liu, Igor Molybog, Hejia Zhang, Prajjwal Bhargava, Rui Hou, Louis Martin, Rashi Rungta, Karthik Abinav Sankararaman, Barlas Oguz, Madian Khabsa, Han Fang, Yashar Mehdad, Sharan Narang, Kshitiz Malik, Angela Fan, Shruti Bhosale, Sergey Edunov, Mike Lewis, Sinong Wang, Hao Ma

We also examine the impact of various design choices in the pretraining process, including the data mix and the training curriculum of sequence lengths -- our ablation experiments suggest that having abundant long texts in the pretrain dataset is not the key to achieving strong performance, and we empirically verify that long context continual pretraining is more efficient and similarly effective compared to pretraining from scratch with long sequences.

Continual Pretraining Language Modelling

LLM-QAT: Data-Free Quantization Aware Training for Large Language Models

no code implementations 29 May 2023 Zechun Liu, Barlas Oguz, Changsheng Zhao, Ernie Chang, Pierre Stock, Yashar Mehdad, Yangyang Shi, Raghuraman Krishnamoorthi, Vikas Chandra

Several post-training quantization methods have been applied to large language models (LLMs), and have been shown to perform well down to 8-bits.

Data Free Quantization

VideoOFA: Two-Stage Pre-Training for Video-to-Text Generation

no code implementations 4 May 2023 Xilun Chen, Lili Yu, Wenhan Xiong, Barlas Oğuz, Yashar Mehdad, Wen-tau Yih

We propose a new two-stage pre-training framework for video-to-text generation tasks such as video captioning and video question answering: A generative encoder-decoder model is first jointly pre-trained on massive image-text data to learn fundamental vision-language concepts, and then adapted to video data in an intermediate video-text pre-training stage to learn video-specific skills such as spatio-temporal reasoning.

Question Answering Text Generation +3

How to Train Your DRAGON: Diverse Augmentation Towards Generalizable Dense Retrieval

1 code implementation 15 Feb 2023 Sheng-Chieh Lin, Akari Asai, Minghan Li, Barlas Oguz, Jimmy Lin, Yashar Mehdad, Wen-tau Yih, Xilun Chen

We hence propose a new DA approach with diverse queries and sources of supervision to progressively train a generalizable DR. As a result, DRAGON, our dense retriever trained with diverse augmentation, is the first BERT-base-sized DR to achieve state-of-the-art effectiveness in both supervised and zero-shot evaluations and even competes with models using more complex late interaction (ColBERTv2 and SPLADE++).

Contrastive Learning Data Augmentation +1

STRUDEL: Structured Dialogue Summarization for Dialogue Comprehension

no code implementations 24 Dec 2022 Borui Wang, Chengcheng Feng, Arjun Nair, Madelyn Mao, Jai Desai, Asli Celikyilmaz, Haoran Li, Yashar Mehdad, Dragomir Radev

Abstractive dialogue summarization has long been viewed as an important standalone task in natural language processing, but no previous work has explored the possibility of whether abstractive dialogue summarization can also be used as a means to boost an NLP system's performance on other important dialogue comprehension tasks.

Abstractive Dialogue Summarization Question Answering

Improving Faithfulness of Abstractive Summarization by Controlling Confounding Effect of Irrelevant Sentences

no code implementations 19 Dec 2022 Asish Ghoshal, Arash Einolghozati, Ankit Arun, Haoran Li, Lili Yu, Yashar Mehdad, Scott Wen-tau Yih, Asli Celikyilmaz

Lack of factual correctness is an issue that still plagues state-of-the-art summarization systems despite their impressive progress on generating seemingly fluent summaries.

Abstractive Text Summarization

CITADEL: Conditional Token Interaction via Dynamic Lexical Routing for Efficient and Effective Multi-Vector Retrieval

1 code implementation 18 Nov 2022 Minghan Li, Sheng-Chieh Lin, Barlas Oguz, Asish Ghoshal, Jimmy Lin, Yashar Mehdad, Wen-tau Yih, Xilun Chen

In this paper, we unify different multi-vector retrieval models from a token routing viewpoint and propose conditional token interaction via dynamic lexical routing, namely CITADEL, for efficient and effective multi-vector retrieval.


Bridging the Training-Inference Gap for Dense Phrase Retrieval

no code implementations 25 Oct 2022 Gyuwan Kim, Jinhyuk Lee, Barlas Oguz, Wenhan Xiong, Yizhe Zhang, Yashar Mehdad, William Yang Wang

Building dense retrievers requires a series of standard procedures, including training and validating neural models and creating indexes for efficient search.

Open-Domain Question Answering Passage Retrieval +1

Structured Summarization: Unified Text Segmentation and Segment Labeling as a Generation Task

no code implementations 28 Sep 2022 Hakan Inan, Rashi Rungta, Yashar Mehdad

In this work, we propose a single encoder-decoder neural network that can handle long documents and conversations, trained simultaneously for both segmentation and segment labeling using only standard supervision.

Segmentation Text Segmentation

BiT: Robustly Binarized Multi-distilled Transformer

2 code implementations 25 May 2022 Zechun Liu, Barlas Oguz, Aasish Pappu, Lin Xiao, Scott Yih, Meng Li, Raghuraman Krishnamoorthi, Yashar Mehdad

Modern pre-trained transformers have rapidly advanced the state-of-the-art in machine learning, but have also grown in parameters and computational complexity, making them increasingly difficult to deploy in resource-constrained environments.


Salient Phrase Aware Dense Retrieval: Can a Dense Retriever Imitate a Sparse One?

2 code implementations 13 Oct 2021 Xilun Chen, Kushal Lakhotia, Barlas Oğuz, Anchit Gupta, Patrick Lewis, Stan Peshterliev, Yashar Mehdad, Sonal Gupta, Wen-tau Yih

Despite their recent popularity and well-known advantages, dense retrievers still lag behind sparse methods such as BM25 in their ability to reliably match salient phrases and rare entities in the query and to generalize to out-of-domain data.

Open-Domain Question Answering Passage Retrieval +1

Investigating Crowdsourcing Protocols for Evaluating the Factual Consistency of Summaries

no code implementations NAACL 2022 Xiangru Tang, Alexander Fabbri, Haoran Li, Ziming Mao, Griffin Thomas Adams, Borui Wang, Asli Celikyilmaz, Yashar Mehdad, Dragomir Radev

Current pre-trained models applied to summarization are prone to factual inconsistencies which either misrepresent the source text or introduce extraneous information.

Syntax-augmented Multilingual BERT for Cross-lingual Transfer

1 code implementation ACL 2021 Wasi Uddin Ahmad, Haoran Li, Kai-Wei Chang, Yashar Mehdad

In recent years, we have seen a colossal effort in pre-training multilingual text encoders using large-scale corpora in many languages to facilitate cross-lingual transfer learning.

Cross-Lingual Transfer named-entity-recognition +7

ConvoSumm: Conversation Summarization Benchmark and Improved Abstractive Summarization with Argument Mining

1 code implementation ACL 2021 Alexander R. Fabbri, Faiaz Rahman, Imad Rizvi, Borui Wang, Haoran Li, Yashar Mehdad, Dragomir Radev

While online conversations can cover a vast amount of information in many different formats, abstractive text summarization has primarily focused on modeling solely news articles.

Abstractive Text Summarization Argument Mining +2

RECONSIDER: Improved Re-Ranking using Span-Focused Cross-Attention for Open Domain Question Answering

no code implementations NAACL 2021 Srinivasan Iyer, Sewon Min, Yashar Mehdad, Wen-tau Yih

State-of-the-art Machine Reading Comprehension (MRC) models for Open-domain Question Answering (QA) are typically trained for span selection using distantly supervised positive examples and heuristically retrieved negative examples.

Machine Reading Comprehension Natural Questions +3

EASE: Extractive-Abstractive Summarization with Explanations

no code implementations 14 May 2021 Haoran Li, Arash Einolghozati, Srinivasan Iyer, Bhargavi Paranjape, Yashar Mehdad, Sonal Gupta, Marjan Ghazvininejad

Current abstractive summarization systems outperform their extractive counterparts, but their widespread adoption is inhibited by the inherent lack of interpretability.

Abstractive Text Summarization Document Summarization +1

FiD-Ex: Improving Sequence-to-Sequence Models for Extractive Rationale Generation

no code implementations EMNLP 2021 Kushal Lakhotia, Bhargavi Paranjape, Asish Ghoshal, Wen-tau Yih, Yashar Mehdad, Srinivasan Iyer

Natural language (NL) explanations of model predictions are gaining popularity as a means to understand and verify decisions made by large black-box pre-trained models, for NLP tasks such as Question Answering (QA) and Fact Verification.

Fact Verification Question Answering

Human Evaluation of Spoken vs. Visual Explanations for Open-Domain QA

no code implementations 30 Dec 2020 Ana Valeria Gonzalez, Gagan Bansal, Angela Fan, Robin Jia, Yashar Mehdad, Srinivasan Iyer

While research on explaining predictions of open-domain QA systems (ODQA) to users is gaining momentum, most works have failed to evaluate the extent to which explanations improve user trust.

Towards Understanding the Behaviors of Optimal Deep Active Learning Algorithms

1 code implementation 29 Dec 2020 Yilun Zhou, Adithya Renduchintala, Xian Li, Sida Wang, Yashar Mehdad, Asish Ghoshal

Active learning (AL) algorithms may achieve better performance with fewer data because the model guides the data selection process.

Active Learning

RECONSIDER: Re-Ranking using Span-Focused Cross-Attention for Open Domain Question Answering

1 code implementation 21 Oct 2020 Srinivasan Iyer, Sewon Min, Yashar Mehdad, Wen-tau Yih

State-of-the-art Machine Reading Comprehension (MRC) models for Open-domain Question Answering (QA) are typically trained for span selection using distantly supervised positive examples and heuristically retrieved negative examples.

Machine Reading Comprehension Natural Questions +3

Low-Resource Domain Adaptation for Compositional Task-Oriented Semantic Parsing

no code implementations EMNLP 2020 Xilun Chen, Asish Ghoshal, Yashar Mehdad, Luke Zettlemoyer, Sonal Gupta

Task-oriented semantic parsing is a critical component of virtual assistants, which is responsible for understanding the user's intents (set reminder, play music, etc.).

Domain Adaptation Meta-Learning +2

Efficient One-Pass End-to-End Entity Linking for Questions

3 code implementations EMNLP 2020 Belinda Z. Li, Sewon Min, Srinivasan Iyer, Yashar Mehdad, Wen-tau Yih

We present ELQ, a fast end-to-end entity linking model for questions, which uses a biencoder to jointly perform mention detection and linking in one pass.

Entity Linking Question Answering

Conversational Semantic Parsing

no code implementations EMNLP 2020 Armen Aghajanyan, Jean Maillard, Akshat Shrivastava, Keith Diedrick, Mike Haeger, Haoran Li, Yashar Mehdad, Ves Stoyanov, Anuj Kumar, Mike Lewis, Sonal Gupta

In this paper, we propose a semantic representation for such task-oriented conversational systems that can represent concepts such as co-reference and context carryover, enabling comprehensive understanding of queries in a session.

dialog state tracking Semantic Parsing

Answering Complex Open-Domain Questions with Multi-Hop Dense Retrieval

1 code implementation ICLR 2021 Wenhan Xiong, Xiang Lorraine Li, Srini Iyer, Jingfei Du, Patrick Lewis, William Yang Wang, Yashar Mehdad, Wen-tau Yih, Sebastian Riedel, Douwe Kiela, Barlas Oğuz

We propose a simple and efficient multi-hop dense retrieval approach for answering complex open-domain questions, which achieves state-of-the-art performance on two multi-hop datasets, HotpotQA and multi-evidence FEVER.

Question Answering Retrieval

MTOP: A Comprehensive Multilingual Task-Oriented Semantic Parsing Benchmark

no code implementations EACL 2021 Haoran Li, Abhinav Arora, Shuohui Chen, Anchit Gupta, Sonal Gupta, Yashar Mehdad

Scaling semantic parsing models for task-oriented dialog systems to new languages is often expensive and time-consuming due to the lack of available datasets.

Benchmarking Semantic Parsing +1

DocTag2Vec: An Embedding Based Multi-label Learning Approach for Document Tagging

no code implementations WS 2017 Sheng Chen, Akshay Soni, Aasish Pappu, Yashar Mehdad

Tagging news articles or blog posts with relevant tags from a collection of predefined ones is coined as document tagging in this work.

Multi-Label Learning TAG

Online Article Ranking as a Constrained, Dynamic, Multi-Objective Optimization Problem

no code implementations 16 May 2017 Jeya Balaji Balasubramanian, Akshay Soni, Yashar Mehdad, Nikolay Laptev

The content ranking problem in a social news website is typically a function that maximizes a scalar metric of interest like dwell-time.

Rank-to-engage: New Listwise Approaches to Maximize Engagement

no code implementations 24 Feb 2017 Swayambhoo Jain, Akshay Soni, Nikolay Laptev, Yashar Mehdad

For many internet businesses, presenting a given list of items in an order that maximizes a certain metric of interest (e.g., click-through-rate, average engagement time, etc.)


RIPML: A Restricted Isometry Property based Approach to Multilabel Learning

no code implementations 16 Feb 2017 Akshay Soni, Yashar Mehdad

The multilabel learning problem with large number of labels, features, and data-points has generated a tremendous interest recently.

Dimensionality Reduction

Domain Adaptation for Named Entity Recognition in Online Media with Word Embeddings

no code implementations 1 Dec 2016 Vivek Kulkarni, Yashar Mehdad, Troy Chevalier

In this paper, we propose methods to effectively adapt models learned on one domain onto other domains using distributed word representations.

Domain Adaptation named-entity-recognition +3

Extractive Summarization under Strict Length Constraints

no code implementations LREC 2016 Yashar Mehdad, Amanda Stent, Kapil Thadani, Dragomir Radev, Youssef Billawala, Karolina Buchner

In this paper we report a comparison of various techniques for single-document extractive summarization under strict length budgets, which is a common commercial use case (e.g., summarization of news articles by news aggregators).

Extractive Summarization
