no code implementations • EMNLP (newsum) 2021 • Haoran Li, Arash Einolghozati, Srinivasan Iyer, Bhargavi Paranjape, Yashar Mehdad, Sonal Gupta, Marjan Ghazvininejad
To achieve the best of both worlds, we propose EASE, an extractive-abstractive framework that generates concise abstractive summaries that can be traced back to an extractive summary.
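To make the traceability idea concrete, here is a minimal two-step sketch (the sentence scores and the stand-in rewriter are illustrative; EASE's actual extractor and abstractor are learned models):

```python
def extract_then_abstract(sentences, scores, k, abstractor):
    """Two-step sketch: keep the top-k sentences as a traceable extractive
    summary, then rewrite only that evidence abstractively."""
    ranked = sorted(zip(scores, sentences), reverse=True)[:k]
    evidence = [s for _, s in ranked]
    return abstractor(" ".join(evidence)), evidence

sents = ["The bill passed 62-38.",
         "Debate lasted nine hours.",
         "It now heads to the governor."]
summary, evidence = extract_then_abstract(
    sents, scores=[0.9, 0.2, 0.6], k=2,
    abstractor=lambda text: text.lower())            # stand-in rewriter
print(summary)
print("traceable evidence:", evidence)
```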
no code implementations • 5 Nov 2023 • Sungho Jeon, Ching-Feng Yeh, Hakan Inan, Wei-Ning Hsu, Rashi Rungta, Yashar Mehdad, Daniel Bikel
In this paper, we show that a simple self-supervised pre-trained audio model can achieve comparable inference efficiency to more complicated pre-trained models with speech transformer encoders.
2 code implementations • 27 Sep 2023 • Wenhan Xiong, Jingyu Liu, Igor Molybog, Hejia Zhang, Prajjwal Bhargava, Rui Hou, Louis Martin, Rashi Rungta, Karthik Abinav Sankararaman, Barlas Oguz, Madian Khabsa, Han Fang, Yashar Mehdad, Sharan Narang, Kshitiz Malik, Angela Fan, Shruti Bhosale, Sergey Edunov, Mike Lewis, Sinong Wang, Hao Ma
We also examine the impact of various design choices in the pretraining process, including the data mix and the training curriculum of sequence lengths. Our ablation experiments suggest that having abundant long texts in the pretraining dataset is not the key to strong performance, and we empirically verify that long-context continual pretraining is more efficient than, and similarly effective to, pretraining from scratch with long sequences.
3 code implementations • 29 May 2023 • Zechun Liu, Barlas Oguz, Changsheng Zhao, Ernie Chang, Pierre Stock, Yashar Mehdad, Yangyang Shi, Raghuraman Krishnamoorthi, Vikas Chandra
Several post-training quantization methods have been applied to large language models (LLMs) and have been shown to perform well down to 8 bits.
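For context, a minimal sketch of the symmetric, per-channel scheme that such post-training methods typically build on (names and shapes are ours, not the paper's):

```python
import numpy as np

def quantize_per_channel(w: np.ndarray, n_bits: int = 8):
    """Symmetric per-output-channel quantization of a weight matrix.
    Returns integer codes plus the per-channel scales for dequantization."""
    qmax = 2 ** (n_bits - 1) - 1                     # e.g. 127 for int8
    scales = np.abs(w).max(axis=1, keepdims=True) / qmax
    scales = np.maximum(scales, 1e-8)                # avoid division by zero
    q = np.clip(np.round(w / scales), -qmax - 1, qmax).astype(np.int8)
    return q, scales

def dequantize(q: np.ndarray, scales: np.ndarray) -> np.ndarray:
    return q.astype(np.float32) * scales

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 16)).astype(np.float32)      # toy weight matrix
q, s = quantize_per_channel(w, n_bits=8)
print("max abs error:", np.abs(w - dequantize(q, s)).max())
```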
no code implementations • 4 May 2023 • Xilun Chen, Lili Yu, Wenhan Xiong, Barlas Oğuz, Yashar Mehdad, Wen-tau Yih
We propose a new two-stage pre-training framework for video-to-text generation tasks such as video captioning and video question answering: A generative encoder-decoder model is first jointly pre-trained on massive image-text data to learn fundamental vision-language concepts, and then adapted to video data in an intermediate video-text pre-training stage to learn video-specific skills such as spatio-temporal reasoning.
1 code implementation • 15 Feb 2023 • Sheng-Chieh Lin, Akari Asai, Minghan Li, Barlas Oguz, Jimmy Lin, Yashar Mehdad, Wen-tau Yih, Xilun Chen
We hence propose a new data augmentation (DA) approach with diverse queries and sources of supervision to progressively train a generalizable dense retriever (DR). As a result, DRAGON, our dense retriever trained with diverse augmentation, is the first BERT-base-sized DR to achieve state-of-the-art effectiveness in both supervised and zero-shot evaluations, and it even competes with models using more complex late interaction (ColBERTv2 and SPLADE++).
no code implementations • 24 Dec 2022 • Borui Wang, Chengcheng Feng, Arjun Nair, Madelyn Mao, Jai Desai, Asli Celikyilmaz, Haoran Li, Yashar Mehdad, Dragomir Radev
Abstractive dialogue summarization has long been viewed as an important standalone task in natural language processing, but no previous work has explored whether it can also serve as a means to boost an NLP system's performance on other important dialogue comprehension tasks.
no code implementations • 19 Dec 2022 • Asish Ghoshal, Arash Einolghozati, Ankit Arun, Haoran Li, Lili Yu, Vera Gor, Yashar Mehdad, Scott Wen-tau Yih, Asli Celikyilmaz
Lack of factual correctness is an issue that still plagues state-of-the-art summarization systems despite their impressive progress on generating seemingly fluent summaries.
1 code implementation • 18 Nov 2022 • Minghan Li, Sheng-Chieh Lin, Barlas Oguz, Asish Ghoshal, Jimmy Lin, Yashar Mehdad, Wen-tau Yih, Xilun Chen
In this paper, we unify different multi-vector retrieval models from a token routing viewpoint and propose conditional token interaction via dynamic lexical routing, namely CITADEL, for efficient and effective multi-vector retrieval.
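A toy rendition of the conditional token interaction idea (our simplification, not the released implementation): token vectors are routed to lexical keys, and only co-routed query/document tokens interact.

```python
import numpy as np

def route(token_vecs, router, top_k=1):
    """Route each token vector to its top-k lexical keys (toy router)."""
    logits = token_vecs @ router.T                   # (n_tokens, n_keys)
    return np.argsort(-logits, axis=1)[:, :top_k]

def routed_score(q_vecs, d_vecs, router, top_k=1):
    """Sum of per-query-token max similarities, counting a document token
    only if it shares a routed lexical key with the query token."""
    q_keys = route(q_vecs, router, top_k)
    d_keys = route(d_vecs, router, top_k)
    score = 0.0
    for i, qv in enumerate(q_vecs):
        hits = [float(qv @ dv) for j, dv in enumerate(d_vecs)
                if set(q_keys[i]) & set(d_keys[j])]  # shared routed key
        score += max(hits, default=0.0)
    return score

rng = np.random.default_rng(0)
router = rng.normal(size=(32, 8))                    # 32 lexical keys, dim 8
q, d = rng.normal(size=(3, 8)), rng.normal(size=(6, 8))
print(routed_score(q, d, router))
```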
no code implementations • 25 Oct 2022 • Gyuwan Kim, Jinhyuk Lee, Barlas Oguz, Wenhan Xiong, Yizhe Zhang, Yashar Mehdad, William Yang Wang
Building dense retrievers requires a series of standard procedures, including training and validating neural models and creating indexes for efficient search.
no code implementations • 28 Sep 2022 • Hakan Inan, Rashi Rungta, Yashar Mehdad
In this work, we propose a single encoder-decoder neural network that can handle long documents and conversations, trained simultaneously for both segmentation and segment labeling using only standard supervision.
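One simple way to cast joint segmentation and labeling as sequence generation is to linearize the targets; a hypothetical format for illustration, not the paper's exact one:

```python
def linearize_segments(segments):
    """Hypothetical target linearization for joint segmentation and
    labeling with a text-to-text model: one '<seg>' marker per segment,
    carrying its end-sentence index and its label."""
    return " ".join(f"<seg> end={end} label={label}" for end, label in segments)

# e.g. a document whose sentences 0-4 discuss symptoms and 5-9 treatment:
print(linearize_segments([(4, "symptoms"), (9, "treatment")]))
```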
1 code implementation • 21 Sep 2022 • Wenhan Xiong, Anchit Gupta, Shubham Toshniwal, Yashar Mehdad, Wen-tau Yih
We present an empirical study of adapting an existing pretrained text-to-text model for long-sequence inputs.
Ranked #1 on Text Summarization on QMSum
3 code implementations • 25 May 2022 • Zechun Liu, Barlas Oguz, Aasish Pappu, Lin Xiao, Scott Yih, Meng Li, Raghuraman Krishnamoorthi, Yashar Mehdad
Modern pre-trained transformers have rapidly advanced the state-of-the-art in machine learning, but have also grown in parameters and computational complexity, making them increasingly difficult to deploy in resource-constrained environments.
no code implementations • NAACL 2022 • Xiangru Tang, Arjun Nair, Borui Wang, Bingyao Wang, Jai Desai, Aaron Wade, Haoran Li, Asli Celikyilmaz, Yashar Mehdad, Dragomir Radev
Using human evaluation and automatic faithfulness metrics, we show that our model significantly reduces all kinds of factual errors on the SAMSum dialogue summarization corpus.
1 code implementation • NAACL 2022 • Wenhan Xiong, Barlas Oğuz, Anchit Gupta, Xilun Chen, Diana Liskovich, Omer Levy, Wen-tau Yih, Yashar Mehdad
Many NLP tasks require processing long contexts beyond the length limit of pretrained models.
2 code implementations • 13 Oct 2021 • Xilun Chen, Kushal Lakhotia, Barlas Oğuz, Anchit Gupta, Patrick Lewis, Stan Peshterliev, Yashar Mehdad, Sonal Gupta, Wen-tau Yih
Despite their recent popularity and well-known advantages, dense retrievers still lag behind sparse methods such as BM25 in their ability to reliably match salient phrases and rare entities in the query and to generalize to out-of-domain data.
Ranked #2 on Passage Retrieval on EntityQuestions
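A minimal sketch of the standard remedy in this line of work (we assume the embedding-concatenation trick for combining a semantic dense model with a lexical one; the shapes and mixing weight are illustrative): concatenating the two embeddings lets a single inner product decompose into the sum of both scores.

```python
import numpy as np

def concat_embed(sem_vec, lex_vec, weight=0.7):
    """Concatenate semantic and lexical embeddings; the inner product of two
    concatenated vectors is then sem_score + weight**2 * lex_score."""
    return np.concatenate([sem_vec, weight * lex_vec])

rng = np.random.default_rng(0)
q_sem, q_lex = rng.normal(size=8), rng.normal(size=8)
p_sem, p_lex = rng.normal(size=8), rng.normal(size=8)

q = concat_embed(q_sem, q_lex)
p = concat_embed(p_sem, p_lex)
assert np.isclose(q @ p, q_sem @ p_sem + 0.49 * (q_lex @ p_lex))
print(q @ p)
```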
no code implementations • NAACL 2022 • Xiangru Tang, Alexander Fabbri, Haoran Li, Ziming Mao, Griffin Thomas Adams, Borui Wang, Asli Celikyilmaz, Yashar Mehdad, Dragomir Radev
Current pre-trained models applied to summarization are prone to factual inconsistencies which either misrepresent the source text or introduce extraneous information.
1 code implementation • Findings (NAACL) 2022 • Barlas Oğuz, Kushal Lakhotia, Anchit Gupta, Patrick Lewis, Vladimir Karpukhin, Aleksandra Piktus, Xilun Chen, Sebastian Riedel, Wen-tau Yih, Sonal Gupta, Yashar Mehdad
Pre-training on larger datasets with ever-increasing model size is now a proven recipe for improved performance across almost all NLP tasks.
Ranked #2 on Passage Retrieval on Natural Questions (using extra training data)
1 code implementation • ACL 2021 • Wasi Uddin Ahmad, Haoran Li, Kai-Wei Chang, Yashar Mehdad
In recent years, we have seen a colossal effort in pre-training multilingual text encoders using large-scale corpora in many languages to facilitate cross-lingual transfer learning.
no code implementations • NAACL 2021 • Srinivasan Iyer, Sewon Min, Yashar Mehdad, Wen-tau Yih
State-of-the-art Machine Reading Comprehension (MRC) models for Open-domain Question Answering (QA) are typically trained for span selection using distantly supervised positive examples and heuristically retrieved negative examples.
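For reference, the span-selection step itself is simple; a sketch over toy logits (the length cap and shapes are ours):

```python
import numpy as np

def best_span(start_logits, end_logits, max_len=10):
    """Standard span selection: pick (i, j) maximizing
    start_logits[i] + end_logits[j] subject to i <= j < i + max_len."""
    best, arg = -np.inf, (0, 0)
    for i in range(len(start_logits)):
        for j in range(i, min(i + max_len, len(end_logits))):
            s = start_logits[i] + end_logits[j]
            if s > best:
                best, arg = s, (i, j)
    return arg, best

rng = np.random.default_rng(0)
print(best_span(rng.normal(size=20), rng.normal(size=20)))
```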
1 code implementation • ACL 2021 • Alexander R. Fabbri, Faiaz Rahman, Imad Rizvi, Borui Wang, Haoran Li, Yashar Mehdad, Dragomir Radev
While online conversations can cover a vast amount of information in many different formats, abstractive text summarization has primarily focused on modeling solely news articles.
no code implementations • 14 May 2021 • Haoran Li, Arash Einolghozati, Srinivasan Iyer, Bhargavi Paranjape, Yashar Mehdad, Sonal Gupta, Marjan Ghazvininejad
Current abstractive summarization systems outperform their extractive counterparts, but their widespread adoption is inhibited by the inherent lack of interpretability.
no code implementations • ICLR 2021 • Asish Ghoshal, Xilun Chen, Sonal Gupta, Luke Zettlemoyer, Yashar Mehdad
Training with soft targets instead of hard targets has been shown to improve performance and calibration of deep neural networks.
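A minimal sketch of soft-target training via uniform label smoothing (the paper's target distribution may be more structured than this):

```python
import numpy as np

def smoothed_cross_entropy(logits, target, eps=0.1):
    """Cross-entropy against a smoothed (soft) target distribution:
    probability 1 - eps on the gold label, eps spread over the rest."""
    n = logits.shape[-1]
    log_probs = logits - np.log(np.exp(logits).sum())    # log-softmax
    soft = np.full(n, eps / (n - 1))
    soft[target] = 1.0 - eps
    return -(soft * log_probs).sum()

logits = np.array([2.0, 0.5, -1.0])
print(smoothed_cross_entropy(logits, target=0, eps=0.1))
```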
no code implementations • 1 Jan 2021 • Sewon Min, Jordan Boyd-Graber, Chris Alberti, Danqi Chen, Eunsol Choi, Michael Collins, Kelvin Guu, Hannaneh Hajishirzi, Kenton Lee, Jennimaria Palomaki, Colin Raffel, Adam Roberts, Tom Kwiatkowski, Patrick Lewis, Yuxiang Wu, Heinrich Küttler, Linqing Liu, Pasquale Minervini, Pontus Stenetorp, Sebastian Riedel, Sohee Yang, Minjoon Seo, Gautier Izacard, Fabio Petroni, Lucas Hosseini, Nicola De Cao, Edouard Grave, Ikuya Yamada, Sonse Shimaoka, Masatoshi Suzuki, Shumpei Miyawaki, Shun Sato, Ryo Takahashi, Jun Suzuki, Martin Fajcik, Martin Docekal, Karel Ondrej, Pavel Smrz, Hao Cheng, Yelong Shen, Xiaodong Liu, Pengcheng He, Weizhu Chen, Jianfeng Gao, Barlas Oguz, Xilun Chen, Vladimir Karpukhin, Stan Peshterliev, Dmytro Okhonko, Michael Schlichtkrull, Sonal Gupta, Yashar Mehdad, Wen-tau Yih
We review the EfficientQA competition from NeurIPS 2020.
no code implementations • EMNLP 2021 • Kushal Lakhotia, Bhargavi Paranjape, Asish Ghoshal, Wen-tau Yih, Yashar Mehdad, Srinivasan Iyer
Natural language (NL) explanations of model predictions are gaining popularity as a means to understand and verify decisions made by large black-box pre-trained models, for NLP tasks such as Question Answering (QA) and Fact Verification.
no code implementations • 30 Dec 2020 • Ana Valeria Gonzalez, Gagan Bansal, Angela Fan, Robin Jia, Yashar Mehdad, Srinivasan Iyer
While research on explaining predictions of open-domain QA systems (ODQA) to users is gaining momentum, most works have failed to evaluate the extent to which explanations improve user trust.
1 code implementation • Findings (NAACL) 2022 • Barlas Oguz, Xilun Chen, Vladimir Karpukhin, Stan Peshterliev, Dmytro Okhonko, Michael Schlichtkrull, Sonal Gupta, Yashar Mehdad, Scott Yih
We study open-domain question answering with structured, unstructured and semi-structured knowledge sources, including text, tables, lists and knowledge bases.
Ranked #1 on Open-Domain Question Answering on WebQuestions (using extra training data)
Knowledge Base Question Answering • Open-Domain Question Answering
1 code implementation • 29 Dec 2020 • Yilun Zhou, Adithya Renduchintala, Xian Li, Sida Wang, Yashar Mehdad, Asish Ghoshal
Active learning (AL) algorithms may achieve better performance with fewer data because the model guides the data selection process.
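The canonical acquisition step in such AL algorithms is least-confidence sampling; a generic sketch, not necessarily the exact strategy studied here:

```python
import numpy as np

def uncertainty_sampling(probs: np.ndarray, budget: int) -> np.ndarray:
    """Pick the `budget` unlabeled examples whose predictions are least
    confident (smallest max class probability)."""
    confidence = probs.max(axis=1)
    return np.argsort(confidence)[:budget]

rng = np.random.default_rng(0)
logits = rng.normal(size=(100, 5))                   # toy model outputs
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
print("query these indices next:", uncertainty_sampling(probs, budget=10))
```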
no code implementations • NAACL 2021 • Alexander R. Fabbri, Simeng Han, Haoyuan Li, Haoran Li, Marjan Ghazvininejad, Shafiq Joty, Dragomir Radev, Yashar Mehdad
Models pretrained with self-supervised objectives on large text corpora achieve state-of-the-art performance on English text summarization tasks.
1 code implementation • 21 Oct 2020 • Srinivasan Iyer, Sewon Min, Yashar Mehdad, Wen-tau Yih
State-of-the-art Machine Reading Comprehension (MRC) models for Open-domain Question Answering (QA) are typically trained for span selection using distantly supervised positive examples and heuristically retrieved negative examples.
1 code implementation • EMNLP 2020 • Xilun Chen, Asish Ghoshal, Yashar Mehdad, Luke Zettlemoyer, Sonal Gupta
Task-oriented semantic parsing is a critical component of virtual assistants, which is responsible for understanding the user's intents (set reminder, play music, etc.).
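For illustration, the bracketed intent-slot frames that such parsers typically emit, with a small helper to read slots back out (the example frame is ours, not from the paper):

```python
import re

# A TOP-style bracketed frame of the kind task-oriented parsers produce:
frame = "[IN:CREATE_REMINDER [SL:TODO call mom ] [SL:DATE_TIME at 5 pm ] ]"

def slots(frame: str):
    """Pull (slot, value) pairs out of a flat bracketed frame."""
    return re.findall(r"\[SL:(\w+) ([^\]\[]+?) \]", frame)

print(slots(frame))    # [('TODO', 'call mom'), ('DATE_TIME', 'at 5 pm')]
```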
3 code implementations • EMNLP 2020 • Belinda Z. Li, Sewon Min, Srinivasan Iyer, Yashar Mehdad, Wen-tau Yih
We present ELQ, a fast end-to-end entity linking model for questions, which uses a biencoder to jointly perform mention detection and linking in one pass.
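A toy rendition of the one-pass biencoder (mean-pooled spans and a linear mention scorer are our simplifications; ELQ's learned components are more elaborate):

```python
import numpy as np

def link_question(token_vecs, entity_vecs, mention_w, threshold=0.5):
    """Score each short span for mention-ness with a linear scorer, then
    link surviving spans to the nearest entity embedding by inner product,
    all in one pass over the question."""
    n, links = len(token_vecs), []
    for i in range(n):
        for j in range(i, min(i + 3, n)):            # spans up to 3 tokens
            span = token_vecs[i : j + 1].mean(axis=0)    # mean-pooled span
            p_mention = 1 / (1 + np.exp(-mention_w @ span))
            if p_mention > threshold:
                entity = int(np.argmax(entity_vecs @ span))
                links.append(((i, j), entity, round(p_mention, 3)))
    return links

rng = np.random.default_rng(0)
tokens = rng.normal(size=(6, 16))                    # question token vectors
entities = rng.normal(size=(100, 16))                # precomputed entity index
print(link_question(tokens, entities, mention_w=rng.normal(size=16)))
```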
no code implementations • EMNLP 2020 • Armen Aghajanyan, Jean Maillard, Akshat Shrivastava, Keith Diedrick, Mike Haeger, Haoran Li, Yashar Mehdad, Ves Stoyanov, Anuj Kumar, Mike Lewis, Sonal Gupta
In this paper, we propose a semantic representation for such task-oriented conversational systems that can represent concepts such as co-reference and context carryover, enabling comprehensive understanding of queries in a session.
1 code implementation • ICLR 2021 • Wenhan Xiong, Xiang Lorraine Li, Srini Iyer, Jingfei Du, Patrick Lewis, William Yang Wang, Yashar Mehdad, Wen-tau Yih, Sebastian Riedel, Douwe Kiela, Barlas Oğuz
We propose a simple and efficient multi-hop dense retrieval approach for answering complex open-domain questions, which achieves state-of-the-art performance on two multi-hop datasets, HotpotQA and multi-evidence FEVER.
Ranked #14 on Question Answering on HotpotQA
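The retrieve-then-reformulate loop behind multi-hop dense retrieval, as a toy sketch (the character-count encoder stands in for a trained query encoder):

```python
import numpy as np

def encode(text, dim=16):
    """Deterministic toy text encoder (stand-in for a trained encoder)."""
    v = np.zeros(dim)
    for i, b in enumerate(text.encode()):
        v[i % dim] += b
    return v / (np.linalg.norm(v) + 1e-8)

def multi_hop_retrieve(question, passages, hops=2):
    """After each hop, append the best passage to the query and retrieve
    again, so later hops are conditioned on earlier evidence."""
    passage_vecs = np.stack([encode(p) for p in passages])
    query, chain = question, []
    for _ in range(hops):
        scores = passage_vecs @ encode(query)
        scores[chain] = -np.inf                      # don't repeat a passage
        best = int(np.argmax(scores))
        chain.append(best)
        query = query + " " + passages[best]
    return chain

corpus = ["Alice founded Acme in 1999.",
          "Acme is headquartered in Oslo.",
          "Oslo is the capital of Norway."]
print(multi_hop_retrieve("Where is the company Alice founded based?", corpus))
```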
no code implementations • EACL 2021 • Haoran Li, Abhinav Arora, Shuohui Chen, Anchit Gupta, Sonal Gupta, Yashar Mehdad
Scaling semantic parsing models for task-oriented dialog systems to new languages is often expensive and time-consuming due to the lack of available datasets.
no code implementations • WS 2017 • Sheng Chen, Akshay Soni, Aasish Pappu, Yashar Mehdad
Tagging news articles or blog posts with relevant tags from a collection of predefined ones is what we term document tagging in this work.
no code implementations • 16 May 2017 • Jeya Balaji Balasubramanian, Akshay Soni, Yashar Mehdad, Nikolay Laptev
The content ranking problem on a social news website is typically cast as learning a function that maximizes a scalar metric of interest, such as dwell time.
no code implementations • 24 Feb 2017 • Swayambhoo Jain, Akshay Soni, Nikolay Laptev, Yashar Mehdad
For many internet businesses, presenting a given list of items in an order that maximizes a certain metric of interest (e.g., click-through rate, average engagement time, etc.) is a central problem.
no code implementations • 16 Feb 2017 • Akshay Soni, Yashar Mehdad
The multilabel learning problem with a large number of labels, features, and data points has generated tremendous interest recently.
no code implementations • 1 Dec 2016 • Vivek Kulkarni, Yashar Mehdad, Troy Chevalier
In this paper, we propose methods to effectively adapt models learned on one domain onto other domains using distributed word representations.
no code implementations • LREC 2016 • Yashar Mehdad, Amanda Stent, Kapil Thadani, Dragomir Radev, Youssef Billawala, Karolina Buchner
In this paper we report a comparison of various techniques for single-document extractive summarization under strict length budgets, which is a common commercial use case (e.g., summarization of news articles by news aggregators).
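The strict-length-budget setting is easy to picture with a greedy selector over scored sentences (a toy baseline, not one of the systems compared in the paper):

```python
def budgeted_extract(sentences, scores, budget_chars):
    """Greedy extractive summarization under a strict length budget:
    take sentences in score order while the character budget allows."""
    order = sorted(range(len(sentences)), key=lambda i: -scores[i])
    picked, used = [], 0
    for i in order:
        if used + len(sentences[i]) <= budget_chars:
            picked.append(i)
            used += len(sentences[i])
    return [sentences[i] for i in sorted(picked)]    # restore document order

sents = ["Markets fell sharply on Monday.",
         "Analysts blamed rising rates.",
         "Trading volume was light."]
print(budgeted_extract(sents, scores=[0.9, 0.7, 0.2], budget_chars=70))
```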
no code implementations • LREC 2012 • Matteo Negri, Yashar Mehdad, Alessandro Marchetti, Danilo Giampiccolo, Luisa Bentivogli
We present a framework for the acquisition of sentential paraphrases based on crowdsourcing.