no code implementations • EMNLP (newsum) 2021 • Haoran Li, Arash Einolghozati, Srinivasan Iyer, Bhargavi Paranjape, Yashar Mehdad, Sonal Gupta, Marjan Ghazvininejad
To achieve the best of both worlds, we propose EASE, an extractive-abstractive framework that generates concise abstractive summaries that can be traced back to an extractive summary.
1 code implementation • ICML 2020 • Jungo Kasai, James Cross, Marjan Ghazvininejad, Jiatao Gu
State-of-the-art neural machine translation models generate a translation from left to right and every step is conditioned on the previously generated tokens.
Ranked #54 on Machine Translation on WMT2014 English-German
no code implementations • 15 Aug 2024 • Xiaochuang Han, Marjan Ghazvininejad, Pang Wei Koh, Yulia Tsvetkov
Evaluation of image generation shows that this simple and straightforward approach is more effective than pixel-based modeling and sophisticated vector quantization baselines (on which our method yields a 31% reduction in FID).
no code implementations • 24 May 2023 • Xiaochuang Han, Sachin Kumar, Yulia Tsvetkov, Marjan Ghazvininejad
Diffusion-based language models are emerging as a promising alternative to autoregressive LMs: they approach the competence of autoregressive LMs while offering nuanced controllability at inference time.
no code implementations • 15 Feb 2023 • Marjan Ghazvininejad, Hila Gonen, Luke Zettlemoyer
Large language models (LLMs) demonstrate remarkable machine translation (MT) abilities via prompting, even though they were not explicitly trained for this task.
1 code implementation • 4 Feb 2023 • Yu Meng, Jitin Krishnan, Sinong Wang, Qifan Wang, Yuning Mao, Han Fang, Marjan Ghazvininejad, Jiawei Han, Luke Zettlemoyer
In this work, we offer a new perspective on the consequence of such a discrepancy: We demonstrate empirically and theoretically that MLM pretraining allocates some model dimensions exclusively for representing $\texttt{[MASK]}$ tokens, resulting in a representation deficiency for real tokens and limiting the pretrained model's expressiveness when it is adapted to downstream data without $\texttt{[MASK]}$ tokens.
2 code implementations • 25 Jan 2023 • Davis Liang, Hila Gonen, Yuning Mao, Rui Hou, Naman Goyal, Marjan Ghazvininejad, Luke Zettlemoyer, Madian Khabsa
Large multilingual language models typically rely on a single vocabulary shared across 100+ languages.
2 code implementations • 5 Dec 2022 • Sweta Agrawal, Chunting Zhou, Mike Lewis, Luke Zettlemoyer, Marjan Ghazvininejad
Large-scale generative models show an impressive ability to perform a wide range of Natural Language Processing (NLP) tasks using in-context learning, where a few examples are used to describe a task to the model.
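A minimal sketch of the in-context learning setup described here, applied to translation: a handful of demonstration pairs are formatted into a single prompt ahead of the query. The template and example pairs are illustrative placeholders, not the example-selection method studied in the paper.

```python
# Minimal sketch of a few-shot in-context prompt for translation.
# The template and demonstration pairs are illustrative placeholders,
# not the example-selection strategy studied in the paper.

def build_icl_prompt(demonstrations, query, src_lang="French", tgt_lang="English"):
    """Concatenate demonstration pairs, then append the query sentence."""
    blocks = [f"{src_lang}: {src}\n{tgt_lang}: {tgt}" for src, tgt in demonstrations]
    blocks.append(f"{src_lang}: {query}\n{tgt_lang}:")
    return "\n\n".join(blocks)

demos = [
    ("Bonjour le monde.", "Hello world."),
    ("Merci beaucoup.", "Thank you very much."),
]
prompt = build_icl_prompt(demos, "Comment allez-vous ?")
print(prompt)  # fed as-is to the language model, which continues with the translation
```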
1 code implementation • 25 Apr 2022 • Freda Shi, Daniel Fried, Marjan Ghazvininejad, Luke Zettlemoyer, Sida I. Wang
In this work, we introduce execution result-based minimum Bayes risk decoding (MBR-EXEC) for program selection and show that it improves the few-shot performance of pretrained code models on natural-language-to-code tasks; a simplified sketch of the selection step follows this entry.
Ranked #38 on Code Generation on MBPP
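As a rough illustration of the idea, sampled candidate programs are executed on shared inputs and the candidate whose outputs agree most with the other samples is selected. In the sketch below, the `solve` entry point, the exec-based runner, and exact-match agreement are assumptions for illustration, not the paper's exact formulation.

```python
# Simplified sketch of execution-result-based MBR selection for generated programs.
# Assumptions for illustration: each candidate defines a `solve` function, outputs
# are hashable, and agreement is measured by exact match over execution results.
from collections import Counter

def run_candidate(program_src, test_input):
    """Execute one candidate program on one input; return None on any failure."""
    namespace = {}
    try:
        exec(program_src, namespace)
        return namespace["solve"](test_input)
    except Exception:
        return None

def mbr_exec_select(candidates, test_inputs):
    """Return the candidate whose execution results agree most with the other samples."""
    results = [tuple(run_candidate(c, x) for x in test_inputs) for c in candidates]
    agreement = Counter(results)
    best = max(range(len(candidates)), key=lambda i: agreement[results[i]])
    return candidates[best]

candidates = [
    "def solve(x):\n    return x * 2",
    "def solve(x):\n    return x + x",   # agrees with the first on all inputs
    "def solve(x):\n    return x ** 2",  # disagrees, so it is not selected
]
print(mbr_exec_select(candidates, test_inputs=[1, 2, 3]))
```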
no code implementations • 12 Apr 2022 • Badr AlKhamissi, Millicent Li, Asli Celikyilmaz, Mona Diab, Marjan Ghazvininejad
Recently, there has been a surge of interest in the NLP community in the use of pretrained Language Models (LMs) as Knowledge Bases (KBs).
no code implementations • 10 Dec 2021 • Marjan Ghazvininejad, Vladimir Karpukhin, Vera Gor, Asli Celikyilmaz
We show that soft-prompt based conditional text generation can be improved with simple and efficient methods that simulate modeling the discourse structure of human written text.
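For context, here is a minimal PyTorch sketch of the generic soft-prompt mechanism the paper starts from: learnable prompt vectors are prepended to the frozen model's input embeddings. It shows only that setup, not the discourse-structure methods proposed in the paper.

```python
# Minimal PyTorch sketch of soft prompting: learnable prompt vectors are
# prepended to the token embeddings while the base model stays frozen.
# This is only the generic setup, not the paper's discourse-aware methods.
import torch
import torch.nn as nn

class SoftPrompt(nn.Module):
    def __init__(self, prompt_len, hidden_size):
        super().__init__()
        self.prompt = nn.Parameter(torch.randn(prompt_len, hidden_size) * 0.02)

    def forward(self, token_embeddings):  # (batch, seq, hidden)
        batch = token_embeddings.size(0)
        prompt = self.prompt.unsqueeze(0).expand(batch, -1, -1)
        return torch.cat([prompt, token_embeddings], dim=1)

embeds = torch.randn(2, 10, 768)  # toy token embeddings
print(SoftPrompt(prompt_len=20, hidden_size=768)(embeds).shape)  # (2, 30, 768)
```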
no code implementations • Findings (NAACL) 2022 • Eleftheria Briakou, Sida I. Wang, Luke Zettlemoyer, Marjan Ghazvininejad
Mined bitexts can contain imperfect translations that yield unreliable training signals for Neural Machine Translation (NMT).
1 code implementation • EMNLP 2021 • Chunting Zhou, Daniel Levy, Xian Li, Marjan Ghazvininejad, Graham Neubig
Multilingual neural machine translation (MNMT) learns to translate multiple language pairs with a single model, potentially improving both the accuracy and the memory-efficiency of deployed models.
no code implementations • Findings (ACL) 2021 • Bhargavi Paranjape, Julian Michael, Marjan Ghazvininejad, Luke Zettlemoyer, Hannaneh Hajishirzi
Many commonsense reasoning NLP tasks involve choosing between one or more possible answers to a question or prompt based on knowledge that is often implicit.
no code implementations • 14 May 2021 • Haoran Li, Arash Einolghozati, Srinivasan Iyer, Bhargavi Paranjape, Yashar Mehdad, Sonal Gupta, Marjan Ghazvininejad
Current abstractive summarization systems outperform their extractive counterparts, but their widespread adoption is inhibited by the inherent lack of interpretability.
1 code implementation • NAACL 2021 • Arun Babu, Akshat Shrivastava, Armen Aghajanyan, Ahmed Aly, Angela Fan, Marjan Ghazvininejad
Semantic parsing using sequence-to-sequence models allows parsing of deeper representations compared to traditional word-tagging-based models.
2 code implementations • Findings (ACL) 2021 • Chunting Zhou, Graham Neubig, Jiatao Gu, Mona Diab, Paco Guzman, Luke Zettlemoyer, Marjan Ghazvininejad
Neural sequence models can generate highly fluent sentences, but recent studies have shown that they are also prone to hallucinating additional content not supported by the input.
no code implementations • NAACL 2021 • Alexander R. Fabbri, Simeng Han, Haoyuan Li, Haoran Li, Marjan Ghazvininejad, Shafiq Joty, Dragomir Radev, Yashar Mehdad
Models pretrained with self-supervised objectives on large text corpora achieve state-of-the-art performance on English text summarization tasks.
2 code implementations • ICLR 2021 • Sachin Mehta, Marjan Ghazvininejad, Srinivasan Iyer, Luke Zettlemoyer, Hannaneh Hajishirzi
We introduce a deep and light-weight transformer, DeLighT, that delivers similar or better performance than standard transformer-based models with significantly fewer parameters.
Ranked #1 on Machine Translation on WMT2016 English-French
no code implementations • ACL 2020 • Nabil Hossain, Marjan Ghazvininejad, Luke Zettlemoyer
Retrieve-and-edit seq2seq methods typically retrieve an output from the training set and learn a model to edit it to produce the final output.
2 code implementations • NeurIPS 2020 • Mike Lewis, Marjan Ghazvininejad, Gargi Ghosh, Armen Aghajanyan, Sida Wang, Luke Zettlemoyer
The objective noisily captures aspects of paraphrase, translation, multi-document summarization, and information retrieval, allowing for strong zero-shot performance on several tasks.
no code implementations • EACL 2021 • Asa Cooper Stickland, Xian Li, Marjan Ghazvininejad
For BART, we get the best performance by freezing most of the model parameters and adding extra positional embeddings.
1 code implementation • ICML 2020 • Marjan Ghazvininejad, Vladimir Karpukhin, Luke Zettlemoyer, Omer Levy
This difficulty is compounded during training with cross entropy loss, which can highly penalize small shifts in word order.
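A toy worked example of that penalty: if a model emits essentially the correct words shifted by one position, position-wise cross entropy scores nearly every position as wrong. The sentences and probabilities below are made up purely for illustration.

```python
# Toy illustration: position-wise cross entropy harshly penalizes a one-token
# shift, even though the prediction is almost identical to the reference.
# The vocabulary size, sentences, and probabilities are made up for illustration.
import math

reference = ["the", "cat", "sat", "down"]
prediction = ["cat", "sat", "down", "."]  # same words, shifted left by one

def position_wise_xent(ref, pred, p_correct=0.9, vocab_size=10_000):
    loss = 0.0
    for r, p in zip(ref, pred):
        # Assume the model puts p_correct on its own predicted token and
        # spreads the remainder uniformly over the rest of the vocabulary.
        prob_of_ref = p_correct if r == p else (1 - p_correct) / (vocab_size - 1)
        loss += -math.log(prob_of_ref)
    return loss / len(ref)

print(position_wise_xent(reference, reference))   # ~0.11 nats per token
print(position_wise_xent(reference, prediction))  # ~11.5 nats per token
```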
no code implementations • 23 Jan 2020 • Marjan Ghazvininejad, Omer Levy, Luke Zettlemoyer
The recently proposed mask-predict decoding algorithm has narrowed the performance gap between semi-autoregressive machine translation models and the traditional left-to-right approach.
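For reference, here is a schematic sketch of the mask-predict loop from the authors' earlier work that this paper builds on: all target tokens are predicted in parallel, and over a fixed number of iterations the lowest-confidence positions are re-masked and re-predicted. The `model.predict` interface and the linear masking schedule are simplified stand-ins.

```python
# Schematic sketch of mask-predict decoding: predict every target token in
# parallel, then repeatedly re-mask the lowest-confidence positions and
# re-predict them. `model.predict` is a placeholder that returns
# (tokens, per-position confidences) for a given source and partial target.

def mask_predict(model, source, target_len, iterations=10, mask_id=0):
    tokens = [mask_id] * target_len                  # start fully masked
    tokens, confidences = model.predict(source, tokens)
    for t in range(1, iterations):
        # Linearly decay the number of re-masked tokens across iterations.
        n_mask = int(target_len * (iterations - t) / iterations)
        worst = sorted(range(target_len), key=lambda i: confidences[i])[:n_mask]
        for i in worst:
            tokens[i] = mask_id
        new_tokens, new_conf = model.predict(source, tokens)
        for i in worst:                              # update only re-masked slots
            tokens[i], confidences[i] = new_tokens[i], new_conf[i]
    return tokens
```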
7 code implementations • 22 Jan 2020 • Yinhan Liu, Jiatao Gu, Naman Goyal, Xian Li, Sergey Edunov, Marjan Ghazvininejad, Mike Lewis, Luke Zettlemoyer
This paper demonstrates that multilingual denoising pre-training produces significant performance gains across a wide variety of machine translation (MT) tasks.
1 code implementation • 15 Jan 2020 • Jungo Kasai, James Cross, Marjan Ghazvininejad, Jiatao Gu
State-of-the-art neural machine translation models generate a translation from left to right and every step is conditioned on the previously generated tokens.
44 code implementations • ACL 2020 • Mike Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdelrahman Mohamed, Omer Levy, Ves Stoyanov, Luke Zettlemoyer
We evaluate a number of noising approaches, finding the best performance by both randomly shuffling the order of the original sentences and using a novel in-filling scheme, where spans of text are replaced with a single mask token (both transforms are sketched below).
Ranked #3 on Open-Domain Question Answering on ELI5
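A rough sketch of the two noising transforms named in that entry: shuffling the order of a document's sentences, and text infilling, where a sampled span of tokens is replaced by a single mask token. Whitespace tokenization, the sentence splitter, and the span-length sampling are simplified stand-ins for the paper's actual setup.

```python
# Rough sketch of two noising transforms: document-level sentence shuffling and
# text infilling (replacing sampled spans with a single mask token). The
# tokenization and span-length sampling are simplified stand-ins.
import random

MASK = "<mask>"

def shuffle_sentences(document):
    sentences = document.split(". ")
    random.shuffle(sentences)
    return ". ".join(sentences)

def infill_spans(text, mask_ratio=0.3, mean_span_len=3):
    tokens = text.split()
    out, i = [], 0
    while i < len(tokens):
        if random.random() < mask_ratio / mean_span_len:
            span_len = max(1, int(random.expovariate(1 / mean_span_len)))
            out.append(MASK)   # the whole span becomes a single mask token
            i += span_len
        else:
            out.append(tokens[i])
            i += 1
    return " ".join(out)

doc = "BART is a denoising autoencoder. It corrupts text with noise. It learns to reconstruct the original."
print(infill_spans(shuffle_sentences(doc)))
```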
no code implementations • ACL 2019 • Nima Pourdamghani, Nada Aldarrab, Marjan Ghazvininejad, Kevin Knight, Jonathan May
Given a rough, word-by-word gloss of a source language sentence, target language natives can uncover the latent, fully-fluent rendering of the translation.
2 code implementations • IJCNLP 2019 • Marjan Ghazvininejad, Omer Levy, Yinhan Liu, Luke Zettlemoyer
Most machine translation systems generate text autoregressively from left to right.
no code implementations • WS 2019 • Vladimir Karpukhin, Omer Levy, Jacob Eisenstein, Marjan Ghazvininejad
We consider the problem of making machine translation more robust to character-level variation at the source side, such as typos.
no code implementations • WS 2018 • Nanyun Peng, Marjan Ghazvininejad, Jonathan May, Kevin Knight
We present a general framework of analyzing existing story corpora to generate controllable and creative new stories.
no code implementations • NAACL 2018 • Nima Pourdamghani, Marjan Ghazvininejad, Kevin Knight
We present a method for improving word alignments using word similarities.
no code implementations • NAACL 2018 • Marjan Ghazvininejad, Yejin Choi, Kevin Knight
We present the first neural poetry translation system.
2 code implementations • 7 Feb 2017 • Marjan Ghazvininejad, Chris Brockett, Ming-Wei Chang, Bill Dolan, Jianfeng Gao, Wen-tau Yih, Michel Galley
We generalize the widely-used Seq2Seq approach by conditioning responses on both conversation history and external "facts", allowing the model to be versatile and applicable in an open-domain setting.
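A minimal sketch of the conditioning idea at the input level, assuming a generic encoder-decoder: retrieved facts and the conversation history are joined into a single encoder-side sequence with separator tokens. The separator, the retrieval step, and the formatting are illustrative assumptions, not necessarily the paper's exact architecture.

```python
# Minimal sketch of conditioning a seq2seq response model on both conversation
# history and retrieved external facts by concatenating them into one encoder
# input. Separator token and retrieval step are illustrative placeholders.

def build_grounded_input(history, facts, sep="</s>"):
    """history: list of prior turns; facts: list of retrieved fact strings."""
    return f" {sep} ".join(facts + history)

history = ["Any good sushi places nearby?", "Try Umi on 5th Avenue."]
facts = ["Umi serves omakase and is open until 10pm."]
encoder_input = build_grounded_input(history, facts)
print(encoder_input)  # fed to the encoder; the decoder generates the next response
```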
no code implementations • 7 May 2015 • Bharath Sankaran, Marjan Ghazvininejad, Xinran He, David Kale, Liron Cohen
Set functions, and specifically submodular set functions, characterize a wide variety of naturally occurring optimization problems, and the property of submodularity of set functions has deep theoretical consequences with wide-ranging applications.
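A tiny numeric check of the diminishing-returns property that defines submodularity, using a set-coverage function f(S) = |union of the chosen sets|: adding an element to a smaller collection never helps less than adding it to a larger collection containing it. The ground sets below are arbitrary examples, not data from the paper.

```python
# Numeric check of the diminishing-returns definition of submodularity for a
# coverage function f(S) = size of the union of the chosen sets.
# The ground sets below are arbitrary examples.

items = {
    "a": {1, 2, 3},
    "b": {3, 4},
    "c": {4, 5, 6},
}

def coverage(selection):
    covered = set()
    for i in selection:
        covered |= items[i]
    return len(covered)

A = {"a"}            # A is a subset of B
B = {"a", "b"}
x = "c"

gain_A = coverage(A | {x}) - coverage(A)   # marginal gain of x given A
gain_B = coverage(B | {x}) - coverage(B)   # marginal gain of x given B
print(gain_A, gain_B)                      # 3 >= 2: diminishing returns holds
```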
no code implementations • CVPR 2013 • Amirreza Shaban, Hamid R. Rabiee, Mehrdad Farajtabar, Marjan Ghazvininejad
By exploiting the local similarity between a descriptor and its nearby bases, a global measure of the descriptor's association with all the bases is computed.