1 code implementation • EACL (AdaptNLP) 2021 • Mandy Guo, Yinfei Yang, Daniel Cer, Qinlan Shen, Noah Constant
Retrieval question answering (ReQA) is the task of retrieving a sentence-level answer to a question from an open corpus (Ahmad et al., 2019). This dataset paper presents MultiReQA, a new multi-domain ReQA evaluation suite composed of eight retrieval QA tasks drawn from publicly available QA datasets.
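The ReQA setup can be sketched as embedding-based nearest-neighbor retrieval: encode the question and all candidate answer sentences, then return the highest-scoring candidate. The `embed` function below is a hypothetical stand-in, not the paper's encoder; a real system would use a trained sentence-embedding model.

```python
import numpy as np

def embed(texts):
    # Placeholder encoder for illustration only: hashes characters
    # into a small fixed-size vector. A real ReQA system would use a
    # trained dual encoder (e.g. a sentence-embedding model).
    vecs = np.zeros((len(texts), 16))
    for i, t in enumerate(texts):
        for j, ch in enumerate(t):
            vecs[i, j % 16] += ord(ch)
    # L2-normalize so a dot product equals cosine similarity
    return vecs / np.linalg.norm(vecs, axis=1, keepdims=True)

def retrieve_answer(question, candidate_sentences):
    """Return the candidate sentence closest to the question in embedding space."""
    q = embed([question])           # (1, d)
    c = embed(candidate_sentences)  # (n, d)
    scores = (c @ q.T).ravel()      # cosine similarities
    return candidate_sentences[int(np.argmax(scores))]
```

The key property of this setup is that candidate embeddings can be precomputed and indexed, so answering a question is a single nearest-neighbor lookup over the corpus.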
no code implementations • Findings (EMNLP) 2021 • Aashi Jain, Mandy Guo, Krishna Srinivasan, Ting Chen, Sneha Kudugunta, Chao Jia, Yinfei Yang, Jason Baldridge
Both image-caption pairs and translation pairs provide the means to learn deep representations of and connections between languages.
1 code implementation • 18 May 2023 • David Uthus, Santiago Ontañón, Joshua Ainslie, Mandy Guo
We present our work on developing a multilingual, efficient text-to-text transformer that is suitable for handling long inputs.
2 code implementations • 9 May 2023 • Andrea Burns, Krishna Srinivasan, Joshua Ainslie, Geoff Brown, Bryan A. Plummer, Kate Saenko, Jianmo Ni, Mandy Guo
Webpages have been a rich resource for language and vision-language tasks.
1 code implementation • 5 May 2023 • Andrea Burns, Krishna Srinivasan, Joshua Ainslie, Geoff Brown, Bryan A. Plummer, Kate Saenko, Jianmo Ni, Mandy Guo
Webpages have been a rich, scalable resource for vision-language and language-only tasks.
no code implementations • 23 Mar 2023 • Haoxuan You, Mandy Guo, Zhecan Wang, Kai-Wei Chang, Jason Baldridge, Jiahui Yu
The field of vision and language has witnessed a proliferation of pre-trained foundation models.
no code implementations • 17 Mar 2023 • Joshua Ainslie, Tao Lei, Michiel de Jong, Santiago Ontañón, Siddhartha Brahma, Yury Zemlyanskiy, David Uthus, Mandy Guo, James Lee-Thorp, Yi Tay, Yun-Hsuan Sung, Sumit Sanghai
Many natural language processing tasks benefit from long inputs, but processing long documents with Transformers is expensive: not only because of quadratic attention complexity, but also because feedforward and projection layers are applied to every token.
Ranked #1 on Long-range modeling on SCROLLS
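The cost claim above can be checked with back-of-the-envelope arithmetic. The constants and layer dimensions below are illustrative assumptions, not values from the paper; the point is only that for long inputs the per-token feedforward cost is of the same order as the quadratic attention cost, so both are worth reducing.

```python
def transformer_layer_flops(n, d_model, d_ff):
    """Rough multiply counts for one Transformer layer (constants omitted)."""
    attention = n * n * d_model       # quadratic in sequence length n
    feedforward = n * d_model * d_ff  # linear in n, but applied to every token
    return attention, feedforward

# Illustrative long-input setting: 16k tokens, d_model=1024, d_ff=4096
attn, ffn = transformer_layer_flops(n=16384, d_model=1024, d_ff=4096)
```

With these assumed dimensions the attention term is only a few times larger than the feedforward term, which is why skipping heavy layers for most tokens (as conditional-computation approaches do) pays off on long documents.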
1 code implementation • Findings (NAACL) 2022 • Mandy Guo, Joshua Ainslie, David Uthus, Santiago Ontanon, Jianmo Ni, Yun-Hsuan Sung, Yinfei Yang
Recent work has shown that either (1) increasing the input length or (2) increasing model size can improve the performance of Transformer-based neural models.
Ranked #1 on Text Summarization on BigPatent
no code implementations • 10 Sep 2021 • Aashi Jain, Mandy Guo, Krishna Srinivasan, Ting Chen, Sneha Kudugunta, Chao Jia, Yinfei Yang, Jason Baldridge
Both image-caption pairs and translation pairs provide the means to learn deep representations of and connections between languages.
Ranked #1 on Semantic Image-Text Similarity on CxC
no code implementations • 30 Jul 2021 • Xavier Garcia, Noah Constant, Mandy Guo, Orhan Firat
In this work, we take the first steps towards building a universal rewriter: a model capable of rewriting text in any language to exhibit a wide variety of attributes, including styles and languages, while preserving as much of the original semantics as possible.
1 code implementation • ACL 2021 • Parker Riley, Noah Constant, Mandy Guo, Girish Kumar, David Uthus, Zarana Parekh
Unlike previous approaches requiring style-labeled training data, our method makes use of readily-available unlabeled text by relying on the implicit connection in style between adjacent sentences, and uses labeled data only at inference time.
no code implementations • ACL 2021 • Yinfei Yang, Ning Jin, Kuo Lin, Mandy Guo, Daniel Cer
Independently computing embeddings for questions and answers results in late fusion of information related to matching questions to their answers.
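"Late fusion" here refers to the dual-encoder pattern: each side is embedded independently, and the only interaction between a question and an answer is a final similarity score. A minimal sketch, assuming precomputed (already L2-normalized) embedding matrices:

```python
import numpy as np

def late_fusion_scores(question_vecs, answer_vecs):
    """Dual-encoder ("late fusion") scoring: questions and answers are
    embedded independently; they interact only through a final dot
    product. Returns a (num_questions, num_answers) score matrix."""
    return question_vecs @ answer_vecs.T
```

Because the two towers never see each other's tokens, fine-grained question-answer interactions are lost; that limitation is what motivates moving some fusion earlier in the model.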
no code implementations • 28 Sep 2020 • Parker Riley, Noah Constant, Mandy Guo, Girish Kumar, David Uthus, Zarana Parekh
We present a novel approach to the challenging problem of label-free text style transfer.
1 code implementation • 5 May 2020 • Mandy Guo, Yinfei Yang, Daniel Cer, Qinlan Shen, Noah Constant
Retrieval question answering (ReQA) is the task of retrieving a sentence-level answer to a question from an open corpus (Ahmad et al., 2019). This paper presents MultiReQA, a new multi-domain ReQA evaluation suite composed of eight retrieval QA tasks drawn from publicly available QA datasets.
no code implementations • 27 Aug 2019 • Dokook Choe, Rami Al-Rfou, Mandy Guo, Heeyoung Lee, Noah Constant
Purely character-based language models (LMs) have been lagging in quality on large scale datasets, and current state-of-the-art LMs rely on word tokenization.
no code implementations • ACL 2020 • Yinfei Yang, Daniel Cer, Amin Ahmad, Mandy Guo, Jax Law, Noah Constant, Gustavo Hernandez Abrego, Steve Yuan, Chris Tar, Yun-Hsuan Sung, Brian Strope, Ray Kurzweil
We introduce two pre-trained, retrieval-focused multilingual sentence encoding models, respectively based on the Transformer and CNN model architectures.
no code implementations • WS 2019 • Mandy Guo, Yinfei Yang, Keith Stevens, Daniel Cer, Heming Ge, Yun-Hsuan Sung, Brian Strope, Ray Kurzweil
We explore using multilingual document embeddings for nearest neighbor mining of parallel data.
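Nearest-neighbor mining of parallel data can be sketched as follows: for each source-language document embedding, find its most similar target-language embedding and keep the pair if the similarity clears a threshold. The threshold value and the use of plain cosine similarity are simplifying assumptions; practical systems often use more robust scoring.

```python
import numpy as np

def mine_parallel_pairs(src_vecs, tgt_vecs, threshold=0.8):
    """For each source document, take its nearest target document by
    cosine similarity; keep the pair if the score clears the threshold.
    Returns a list of (src_index, tgt_index, score) tuples."""
    src = src_vecs / np.linalg.norm(src_vecs, axis=1, keepdims=True)
    tgt = tgt_vecs / np.linalg.norm(tgt_vecs, axis=1, keepdims=True)
    sims = src @ tgt.T
    pairs = []
    for i, row in enumerate(sims):
        j = int(np.argmax(row))
        if row[j] >= threshold:
            pairs.append((i, j, float(row[j])))
    return pairs
```

Mining at the document level, as in this work, narrows the search space before any sentence-level alignment is attempted.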
no code implementations • 22 Feb 2019 • Yinfei Yang, Gustavo Hernandez Abrego, Steve Yuan, Mandy Guo, Qinlan Shen, Daniel Cer, Yun-Hsuan Sung, Brian Strope, Ray Kurzweil
On the UN document-level retrieval task, document embeddings achieve around 97% P@1 for all language pairs tested.
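P@1 (precision at 1) is the fraction of queries whose top-ranked result is the gold document. A minimal sketch:

```python
def precision_at_1(ranked_results, gold):
    """P@1: fraction of queries whose top-ranked result matches the gold item.

    ranked_results: list of ranked result lists, one per query
    gold: list of gold items, one per query
    """
    hits = sum(1 for results, g in zip(ranked_results, gold) if results[0] == g)
    return hits / len(gold)
```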
2 code implementations • 9 Aug 2018 • Rami Al-Rfou, Dokook Choe, Noah Constant, Mandy Guo, Llion Jones
LSTMs and other RNN variants have shown strong performance on character-level language modeling.
Ranked #8 on Language Modelling on Hutter Prize
no code implementations • WS 2018 • Mandy Guo, Qinlan Shen, Yinfei Yang, Heming Ge, Daniel Cer, Gustavo Hernandez Abrego, Keith Stevens, Noah Constant, Yun-Hsuan Sung, Brian Strope, Ray Kurzweil
This paper presents an effective approach for parallel corpus mining using bilingual sentence embeddings.