Search Results for author: Daniel Cer

Found 40 papers, 14 papers with code

MultiReQA: A Cross-Domain Evaluation for Retrieval Question Answering Models

1 code implementation EACL (AdaptNLP) 2021 Mandy Guo, Yinfei Yang, Daniel Cer, Qinlan Shen, Noah Constant

Retrieval question answering (ReQA) is the task of retrieving a sentence-level answer to a question from an open corpus (Ahmad et al., 2019). This dataset paper presents MultiReQA, a new multi-domain ReQA evaluation suite composed of eight retrieval QA tasks drawn from publicly available QA datasets.

Information Retrieval Question Answering +3

Transforming LLMs into Cross-modal and Cross-lingual Retrieval Systems

no code implementations 2 Apr 2024 Frank Palma Gomez, Ramon Sanabria, Yun-Hsuan Sung, Daniel Cer, Siddharth Dalmia, Gustavo Hernandez Abrego

Our multi-modal LLM-based retrieval system is capable of matching speech and text in 102 languages despite only training on 21 languages.

Machine Translation Retrieval +1

Gemma: Open Models Based on Gemini Research and Technology

no code implementations 13 Mar 2024 Gemma Team, Thomas Mesnard, Cassidy Hardin, Robert Dadashi, Surya Bhupatiraju, Shreya Pathak, Laurent Sifre, Morgane Rivière, Mihir Sanjay Kale, Juliette Love, Pouya Tafti, Léonard Hussenot, Pier Giuseppe Sessa, Aakanksha Chowdhery, Adam Roberts, Aditya Barua, Alex Botev, Alex Castro-Ros, Ambrose Slone, Amélie Héliou, Andrea Tacchetti, Anna Bulanova, Antonia Paterson, Beth Tsai, Bobak Shahriari, Charline Le Lan, Christopher A. Choquette-Choo, Clément Crepy, Daniel Cer, Daphne Ippolito, David Reid, Elena Buchatskaya, Eric Ni, Eric Noland, Geng Yan, George Tucker, George-Christian Muraru, Grigory Rozhdestvenskiy, Henryk Michalewski, Ian Tenney, Ivan Grishchenko, Jacob Austin, James Keeling, Jane Labanowski, Jean-Baptiste Lespiau, Jeff Stanway, Jenny Brennan, Jeremy Chen, Johan Ferret, Justin Chiu, Justin Mao-Jones, Katherine Lee, Kathy Yu, Katie Millican, Lars Lowe Sjoesund, Lisa Lee, Lucas Dixon, Machel Reid, Maciej Mikuła, Mateo Wirth, Michael Sharman, Nikolai Chinaev, Nithum Thain, Olivier Bachem, Oscar Chang, Oscar Wahltinez, Paige Bailey, Paul Michel, Petko Yotov, Rahma Chaabouni, Ramona Comanescu, Reena Jana, Rohan Anil, Ross Mcilroy, Ruibo Liu, Ryan Mullins, Samuel L Smith, Sebastian Borgeaud, Sertan Girgin, Sholto Douglas, Shree Pandya, Siamak Shakeri, Soham De, Ted Klimenko, Tom Hennigan, Vlad Feinberg, Wojciech Stokowiec, Yu-Hui Chen, Zafarali Ahmed, Zhitao Gong, Tris Warkentin, Ludovic Peran, Minh Giang, Clément Farabet, Oriol Vinyals, Jeff Dean, Koray Kavukcuoglu, Demis Hassabis, Zoubin Ghahramani, Douglas Eck, Joelle Barral, Fernando Pereira, Eli Collins, Armand Joulin, Noah Fiedel, Evan Senter, Alek Andreev, Kathleen Kenealy

This work introduces Gemma, a family of lightweight, state-of-the-art open models built from the research and technology used to create Gemini models.

Leveraging LLMs for Synthesizing Training Data Across Many Languages in Multilingual Dense Retrieval

1 code implementation 10 Nov 2023 Nandan Thakur, Jianmo Ni, Gustavo Hernández Ábrego, John Wieting, Jimmy Lin, Daniel Cer

There has been limited success for dense retrieval models in multilingual retrieval, due to uneven and scarce training data available across multiple languages.

Language Modelling Large Language Model +1

Overcoming Catastrophic Forgetting in Zero-Shot Cross-Lingual Generation

1 code implementation 25 May 2022 Tu Vu, Aditya Barua, Brian Lester, Daniel Cer, Mohit Iyyer, Noah Constant

In this paper, we explore the challenging problem of performing a generative task in a target language when labeled data is only available in English, using summarization as a case study.

Cross-Lingual Transfer Machine Translation +1

SPoT: Better Frozen Model Adaptation through Soft Prompt Transfer

no code implementations ACL 2022 Tu Vu, Brian Lester, Noah Constant, Rami Al-Rfou, Daniel Cer

Finally, we propose an efficient retrieval approach that interprets task prompts as task embeddings to identify similar tasks and predict the most transferable source tasks for a novel target task.

Language Modelling Retrieval +1

A Simple and Effective Method To Eliminate the Self Language Bias in Multilingual Representations

1 code implementation EMNLP 2021 ZiYi Yang, Yinfei Yang, Daniel Cer, Eric Darve

A simple but highly effective method "Language Information Removal (LIR)" factors out language identity information from semantic related components in multilingual representations pre-trained on multi-monolingual data.

Cross-Lingual Transfer Retrieval

Sentence-T5: Scalable Sentence Encoders from Pre-trained Text-to-Text Models

2 code implementations Findings (ACL) 2022 Jianmo Ni, Gustavo Hernández Ábrego, Noah Constant, Ji Ma, Keith B. Hall, Daniel Cer, Yinfei Yang

To support our investigation, we establish a new sentence representation transfer benchmark, SentGLUE, which extends the SentEval toolkit to nine tasks from the GLUE benchmark.

Contrastive Learning Semantic Textual Similarity +3

NT5?! Training T5 to Perform Numerical Reasoning

1 code implementation 15 Apr 2021 Peng-Jian Yang, Ying Ting Chen, Yuechan Chen, Daniel Cer

Numerical reasoning over text (NRoT) presents unique challenges that are not well addressed by existing pre-training objectives.

Reading Comprehension

Universal Sentence Representation Learning with Conditional Masked Language Model

no code implementations EMNLP 2021 ZiYi Yang, Yinfei Yang, Daniel Cer, Jax Law, Eric Darve

This paper presents a novel training method, Conditional Masked Language Modeling (CMLM), to effectively learn sentence representations on large scale unlabeled corpora.

Language Modelling Masked Language Modeling +4

SeqGenSQL -- A Robust Sequence Generation Model for Structured Query Language

2 code implementations 7 Nov 2020 Ning Li, Bethany Keller, Mark Butler, Daniel Cer

We explore using T5 (Raffel et al., 2019) to directly translate natural language questions into SQL statements.

Text Generation Text-To-SQL

Neural Retrieval for Question Answering with Cross-Attention Supervised Data Augmentation

no code implementations ACL 2021 Yinfei Yang, Ning Jin, Kuo Lin, Mandy Guo, Daniel Cer

Independently computing embeddings for questions and answers results in late fusion of information related to matching questions to their answers.

Data Augmentation Question Answering +1

Language-agnostic BERT Sentence Embedding

6 code implementations ACL 2022 Fangxiaoyu Feng, Yinfei Yang, Daniel Cer, Naveen Arivazhagan, Wei Wang

While BERT is an effective method for learning monolingual sentence embeddings for semantic similarity and embedding-based transfer learning (Reimers and Gurevych, 2019), BERT-based cross-lingual sentence embeddings have yet to be explored.

Language Modelling Masked Language Modeling +11

MultiReQA: A Cross-Domain Evaluation for Retrieval Question Answering Models

1 code implementation 5 May 2020 Mandy Guo, Yinfei Yang, Daniel Cer, Qinlan Shen, Noah Constant

Retrieval question answering (ReQA) is the task of retrieving a sentence-level answer to a question from an open corpus (Ahmad et al., 2019). This paper presents MultiReQA, a new multi-domain ReQA evaluation suite composed of eight retrieval QA tasks drawn from publicly available QA datasets.

Information Retrieval Question Answering +2

ReQA: An Evaluation for End-to-End Answer Retrieval Models

1 code implementation WS 2019 Amin Ahmad, Noah Constant, Yinfei Yang, Daniel Cer

Popular QA benchmarks like SQuAD have driven progress on the task of identifying answer spans within a specific passage, with models now surpassing human performance.

Information Retrieval Question Answering +2

Universal Sentence Encoder

23 code implementations 29 Mar 2018 Daniel Cer, Yinfei Yang, Sheng-yi Kong, Nan Hua, Nicole Limtiaco, Rhomni St. John, Noah Constant, Mario Guajardo-Cespedes, Steve Yuan, Chris Tar, Yun-Hsuan Sung, Brian Strope, Ray Kurzweil

For both variants, we investigate and report the relationship between model complexity, resource consumption, the availability of transfer task training data, and task performance.

Conversational Response Selection Semantic Textual Similarity +7
