no code implementations • EMNLP 2021 • Jing Lu, Gustavo Hernandez Abrego, Ji Ma, Jianmo Ni, Yinfei Yang
In the context of neural passage retrieval, we study three promising techniques: synthetic data generation, negative sampling, and fusion.
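Of the three techniques, fusion is the most self-contained to illustrate. Below is a minimal sketch of reciprocal rank fusion (RRF), one standard way to fuse ranked lists from, say, a BM25 run and a dense-retrieval run; the constant k=60 and the toy runs are illustrative assumptions, not details from the paper.

```python
from collections import defaultdict

def reciprocal_rank_fusion(runs, k=60):
    """Fuse several ranked lists of doc ids into one.

    runs: list of rankings, each a list of doc ids, best first.
    k: smoothing constant from the original RRF formulation.
    """
    scores = defaultdict(float)
    for run in runs:
        for rank, doc_id in enumerate(run, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Toy example: a sparse (BM25) run and a dense run for the same query.
bm25_run = ["d3", "d1", "d7"]
dense_run = ["d1", "d9", "d3"]
print(reciprocal_rank_fusion([bm25_run, dense_run]))  # d1 and d3 rise to the top
```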
no code implementations • EMNLP 2020 • Bodhisattwa Prasad Majumder, Shuyang Li, Jianmo Ni, Julian McAuley
In this work, we perform the first large-scale analysis of discourse in media dialog and its impact on generative modeling of dialog turns, with a focus on interrogative patterns and use of external knowledge.
no code implementations • 15 Feb 2024 • Noveen Sachdeva, Benjamin Coleman, Wang-Cheng Kang, Jianmo Ni, Lichan Hong, Ed H. Chi, James Caverlee, Julian McAuley, Derek Zhiyuan Cheng
The training of large language models (LLMs) is expensive.
1 code implementation • 10 Nov 2023 • Nandan Thakur, Jianmo Ni, Gustavo Hernández Ábrego, John Wieting, Jimmy Lin, Daniel Cer
There has been limited success for dense retrieval models in multilingual retrieval, due to uneven and scarce training data available across multiple languages.
no code implementations • 15 Oct 2023 • Noveen Sachdeva, Zexue He, Wang-Cheng Kang, Jianmo Ni, Derek Zhiyuan Cheng, Julian McAuley
We study data distillation for auto-regressive machine learning tasks, where the input and output have a strict left-to-right causal structure.
no code implementations • 10 May 2023 • Wang-Cheng Kang, Jianmo Ni, Nikhil Mehta, Maheswaran Sathiamoorthy, Lichan Hong, Ed Chi, Derek Zhiyuan Cheng
In this paper, we conduct a thorough examination of both CF and LLMs within the classic task of user rating prediction, which involves predicting a user's rating for a candidate item based on their past ratings.
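A minimal sketch of the zero-shot prompting setup such a comparison implies: serialize a user's rating history into a prompt and ask an LLM for a 1-5 rating. The prompt wording and the `call_llm` stub are assumptions for illustration, not the paper's exact protocol.

```python
def build_rating_prompt(history, candidate):
    """history: list of (item_title, rating) pairs; candidate: item title."""
    lines = [f'- "{title}": {rating}/5' for title, rating in history]
    return (
        "A user rated the following items:\n"
        + "\n".join(lines)
        + f'\nPredict the user\'s rating for "{candidate}" (1-5). Answer with a number only.'
    )

def call_llm(prompt):
    # Placeholder: swap in any chat/completions API here.
    return "4"

history = [("The Matrix", 5), ("Titanic", 2), ("Inception", 5)]
prompt = build_rating_prompt(history, "Interstellar")
predicted = float(call_llm(prompt))  # compare against a CF baseline's RMSE/MAE
```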
2 code implementations • 9 May 2023 • Andrea Burns, Krishna Srinivasan, Joshua Ainslie, Geoff Brown, Bryan A. Plummer, Kate Saenko, Jianmo Ni, Mandy Guo
Webpages have been a rich resource for language and vision-language tasks.
1 code implementation • 5 May 2023 • Andrea Burns, Krishna Srinivasan, Joshua Ainslie, Geoff Brown, Bryan A. Plummer, Kate Saenko, Jianmo Ni, Mandy Guo
Webpages have been a rich, scalable resource for vision-language and language-only tasks.
no code implementations • 20 Dec 2022 • Jing Lu, Keith Hall, Ji Ma, Jianmo Ni
We present Hybrid Infused Reranking for Passages Retrieval (HYRR), a framework for training rerankers based on a hybrid of BM25 and neural retrieval models.
no code implementations • 17 Dec 2022 • David Uthus, Jianmo Ni
RISE is first trained on a retrieval task using a dual-encoder retrieval setup, and can subsequently be used to evaluate a generated summary given an input document, without gold reference summaries.
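The evaluation step then reduces to an embedding similarity. The sketch below uses sentence-transformers as a stand-in dual encoder; the checkpoint name is illustrative and is not the RISE model.

```python
from sentence_transformers import SentenceTransformer, util

# Stand-in encoder; RISE trains its own dual encoder on a retrieval objective.
encoder = SentenceTransformer("all-MiniLM-L6-v2")

def rise_style_score(document: str, summary: str) -> float:
    """Reference-free score: similarity between document and summary embeddings."""
    doc_emb, sum_emb = encoder.encode([document, summary], convert_to_tensor=True)
    return util.cos_sim(doc_emb, sum_emb).item()

print(rise_style_score("The committee approved the budget after a long debate.",
                       "Budget approved by committee."))
```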
1 code implementation • 15 Dec 2022 • Bernd Bohnet, Vinh Q. Tran, Pat Verga, Roee Aharoni, Daniel Andor, Livio Baldini Soares, Massimiliano Ciaramita, Jacob Eisenstein, Kuzman Ganchev, Jonathan Herzig, Kai Hui, Tom Kwiatkowski, Ji Ma, Jianmo Ni, Lierni Sestorain Saralegui, Tal Schuster, William W. Cohen, Michael Collins, Dipanjan Das, Donald Metzler, Slav Petrov, Kellie Webster
We take human annotations as a gold standard and show that a correlated automatic metric is suitable for development.
no code implementations • 12 Oct 2022 • Honglei Zhuang, Zhen Qin, Rolf Jagerman, Kai Hui, Ji Ma, Jing Lu, Jianmo Ni, Xuanhui Wang, Michael Bendersky
Recently, substantial progress has been made in text ranking based on pretrained language models such as BERT.
no code implementations • 10 Oct 2022 • Cicero Nogueira dos Santos, Zhe Dong, Daniel Cer, John Nham, Siamak Shakeri, Jianmo Ni, Yun-Hsuan Sung
The resulting soft knowledge prompts (KPs) are task independent and work as an external memory of the LMs.
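A minimal PyTorch sketch of the general soft-prompt mechanism (learned vectors prepended to a frozen LM's input embeddings); the sizes are placeholders and the training loop is omitted, and this shows the generic technique rather than the paper's exact KP injection.

```python
import torch
import torch.nn as nn

class SoftKnowledgePrompts(nn.Module):
    """Learnable prompt vectors prepended to a frozen LM's input embeddings."""

    def __init__(self, num_prompts: int, hidden_size: int):
        super().__init__()
        self.prompts = nn.Parameter(torch.randn(num_prompts, hidden_size) * 0.02)

    def forward(self, input_embeds: torch.Tensor) -> torch.Tensor:
        # input_embeds: (batch, seq_len, hidden); prepend the shared prompts.
        batch = input_embeds.size(0)
        expanded = self.prompts.unsqueeze(0).expand(batch, -1, -1)
        return torch.cat([expanded, input_embeds], dim=1)

kps = SoftKnowledgePrompts(num_prompts=20, hidden_size=768)
token_embeds = torch.randn(4, 32, 768)  # stand-in for the LM's embedding output
lm_input = kps(token_embeds)            # (4, 52, 768): prompts act as memory slots
```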
no code implementations • 23 Sep 2022 • Zhuyun Dai, Vincent Y. Zhao, Ji Ma, Yi Luan, Jianmo Ni, Jing Lu, Anton Bakalov, Kelvin Guu, Keith B. Hall, Ming-Wei Chang
To amplify the power of a few examples, we propose Prompt-based Query Generation for Retriever (Promptagator), which leverages large language models (LLMs) as a few-shot query generator, and creates task-specific retrievers based on the generated data.
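The core of the recipe is a few-shot prompt that turns passages into synthetic queries. A minimal sketch follows; the prompt template and the `generate` stub are illustrative assumptions rather than the paper's exact prompt.

```python
FEW_SHOT = """Passage: The Amazon rainforest produces about 20% of Earth's oxygen.
Query: how much oxygen does the amazon rainforest produce

Passage: {passage}
Query:"""

def generate(prompt: str) -> str:
    # Placeholder for a large language model call.
    return "what is the boiling point of water at sea level"

def synthesize_training_pair(passage: str):
    """Return a (synthetic query, passage) pair for training a task-specific retriever."""
    query = generate(FEW_SHOT.format(passage=passage)).strip()
    return query, passage

pair = synthesize_training_pair("Water boils at 100 degrees Celsius at sea level.")
# Many such pairs, optionally filtered for consistency, train the retriever.
```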
no code implementations • 27 Jun 2022 • Li Zhang, Yan Ge, Jun Ma, Jianmo Ni, Haiping Lu
In this paper, we propose to incorporate a knowledge graph (KG) for cross-domain recommendation (CDR), which enables items in different domains to share knowledge.
1 code implementation • 14 Apr 2022 • Zhe Dong, Jianmo Ni, Daniel M. Bikel, Enrique Alfonseca, Yuan Wang, Chen Qu, Imed Zitouni
We further explore and explain why parameter sharing in the projection layer significantly improves the efficacy of dual encoders, by directly probing the embedding spaces of the two encoder towers with the t-SNE algorithm.
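Probing of this kind is easy to reproduce in outline: embed the same items with both towers, project them jointly with t-SNE, and check whether the two point clouds overlap (shared projection) or separate. A minimal sketch with random stand-in embeddings:

```python
import numpy as np
from sklearn.manifold import TSNE

# Stand-ins for question-tower and answer-tower embeddings of the same items.
rng = np.random.default_rng(0)
q_emb = rng.normal(size=(200, 128))
a_emb = rng.normal(loc=0.5, size=(200, 128))

joint = np.vstack([q_emb, a_emb])
proj = TSNE(n_components=2, random_state=0).fit_transform(joint)

q_2d, a_2d = proj[:200], proj[200:]
# If the projection layer is shared, q_2d and a_2d tend to occupy one region;
# with separate parameters the two towers' points form distinct clusters.
```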
3 code implementations • 31 Mar 2022 • Adam Roberts, Hyung Won Chung, Anselm Levskaya, Gaurav Mishra, James Bradbury, Daniel Andor, Sharan Narang, Brian Lester, Colin Gaffney, Afroz Mohiuddin, Curtis Hawthorne, Aitor Lewkowycz, Alex Salcianu, Marc van Zee, Jacob Austin, Sebastian Goodman, Livio Baldini Soares, Haitang Hu, Sasha Tsvyashchenko, Aakanksha Chowdhery, Jasmijn Bastings, Jannis Bulian, Xavier Garcia, Jianmo Ni, Andrew Chen, Kathleen Kenealy, Jonathan H. Clark, Stephan Lee, Dan Garrette, James Lee-Thorp, Colin Raffel, Noam Shazeer, Marvin Ritter, Maarten Bosma, Alexandre Passos, Jeremy Maitin-Shepard, Noah Fiedel, Mark Omernick, Brennan Saeta, Ryan Sepassi, Alexander Spiridonov, Joshua Newlan, Andrea Gesmundo
Recent neural network-based language models have benefited greatly from scaling up the size of training datasets and the number of parameters in the models themselves.
1 code implementation • 14 Feb 2022 • Yi Tay, Vinh Q. Tran, Mostafa Dehghani, Jianmo Ni, Dara Bahri, Harsh Mehta, Zhen Qin, Kai Hui, Zhe Zhao, Jai Gupta, Tal Schuster, William W. Cohen, Donald Metzler
In this paper, we demonstrate that information retrieval can be accomplished with a single Transformer, in which all information about the corpus is encoded in the parameters of the model.
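In outline, retrieval becomes sequence generation: the model is fine-tuned to map a query string to a document identifier string. A minimal sketch of that framing with Hugging Face transformers; the checkpoint and data are illustrative, and the paper's docid schemes and training mix matter a great deal in practice.

```python
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Fine-tuning pairs map queries (and document text) to docid strings,
# so the corpus ends up memorized in the model parameters.
inputs = tokenizer("query: who wrote hamlet", return_tensors="pt")
labels = tokenizer("doc-42", return_tensors="pt").input_ids
loss = model(**inputs, labels=labels).loss  # optimize this over the corpus

# Retrieval is then just generation of a docid string.
generated = model.generate(**inputs, max_new_tokens=8)
print(tokenizer.decode(generated[0], skip_special_tokens=True))
```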
2 code implementations • 15 Dec 2021 • Jianmo Ni, Chen Qu, Jing Lu, Zhuyun Dai, Gustavo Hernández Ábrego, Ji Ma, Vincent Y. Zhao, Yi Luan, Keith B. Hall, Ming-Wei Chang, Yinfei Yang
With multi-stage training, surprisingly, scaling up the model size brings significant improvement on a variety of retrieval tasks, especially for out-of-domain generalization.
Ranked #9 on Zero-shot Text Search on BEIR
1 code implementation • Findings (NAACL) 2022 • Mandy Guo, Joshua Ainslie, David Uthus, Santiago Ontanon, Jianmo Ni, Yun-Hsuan Sung, Yinfei Yang
Recent work has shown that either (1) increasing the input length or (2) increasing model size can improve the performance of Transformer-based neural models.
Ranked #1 on Text Summarization on BigPatent
3 code implementations • ICLR 2022 • Vamsi Aribandi, Yi Tay, Tal Schuster, Jinfeng Rao, Huaixiu Steven Zheng, Sanket Vaibhav Mehta, Honglei Zhuang, Vinh Q. Tran, Dara Bahri, Jianmo Ni, Jai Gupta, Kai Hui, Sebastian Ruder, Donald Metzler
Despite the recent success of multi-task learning and transfer learning for natural language processing (NLP), few works have systematically studied the effect of scaling up the number of tasks during pre-training.
2 code implementations • Findings (ACL) 2022 • Jianmo Ni, Gustavo Hernández Ábrego, Noah Constant, Ji Ma, Keith B. Hall, Daniel Cer, Yinfei Yang
To support our investigation, we establish a new sentence representation transfer benchmark, SentGLUE, which extends the SentEval toolkit to nine tasks from the GLUE benchmark.
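At inference time, sentence encoders of the kind studied here reduce to pooling a pretrained encoder's token representations. A minimal sketch of mean pooling over a T5 encoder; the checkpoint choice is illustrative, not the paper's released model.

```python
import torch
from transformers import AutoTokenizer, T5EncoderModel

tokenizer = AutoTokenizer.from_pretrained("t5-small")
encoder = T5EncoderModel.from_pretrained("t5-small")

def embed(sentences):
    """Mean-pool encoder token states into fixed-size sentence embeddings."""
    batch = tokenizer(sentences, padding=True, return_tensors="pt")
    with torch.no_grad():
        states = encoder(**batch).last_hidden_state   # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1)      # ignore padding tokens
    return (states * mask).sum(1) / mask.sum(1)

emb = embed(["A man is playing a guitar.", "Someone plays guitar."])
print(torch.cosine_similarity(emb[0:1], emb[1:2]))  # transfer tasks build on these vectors
```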
1 code implementation • 17 May 2021 • Shuyang Li, Yufei Li, Jianmo Ni, Julian McAuley
The large population of home cooks with dietary restrictions is under-served by existing cooking resources and recipe generation models.
no code implementations • 23 Oct 2020 • Jing Lu, Gustavo Hernandez Abrego, Ji Ma, Jianmo Ni, Yinfei Yang
In this paper we explore the effects of negative sampling in dual encoder models used to retrieve passages for automatic question answering.
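The baseline that such studies vary from is the in-batch negatives objective: every other passage in the batch serves as a negative for a given question. A minimal PyTorch sketch of that loss, with random stand-in embeddings:

```python
import torch
import torch.nn.functional as F

def in_batch_negatives_loss(q_emb, p_emb, temperature=0.05):
    """q_emb, p_emb: (B, d) embeddings of aligned question/passage pairs.

    Row i of the similarity matrix treats passage i as the positive and
    the other B-1 passages in the batch as negatives.
    """
    q = F.normalize(q_emb, dim=-1)
    p = F.normalize(p_emb, dim=-1)
    sims = q @ p.T / temperature          # (B, B) similarity matrix
    targets = torch.arange(q.size(0))     # positives lie on the diagonal
    return F.cross_entropy(sims, targets)

q_emb, p_emb = torch.randn(8, 128), torch.randn(8, 128)
print(in_batch_negatives_loss(q_emb, p_emb))  # harder negative mining replaces the off-diagonals
```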
no code implementations • Findings (EMNLP) 2020 • Jianmo Ni, Chun-Nan Hsu, Amilcare Gentili, Julian McAuley
In this work, we focus on reporting abnormal findings on radiology images; instead of training on complete radiology reports, we propose a method to identify abnormal findings in the reports and to group them using unsupervised clustering and minimal rules.
no code implementations • 7 Apr 2020 • Bodhisattwa Prasad Majumder, Shuyang Li, Jianmo Ni, Julian McAuley
Compared to existing large-scale proxies for conversational data, language models trained on our dataset exhibit better zero-shot out-of-domain performance on existing spoken dialog datasets, demonstrating its usefulness in modeling real-world conversations.
1 code implementation • 4 Dec 2019 • Mengting Wan, Jianmo Ni, Rishabh Misra, Julian McAuley
However, these interactions can be biased by how the product is marketed, for example due to the selection of a particular human model in a product image.
no code implementations • IJCNLP 2019 • Jianmo Ni, Jiacheng Li, Julian McAuley
Several recent works have considered the problem of generating reviews (or 'tips') as a form of explanation as to why a recommendation might match a customer's interests.
1 code implementation • IJCNLP 2019 • Liliang Ren, Jianmo Ni, Julian McAuley
Experiments on both multi-domain and single-domain dialogue state tracking datasets show that our model not only scales easily with the increasing number of pre-defined domains and slots but also reaches state-of-the-art performance.
Ranked #14 on Multi-domain Dialogue State Tracking on MULTIWOZ 2.0
1 code implementation • IJCNLP 2019 • Bodhisattwa Prasad Majumder, Shuyang Li, Jianmo Ni, Julian McAuley
Existing approaches to recipe generation are unable to create recipes for users with culinary preferences but incomplete knowledge of ingredients in specific dishes.
Ranked #1 on Recipe Generation on Food.com
3 code implementations • NAACL 2019 • Jianmo Ni, Chenguang Zhu, Weizhu Chen, Julian McAuley
In this paper we propose a retriever-reader model that learns to attend on essential terms during the question answering process.
Multiple Choice Question Answering (MCQA)
1 code implementation • ACL 2018 • Jianmo Ni, Julian McAuley
In this paper, we focus on the problem of building assistive systems that can help users to write reviews.
no code implementations • IJCNLP 2017 • Jianmo Ni, Zachary C. Lipton, Sharad Vikram, Julian McAuley
Natural language approaches that model information like product reviews have proved useful in improving the performance of such methods, as reviews provide valuable auxiliary information that can be used to better estimate latent user preferences and item properties.