1 code implementation • Findings (ACL) 2022 • Simran Arora, Sen Wu, Enci Liu, Christopher Ré
We observe that proposed methods typically start with a base LM and data annotated with entity metadata, then change the model by modifying the architecture or introducing auxiliary loss terms to better capture entity knowledge.
no code implementations • 14 Feb 2025 • Anne Ouyang, Simon Guo, Simran Arora, Alex L. Zhang, William Hu, Christopher Ré, Azalia Mirhoseini
Efficient GPU kernels are crucial for building performant machine learning architectures, but writing them is a time-consuming challenge that requires significant expertise; therefore, we explore using language models (LMs) to automate kernel generation.
1 code implementation • 27 Oct 2024 • Benjamin F. Spector, Simran Arora, Aaryan Singhal, Daniel Y. Fu, Christopher Ré
We match cuBLAS and FlashAttention-3 on GEMM and attention inference performance, and outperform the strongest baselines by $10-40\%$ on the attention backward pass, $8\times$ on state space models, and $14\times$ on linear attention.
1 code implementation • 14 Oct 2024 • Michael Zhang, Simran Arora, Rahul Chalamala, Alan Wu, Benjamin Spector, Aaryan Singhal, Krithik Ramesh, Christopher Ré
When compared with prior approaches under the same compute budgets, LoLCATs significantly improves linearizing quality, closing the gap between linearized and original Llama 3.1 70B and 405B LLMs by 77.8% and 78.1% on 5-shot MMLU.
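For context, "linearizing" here means replacing softmax attention with a kernelized attention whose cost is linear in sequence length. A minimal, non-causal sketch of the difference (the feature map below is an illustrative stand-in; LoLCATs learns one that mimics softmax attention):

```python
import numpy as np

def softmax_attention(Q, K, V):
    # Standard attention: materializes an (n, n) matrix, O(n^2) in length n.
    A = np.exp(Q @ K.T)
    return (A / A.sum(-1, keepdims=True)) @ V

def linear_attention(Q, K, V, phi=lambda x: np.maximum(x, 0) + 1e-6):
    # Swap exp(q.k) for phi(q).phi(k): the (d, d) key-value summary is
    # built once, so total cost is O(n) in sequence length.
    Qf, Kf = phi(Q), phi(K)
    KV = Kf.T @ V                  # (d, d) summary, independent of n
    Z = Qf @ Kf.sum(axis=0)        # per-query normalizer
    return (Qf @ KV) / Z[:, None]

n, d = 16, 8
Q, K, V = (np.random.randn(n, d) for _ in range(3))
print(linear_attention(Q, K, V).shape)  # (16, 8)
```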
1 code implementation • 7 Jul 2024 • Simran Arora, Aman Timalsina, Aaryan Singhal, Benjamin Spector, Sabri Eyuboglu, Xinyi Zhao, Ashish Rao, Atri Rudra, Christopher Ré
Recurrent large language models that compete with Transformers in language modeling perplexity are emerging at a rapid rate (e.g., Mamba, RWKV).
1 code implementation • 14 Mar 2024 • Megha Srivastava, Simran Arora, Dan Boneh
The increasing compute demands of AI systems have led to the emergence of services that train models on behalf of clients lacking necessary resources.
3 code implementations • 28 Feb 2024 • Simran Arora, Sabri Eyuboglu, Michael Zhang, Aman Timalsina, Silas Alberti, Dylan Zinsley, James Zou, Atri Rudra, Christopher Ré
In this work, we explore whether we can improve language model efficiency (e.g., by reducing memory consumption) without compromising on recall.
no code implementations • 12 Feb 2024 • Jon Saad-Falcon, Daniel Y. Fu, Simran Arora, Neel Guha, Christopher Ré
Retrieval pipelines, an integral component of many machine learning systems, perform poorly in domains where documents are long (e.g., 10K tokens or more) and where identifying the relevant document requires synthesizing information across the entire text.
3 code implementations • 8 Dec 2023 • Simran Arora, Sabri Eyuboglu, Aman Timalsina, Isys Johnson, Michael Poli, James Zou, Atri Rudra, Christopher Ré
To close the gap between synthetics and real language, we develop a new formalization of the task called multi-query associative recall (MQAR) that better reflects actual language.
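A minimal sketch of what an MQAR-style example looks like: a prefix lists key-value pairs, later tokens repeat some keys, and the model must emit each key's bound value. Vocabulary, sizes, and layout here are illustrative choices, not the paper's exact generator:

```python
import random

def make_mqar_example(num_pairs=8, num_queries=4, vocab=range(100, 200), seed=0):
    # Draw disjoint key and value tokens, bind them, and list the pairs.
    rng = random.Random(seed)
    toks = rng.sample(list(vocab), 2 * num_pairs)
    keys, values = toks[:num_pairs], toks[num_pairs:]
    binding = dict(zip(keys, values))

    prefix = [t for kv in zip(keys, values) for t in kv]  # k1 v1 k2 v2 ...
    queries = rng.sample(keys, num_queries)               # keys asked again later
    targets = [binding[k] for k in queries]               # what the model must emit
    return prefix + queries, targets

inputs, targets = make_mqar_example()
print(inputs, targets)
```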
no code implementations • 28 Nov 2023 • Furui Cheng, Vilém Zouhar, Simran Arora, Mrinmaya Sachan, Hendrik Strobelt, Mennatallah El-Assady
To address this challenge, we propose an interactive system that helps users gain insight into the reliability of the generated text.
1 code implementation • NeurIPS 2023 • Daniel Y. Fu, Simran Arora, Jessica Grogan, Isys Johnson, Sabri Eyuboglu, Armin W. Thomas, Benjamin Spector, Michael Poli, Atri Rudra, Christopher Ré
We ask: are there performant architectures that can scale sub-quadratically along sequence length and model dimension?
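One ingredient the paper builds on is the Monarch matrix: a product of block-diagonal matrices interleaved with fixed permutations that replaces a dense matmul at sub-quadratic cost. A rough NumPy sketch of one common parameterization (shapes and the specific permutation are illustrative assumptions):

```python
import numpy as np

def monarch_matvec(x, L, R):
    # Monarch-style structured matvec for n = m*m: two block-diagonal
    # multiplies (m blocks of size m x m each) interleaved with
    # reshape/transpose permutations; O(n^1.5) work vs O(n^2) dense.
    m = L.shape[0]
    z = x.reshape(m, m)                  # fixed permutation via reshape
    z = np.einsum('bij,bj->bi', R, z)    # first block-diagonal multiply
    z = z.T                              # fixed permutation (transpose)
    z = np.einsum('bij,bj->bi', L, z)    # second block-diagonal multiply
    return z.T.reshape(-1)

m = 8
x = np.random.randn(m * m)
L, R = np.random.randn(m, m, m), np.random.randn(m, m, m)
print(monarch_matvec(x, L, R).shape)  # (64,)
```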
1 code implementation • 21 Jun 2023 • Soumya Chatterjee, Omar Khattab, Simran Arora
We introduce and define the novel problem of multi-distribution information retrieval (IR), where, given a query, systems must retrieve passages from multiple collections, each drawn from a different distribution.
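To make the setting concrete, here is the naive baseline of pooling all collections and taking a single global top-k; the embeddings and dot-product scorer are assumptions, and the paper studies how to do better than this (e.g., by allocating the retrieval budget across collections):

```python
import numpy as np

def retrieve_across_collections(query_vec, collections, k=5):
    # Naive baseline: pool every collection and take one global top-k.
    # `collections` maps a name to an (n_i, d) array of passage embeddings.
    pooled = []
    for name, embs in collections.items():
        for idx, score in enumerate(embs @ query_vec):  # dot-product relevance
            pooled.append((score, name, idx))
    return sorted(pooled, reverse=True)[:k]

d = 4
collections = {"wiki": np.random.randn(10, d), "email": np.random.randn(6, d)}
print(retrieve_across_collections(np.random.randn(d), collections))
```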
no code implementations • NeurIPS 2023 • Boxin Wang, Weixin Chen, Hengzhi Pei, Chulin Xie, Mintong Kang, Chenhui Zhang, Chejian Xu, Zidi Xiong, Ritik Dutta, Rylan Schaeffer, Sang T. Truong, Simran Arora, Mantas Mazeika, Dan Hendrycks, Zinan Lin, Yu Cheng, Sanmi Koyejo, Dawn Song, Bo Li
Yet, while the literature on the trustworthiness of GPT models remains limited, practitioners have proposed employing capable GPT models for sensitive applications such as healthcare and finance, where mistakes can be costly.
1 code implementation • 19 Apr 2023 • Simran Arora, Brandon Yang, Sabri Eyuboglu, Avanika Narayan, Andrew Hojel, Immanuel Trummer, Christopher Ré
Code synthesis is cheap, but far less accurate than directly processing each document with the LLM.
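The tradeoff in one picture: direct extraction calls the LLM on every document, while code synthesis pays one LLM call to write a plain function that then runs everywhere for free. A toy sketch, where `llm` is a hypothetical text-completion callable and the synthesized function is shown inline rather than actually generated:

```python
import re

def extract_direct(llm, documents, field):
    # Accurate but expensive: one LLM call per document.
    return [llm(f"Extract the {field} from this document:\n{doc}")
            for doc in documents]

def extract_with_synthesized_code(llm, documents, field):
    # Cheap but brittle: a real system would call llm once here to
    # generate an extractor; plain Python then processes every document.
    def synthesized_extractor(doc):  # the kind of code the LLM might return
        m = re.search(rf"{re.escape(field)}:\s*(.+)", doc)
        return m.group(1).strip() if m else None
    return [synthesized_extractor(doc) for doc in documents]
```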
3 code implementations • 5 Oct 2022 • Simran Arora, Avanika Narayan, Mayee F. Chen, Laurel Orr, Neel Guha, Kush Bhatia, Ines Chami, Frederic Sala, Christopher Ré
Prompting is a brittle process wherein small modifications to the prompt can cause large variations in the model predictions, so significant effort is dedicated to designing a painstakingly "perfect prompt" for a task; aggregating several imperfect prompts, as sketched below, is one way to reduce this brittleness.
Ranked #1 on Question Answering on Story Cloze
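A minimal sketch of that aggregation idea, using simple majority vote in place of the paper's weak-supervision aggregator (`llm` and the prompt templates are hypothetical):

```python
from collections import Counter

def ask_me_anything(llm, question, templates):
    # Reformat one input through several imperfect prompts, then combine.
    answers = [llm(t.format(question=question)) for t in templates]
    return Counter(answers).most_common(1)[0][0]  # majority vote

templates = [
    "Answer the question: {question}",
    "Q: {question}\nA:",
    "{question} The answer is:",
]
```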
1 code implementation • 27 May 2022 • Simran Arora, Christopher Ré
However, privacy and quality appear to be in tension in existing systems for personal tasks.
2 code implementations • 20 May 2022 • Avanika Narayan, Ines Chami, Laurel Orr, Simran Arora, Christopher Ré
Foundation Models (FMs) are models trained on large corpora of data that, at very large scale, can generalize to new tasks without any task-specific finetuning.
Ranked #11 on Entity Resolution on Amazon-Google
1 code implementation • 14 Mar 2022 • Simran Arora, Patrick Lewis, Angela Fan, Jacob Kahn, Christopher Ré
We first define the PUBLIC-PRIVATE AUTOREGRESSIVE INFORMATION RETRIEVAL (PAIR) privacy framework for the novel setting of retrieval over multiple privacy scopes; a toy sketch of the constraint appears below.
Ranked #1 on Multi-hop Question Answering on ConcurrentQA
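A toy illustration of the constraint PAIR formalizes for two-hop retrieval: sub-queries derived from private passages must never reach the public (untrusted) index. The index objects, their `search` method, and the passage attributes are hypothetical:

```python
def pair_two_hop(question, public_index, private_index, k=5):
    # Hop 1: the question itself reveals nothing private, so both
    # indices may see it.
    hop1 = public_index.search(question, k) + private_index.search(question, k)

    hop2 = []
    for passage in hop1:
        if passage.is_private:
            # Sub-queries built from private text stay on-device.
            hop2 += private_index.search(passage.text, k)
        else:
            hop2 += public_index.search(passage.text, k)
            hop2 += private_index.search(passage.text, k)
    return hop1 + hop2
```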
1 code implementation • 16 Oct 2021 • Simran Arora, Sen Wu, Enci Liu, Christopher Ré
Since rare entities and facts are prevalent in the queries users submit to popular applications such as search and personal assistant systems, improving the ability of LMs to reliably capture knowledge over rare entities is a pressing challenge studied in significant prior work.
2 code implementations • 16 Aug 2021 • Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, Russ Altman, Simran Arora, Sydney von Arx, Michael S. Bernstein, Jeannette Bohg, Antoine Bosselut, Emma Brunskill, Erik Brynjolfsson, Shyamal Buch, Dallas Card, Rodrigo Castellon, Niladri Chatterji, Annie Chen, Kathleen Creel, Jared Quincy Davis, Dora Demszky, Chris Donahue, Moussa Doumbouya, Esin Durmus, Stefano Ermon, John Etchemendy, Kawin Ethayarajh, Li Fei-Fei, Chelsea Finn, Trevor Gale, Lauren Gillespie, Karan Goel, Noah Goodman, Shelby Grossman, Neel Guha, Tatsunori Hashimoto, Peter Henderson, John Hewitt, Daniel E. Ho, Jenny Hong, Kyle Hsu, Jing Huang, Thomas Icard, Saahil Jain, Dan Jurafsky, Pratyusha Kalluri, Siddharth Karamcheti, Geoff Keeling, Fereshte Khani, Omar Khattab, Pang Wei Koh, Mark Krass, Ranjay Krishna, Rohith Kuditipudi, Ananya Kumar, Faisal Ladhak, Mina Lee, Tony Lee, Jure Leskovec, Isabelle Levent, Xiang Lisa Li, Xuechen Li, Tengyu Ma, Ali Malik, Christopher D. Manning, Suvir Mirchandani, Eric Mitchell, Zanele Munyikwa, Suraj Nair, Avanika Narayan, Deepak Narayanan, Ben Newman, Allen Nie, Juan Carlos Niebles, Hamed Nilforoshan, Julian Nyarko, Giray Ogut, Laurel Orr, Isabel Papadimitriou, Joon Sung Park, Chris Piech, Eva Portelance, Christopher Potts, aditi raghunathan, Rob Reich, Hongyu Ren, Frieda Rong, Yusuf Roohani, Camilo Ruiz, Jack Ryan, Christopher Ré, Dorsa Sadigh, Shiori Sagawa, Keshav Santhanam, Andy Shih, Krishnan Srinivasan, Alex Tamkin, Rohan Taori, Armin W. Thomas, Florian Tramèr, Rose E. Wang, William Wang, Bohan Wu, Jiajun Wu, Yuhuai Wu, Sang Michael Xie, Michihiro Yasunaga, Jiaxuan You, Matei Zaharia, Michael Zhang, Tianyi Zhang, Xikun Zhang, Yuhui Zhang, Lucia Zheng, Kaitlyn Zhou, Percy Liang
AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks.
1 code implementation • 20 Oct 2020 • Laurel Orr, Megan Leszczynski, Simran Arora, Sen Wu, Neel Guha, Xiao Ling, Christopher Ré
A challenge for named entity disambiguation (NED), the task of mapping textual mentions to entities in a knowledge base, is how to disambiguate entities that appear rarely in the training data, termed tail entities.
Ranked #1 on Entity Disambiguation on AIDA-CoNLL (Micro-F1 metric)
no code implementations • 1 Oct 2020 • Simran Arora, P. K. Sahoo
The gravitational action $L$ is given by an arbitrary function $f$ of the non-metricity $Q$ and the trace of the matter-energy momentum tensor $T$.
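For reference, the action in the $f(Q,T)$ literature is commonly written as below, with $g$ the metric determinant and $\mathcal{L}_m$ the matter Lagrangian; the exact normalization here is an assumption based on standard conventions:

$$ S = \int \left[ \frac{1}{16\pi}\, f(Q,T) + \mathcal{L}_m \right] \sqrt{-g}\; d^4x $$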
General Relativity and Quantum Cosmology • High Energy Physics - Theory
no code implementations • 1 Sep 2020 • Simran Arora, J. R. L. Santos, P. K. Sahoo
We are living in a golden age for experimental cosmology.
General Relativity and Quantum Cosmology • High Energy Physics - Theory
no code implementations • ACL 2020 • Simran Arora, Avner May, Jian Zhang, Christopher Ré
We study the settings in which deep contextual embeddings (e.g., BERT) give large performance improvements relative to classic pretrained embeddings (e.g., GloVe) and an even simpler baseline, random word embeddings, focusing on the impact of training set size and the linguistic properties of the task.
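The "even simpler baseline" is easy to make concrete: a frozen lookup table of random vectors. A minimal sketch, where the dimension and scale are illustrative choices rather than the paper's settings:

```python
import numpy as np

def random_embeddings(vocab, dim=300, seed=0):
    # Each word gets a fixed Gaussian vector; nothing is pretrained and
    # the table stays frozen, so all learning happens downstream.
    rng = np.random.default_rng(seed)
    return {w: rng.normal(0.0, 1.0 / np.sqrt(dim), size=dim) for w in vocab}

table = random_embeddings(["the", "movie", "was", "great"])
print(table["movie"].shape)  # (300,)
```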