1 code implementation • 13 Mar 2023 • Ying Sheng, Lianmin Zheng, Binhang Yuan, Zhuohan Li, Max Ryabinin, Daniel Y. Fu, Zhiqiang Xie, Beidi Chen, Clark Barrett, Joseph E. Gonzalez, Percy Liang, Christopher Ré, Ion Stoica, Ce Zhang
As a result, when running OPT-175B on a single 16GB GPU, FlexGen achieves significantly higher throughput than state-of-the-art offloading systems, reaching a generation throughput of 1 token/s for the first time with an effective batch size of 144.
1 code implementation • 21 Feb 2023 • Michael Poli, Stefano Massaroli, Eric Nguyen, Daniel Y. Fu, Tri Dao, Stephen Baccus, Yoshua Bengio, Stefano Ermon, Christopher Ré
Recent advances in deep learning have relied heavily on the use of large Transformers due to their ability to learn at scale.
Ranked #25 on Language Modelling on WikiText-103
1 code implementation • 13 Feb 2023 • Daniel Y. Fu, Elliot L. Epstein, Eric Nguyen, Armin W. Thomas, Michael Zhang, Tri Dao, Atri Rudra, Christopher Ré
We find that a key requirement to achieving high performance is keeping the convolution kernels smooth.
2 code implementations • 28 Dec 2022 • Tri Dao, Daniel Y. Fu, Khaled K. Saab, Armin W. Thomas, Atri Rudra, Christopher Ré
First, we use synthetic language modeling tasks to understand the gap between SSMs and attention.
Ranked #1 on Language Modelling on WikiText-103 (using extra training data)
no code implementations • 10 Jun 2022 • Trenton Chang, Daniel Y. Fu
In a simulation study, we investigate (1) what artifacts networking corruptions cause, (2) how such artifacts affect ML models, and (3) whether standard robustness methods can mitigate their negative effects.
2 code implementations • 27 May 2022 • Tri Dao, Daniel Y. Fu, Stefano Ermon, Atri Rudra, Christopher Ré
We also extend FlashAttention to block-sparse attention, yielding an approximate attention algorithm that is faster than any existing approximate attention method.
1 code implementation • Findings (ACL) 2022 • Megan Leszczynski, Daniel Y. Fu, Mayee F. Chen, Christopher Ré
Entity retrieval, that is, retrieving information about entity mentions in a query, is a key step in open-domain tasks such as question answering or fact checking.
1 code implementation • 15 Apr 2022 • Mayee F. Chen, Daniel Y. Fu, Avanika Narayan, Michael Zhang, Zhao Song, Kayvon Fatahalian, Christopher Ré
We first prove that adding a weighted class-conditional InfoNCE loss to SupCon controls the degree of spread.
1 code implementation • 24 Mar 2022 • Mayee F. Chen, Daniel Y. Fu, Dyah Adila, Michael Zhang, Frederic Sala, Kayvon Fatahalian, Christopher Ré
Despite the black-box nature of foundation models, we prove results characterizing how our approach improves performance and show that lift scales with the smoothness of label distributions in embedding space.
1 code implementation • 26 Jun 2020 • Mayee F. Chen, Daniel Y. Fu, Frederic Sala, Sen Wu, Ravi Teja Mullapudi, Fait Poms, Kayvon Fatahalian, Christopher Ré
Our goal is to enable machine learning systems to be trained interactively.
1 code implementation • ICML 2020 • Daniel Y. Fu, Mayee F. Chen, Frederic Sala, Sarah M. Hooper, Kayvon Fatahalian, Christopher Ré
In this work, we show that, for a class of latent variable models highly applicable to weak supervision, we can find a closed-form solution to model parameters, obviating the need for iterative solutions like stochastic gradient descent (SGD).
no code implementations • NeurIPS 2019 • Frederic Sala, Paroma Varma, Jason Fries, Daniel Y. Fu, Shiori Sagawa, Saelig Khattar, Ashwini Ramamoorthy, Ke Xiao, Kayvon Fatahalian, James Priest, Christopher Ré
Multi-resolution sources exacerbate this challenge due to complex correlations and sample complexity that scales in the length of the sequence.
1 code implementation • 7 Oct 2019 • Daniel Y. Fu, Will Crichton, James Hong, Xinwei Yao, Haotian Zhang, Anh Truong, Avanika Narayan, Maneesh Agrawala, Christopher Ré, Kayvon Fatahalian
Many real-world video analysis applications require the ability to identify domain-specific events in video, such as interviews and commercials in TV news broadcasts, or action sequences in film.
no code implementations • 23 Apr 2018 • Daniel Y. Fu, Emily S. Wang, Peter M. Krafft, Barbara J. Grosz
In the interest of learning how to control flocking behavior, recent work in the multiagent systems literature has explored the use of influencing agents for guiding flocking agents to face a target direction.