no code implementations • 7 Dec 2024 • Juechu Dong, Boyuan Feng, Driss Guessous, Yanbo Liang, Horace He
We introduce FlexAttention, a novel compiler-driven programming model that lets most attention variants be implemented in a few lines of idiomatic PyTorch code.
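The core idea of that programming model is that an attention variant is expressed as a pointwise modification of the raw attention score. Below is a toy, framework-free sketch of this idea (the function names and the list-of-lists tensor representation are illustrative assumptions, not the FlexAttention API; the real API lives in `torch.nn.attention.flex_attention`):

```python
import math

def attention(q, k, v, score_mod):
    """Toy single-head attention over lists of float vectors.

    score_mod(score, q_idx, kv_idx) mirrors the FlexAttention idea:
    an attention variant is just a pointwise rewrite of the raw score.
    """
    out = []
    for i, qi in enumerate(q):
        # Raw dot-product scores, passed through the user's score_mod.
        scores = [score_mod(sum(a * b for a, b in zip(qi, kj)), i, j)
                  for j, kj in enumerate(k)]
        # Numerically stable softmax over the modified scores.
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        weights = [e / z for e in exps]
        # Weighted sum of the value vectors.
        out.append([sum(w * vj[d] for w, vj in zip(weights, v))
                    for d in range(len(v[0]))])
    return out

# Causal masking expressed as a score_mod: future positions score -inf.
def causal(score, q_idx, kv_idx):
    return score if kv_idx <= q_idx else float("-inf")
```

Other variants (ALiBi biases, sliding windows, soft-capping) fit the same shape: swap in a different `score_mod` and the attention loop is untouched.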
10 code implementations • BigScience (ACL) 2022 • Sid Black, Stella Biderman, Eric Hallahan, Quentin Anthony, Leo Gao, Laurence Golding, Horace He, Connor Leahy, Kyle McDonell, Jason Phang, Michael Pieler, USVSN Sai Prashanth, Shivanshu Purohit, Laria Reynolds, Jonathan Tow, Ben Wang, Samuel Weinbach
We introduce GPT-NeoX-20B, a 20 billion parameter autoregressive language model trained on the Pile, whose weights will be made freely and openly available to the public through a permissive license.
Ranked #95 on Multi-task Language Understanding on MMLU
no code implementations • 15 Dec 2021 • James K. Reed, Zachary DeVito, Horace He, Ansley Ussery, Jason Ansel
Modern deep learning frameworks provide imperative, eager execution programming interfaces embedded in Python to provide a productive development experience.
2 code implementations • 30 Jun 2021 • Abhay Singh, Qian Huang, Sijia Linda Huang, Omkar Bhalerao, Horace He, Ser-Nam Lim, Austin R. Benson
Here, we demonstrate how simply adding a set of edges, which we call a *proposal set*, to the graph as a pre-processing step can improve the performance of several link prediction algorithms.
Ranked #1 on Link Property Prediction on ogbl-ddi
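The pre-processing step above can be sketched in a few lines. This is a minimal illustration with a common-neighbors scorer, assuming an adjacency-set graph representation; the function names are hypothetical and this is not the paper's implementation:

```python
def common_neighbors_scores(adj, candidates):
    """Score each candidate link (u, v) by its number of shared neighbors."""
    return {(u, v): len(adj[u] & adj[v]) for u, v in candidates}

def add_proposal_set(adj, proposals):
    """Pre-processing step: merge a proposal set of edges into the graph
    before running any downstream link prediction algorithm."""
    for u, v in proposals:
        adj[u].add(v)
        adj[v].add(u)
    return adj
```

On a path graph 0-1-2-3, the candidate link (0, 3) has zero common neighbors, so the scorer cannot rank it; adding the proposal edge (1, 3) first gives it a nonzero score.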
3 code implementations • 20 May 2021 • Dan Hendrycks, Steven Basart, Saurav Kadavath, Mantas Mazeika, Akul Arora, Ethan Guo, Collin Burns, Samir Puranik, Horace He, Dawn Song, Jacob Steinhardt
Recent models such as GPT-Neo can pass approximately 20% of the test cases of introductory problems, indicating that machine learning models are beginning to learn how to code.
Ranked #10 on Code Generation on APPS
21 code implementations • 31 Dec 2020 • Leo Gao, Stella Biderman, Sid Black, Laurence Golding, Travis Hoppe, Charles Foster, Jason Phang, Horace He, Anish Thite, Noa Nabeshima, Shawn Presser, Connor Leahy
Recent work has demonstrated that increased training dataset diversity improves general cross-domain knowledge and downstream generalization capability for large-scale language models.
Ranked #13 on Language Modelling on The Pile
no code implementations • 30 Nov 2020 • Benoit Steiner, Chris Cummins, Horace He, Hugh Leather
As machine learning techniques become ubiquitous, the efficiency of neural network implementations becomes paramount.
7 code implementations • ICLR 2021 • Qian Huang, Horace He, Abhay Singh, Ser-Nam Lim, Austin R. Benson
Graph Neural Networks (GNNs) are the predominant technique for learning over graphs.
Tasks: Node Classification on Non-Homophilic (Heterophilic) Graphs, Node Property Prediction
1 code implementation • NeurIPS 2020 • Qian Huang, Horace He, Abhay Singh, Yan Zhang, Ser-Nam Lim, Austin Benson
Incorporating relational reasoning into neural networks has greatly expanded their capabilities and scope.
2 code implementations • ICCV 2019 • Qian Huang, Isay Katsman, Horace He, Zeqi Gu, Serge Belongie, Ser-Nam Lim
We show that we can select a layer of the source model to perturb without any knowledge of the target models while achieving high transferability.
no code implementations • 4 Dec 2018 • Horace He, Aaron Lou, Qingxuan Jiang, Isay Katsman, Serge Belongie, Ser-Nam Lim
Research has shown that widely used deep neural networks are vulnerable to carefully crafted adversarial perturbations.
no code implementations • 20 Nov 2018 • Qian Huang, Zeqi Gu, Isay Katsman, Horace He, Pian Pawakapan, Zhiqiu Lin, Serge Belongie, Ser-Nam Lim
Neural networks are vulnerable to adversarial examples, malicious inputs crafted to fool trained models.