1 code implementation • 16 Mar 2023 • Michael Zhang, Khaled K. Saab, Michael Poli, Tri Dao, Karan Goel, Christopher Ré
For expressivity, we propose a new SSM parameterization based on the companion matrix -- a canonical representation for discrete-time processes -- which enables SpaceTime's SSM layers to learn desirable autoregressive processes.
1 code implementation • 12 Oct 2022 • Eric Nguyen, Karan Goel, Albert Gu, Gordon W. Downs, Preey Shah, Tri Dao, Stephen A. Baccus, Christopher Ré
On ImageNet-1k, S4ND exceeds the performance of a Vision Transformer baseline by $1. 5\%$ when training with a $1$D sequence of patches, and matches ConvNeXt when modeling images in $2$D.
2 code implementations • 23 Jun 2022 • Albert Gu, Ankit Gupta, Karan Goel, Christopher Ré
On the other hand, a recent variant of S4 called DSS showed that restricting the state matrix to be fully diagonal can still preserve the performance of the original model when using a specific initialization based on approximating S4's matrix.
6 code implementations • 20 Feb 2022 • Karan Goel, Albert Gu, Chris Donahue, Christopher Ré
SaShiMi yields state-of-the-art performance for unconditional waveform generation in the autoregressive setting.
2 code implementations • 8 Nov 2021 • Avanika Narayan, Piero Molino, Karan Goel, Willie Neiswanger, Christopher Ré
LBT provides a configurable interface for controlling training and customizing evaluation, a standardized training framework for eliminating confounding variables, and support for multi-objective evaluation.
7 code implementations • ICLR 2022 • Albert Gu, Karan Goel, Christopher Ré
A central goal of sequence modeling is designing a single principled model that can address sequence data across a range of modalities and tasks, particularly on long-range dependencies.
2 code implementations • NeurIPS 2021 • Albert Gu, Isys Johnson, Karan Goel, Khaled Saab, Tri Dao, Atri Rudra, Christopher Ré
Recurrent neural networks (RNNs), temporal convolutions, and neural differential equations (NDEs) are popular families of deep learning models for time-series data, each with unique strengths and tradeoffs in modeling power and computational efficiency.
Ranked #2 on Sequential Image Classification on Sequential MNIST
2 code implementations • 16 Aug 2021 • Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, Russ Altman, Simran Arora, Sydney von Arx, Michael S. Bernstein, Jeannette Bohg, Antoine Bosselut, Emma Brunskill, Erik Brynjolfsson, Shyamal Buch, Dallas Card, Rodrigo Castellon, Niladri Chatterji, Annie Chen, Kathleen Creel, Jared Quincy Davis, Dora Demszky, Chris Donahue, Moussa Doumbouya, Esin Durmus, Stefano Ermon, John Etchemendy, Kawin Ethayarajh, Li Fei-Fei, Chelsea Finn, Trevor Gale, Lauren Gillespie, Karan Goel, Noah Goodman, Shelby Grossman, Neel Guha, Tatsunori Hashimoto, Peter Henderson, John Hewitt, Daniel E. Ho, Jenny Hong, Kyle Hsu, Jing Huang, Thomas Icard, Saahil Jain, Dan Jurafsky, Pratyusha Kalluri, Siddharth Karamcheti, Geoff Keeling, Fereshte Khani, Omar Khattab, Pang Wei Koh, Mark Krass, Ranjay Krishna, Rohith Kuditipudi, Ananya Kumar, Faisal Ladhak, Mina Lee, Tony Lee, Jure Leskovec, Isabelle Levent, Xiang Lisa Li, Xuechen Li, Tengyu Ma, Ali Malik, Christopher D. Manning, Suvir Mirchandani, Eric Mitchell, Zanele Munyikwa, Suraj Nair, Avanika Narayan, Deepak Narayanan, Ben Newman, Allen Nie, Juan Carlos Niebles, Hamed Nilforoshan, Julian Nyarko, Giray Ogut, Laurel Orr, Isabel Papadimitriou, Joon Sung Park, Chris Piech, Eva Portelance, Christopher Potts, aditi raghunathan, Rob Reich, Hongyu Ren, Frieda Rong, Yusuf Roohani, Camilo Ruiz, Jack Ryan, Christopher Ré, Dorsa Sadigh, Shiori Sagawa, Keshav Santhanam, Andy Shih, Krishnan Srinivasan, Alex Tamkin, Rohan Taori, Armin W. Thomas, Florian Tramèr, Rose E. Wang, William Wang, Bohan Wu, Jiajun Wu, Yuhuai Wu, Sang Michael Xie, Michihiro Yasunaga, Jiaxuan You, Matei Zaharia, Michael Zhang, Tianyi Zhang, Xikun Zhang, Yuhui Zhang, Lucia Zheng, Kaitlyn Zhou, Percy Liang
AI is undergoing a paradigm shift with the rise of models (e. g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks.
no code implementations • 11 Aug 2021 • Laurel Orr, Atindriyo Sanyal, Xiao Ling, Karan Goel, Megan Leszczynski
The industrial machine learning pipeline requires iterating on model features, training and deploying models, and monitoring deployed models at scale.
1 code implementation • 1 Jul 2021 • Mayee Chen, Karan Goel, Nimit S. Sohoni, Fait Poms, Kayvon Fatahalian, Christopher Ré
If an unlabeled sample from the target distribution is available, along with a labeled sample from a possibly different source distribution, standard approaches such as importance weighting can be applied to estimate performance on the target.
no code implementations • NAACL 2021 • Karan Goel, Laurel Orr, Nazneen Fatema Rajani, Jesse Vig, Christopher R{\'e}
If not, how easily can such a system be repurposed for their use case?
no code implementations • NeurIPS 2021 • Albert Gu, Isys Johnson, Karan Goel, Khaled Kamal Saab, Tri Dao, Atri Rudra, Christopher Re
Recurrent neural networks (RNNs), temporal convolutions, and neural differential equations (NDEs) are popular families of deep learning models for time-series data, each with unique strengths and tradeoffs in modeling power and computational efficiency.
2 code implementations • ACL 2021 • Jesse Vig, Wojciech Kryściński, Karan Goel, Nazneen Fatema Rajani
Novel neural architectures, training strategies, and the availability of large-scale corpora haven been the driving force behind recent progress in abstractive text summarization.
2 code implementations • NAACL 2021 • Karan Goel, Nazneen Rajani, Jesse Vig, Samson Tan, Jason Wu, Stephan Zheng, Caiming Xiong, Mohit Bansal, Christopher Ré
Despite impressive performance on standard benchmarks, deep neural networks are often brittle when deployed in real-world systems.
1 code implementation • ICLR 2021 • Karan Goel, Albert Gu, Yixuan Li, Christopher Ré
Particularly concerning are models with inconsistent performance on specific subgroups of a class, e. g., exhibiting disparities in skin cancer classification in the presence or absence of a spurious bandage.
1 code implementation • ICLR 2019 • Karan Goel, Emma Brunskill
Given a dataset of time-series, the goal is to identify the latent sequence of steps common to them and label each time-series with the temporal extent of these procedural steps.
no code implementations • 17 Apr 2019 • Tong Mu, Karan Goel, Emma Brunskill
In many cases an intelligent agent may want to learn how to mimic a single observed demonstrated trajectory.
no code implementations • 21 Feb 2017 • Karan Goel, Christoph Dann, Emma Brunskill
Optimal stopping problems consider the question of deciding when to stop an observation-generating process in order to maximize a return.
1 code implementation • 12 Feb 2017 • Karan Goel, Shreya Rajpal, Mausam
We present Octopus, an AI agent to jointly balance three conflicting task objectives on a micro-crowdsourcing marketplace - the quality of work, total cost incurred, and time to completion.