2 code implementations • 8 Oct 2024 • Christopher Fifty, Ronald G. Junkins, Dennis Duan, Aniketh Iyer, Jerry W. Liu, Ehsan Amid, Sebastian Thrun, Christopher Ré
However, as vector quantization is non-differentiable, the gradient to the encoder flows around the vector quantization layer rather than through it, using a straight-through approximation.
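The straight-through trick can be illustrated with a minimal PyTorch sketch (a generic VQ-VAE-style quantizer, not the rotation method this paper proposes): the forward pass uses the quantized code, while the backward pass copies the decoder's gradient onto the encoder output unchanged.

```python
import torch
import torch.nn as nn

class VectorQuantizer(nn.Module):
    """Toy VQ layer illustrating the straight-through gradient estimator."""
    def __init__(self, num_codes: int = 512, dim: int = 64):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, dim)

    def forward(self, z_e: torch.Tensor) -> torch.Tensor:
        # Find the nearest codebook vector for each encoder output.
        dists = torch.cdist(z_e, self.codebook.weight)   # (batch, num_codes)
        codes = dists.argmin(dim=-1)                     # (batch,)
        z_q = self.codebook(codes)                       # (batch, dim)
        # Straight-through: the forward pass returns z_q, but the gradient
        # w.r.t. the output is copied verbatim onto z_e, skipping the argmin.
        return z_e + (z_q - z_e).detach()
```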
1 code implementation • 17 Oct 2023 • Christopher Fifty, Dennis Duan, Ronald G. Junkins, Ehsan Amid, Jure Leskovec, Christopher Ré, Sebastian Thrun
Large Language Models like ChatGPT demonstrate a remarkable capacity to learn new concepts during inference without any fine-tuning.
Ranked #1 on Few-Shot Image Classification on Tiered ImageNet 5-way (5-shot) (using extra training data)
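A minimal sketch of the analogy, assuming a frozen image encoder that produces fixed-size embeddings (the class name, dimensions, and head below are illustrative, not the paper's exact architecture): support examples and their labels form a sequence that a transformer attends over, so a new episode is classified in one forward pass with no fine-tuning.

```python
import torch
import torch.nn as nn

class InContextClassifier(nn.Module):
    """Sketch: few-shot classification as sequence modeling over an episode."""
    def __init__(self, dim: int = 256, n_way: int = 5, n_layers: int = 4):
        super().__init__()
        self.label_emb = nn.Embedding(n_way, dim)  # embeds support labels
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True)
        self.seq_model = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(dim, n_way)

    def forward(self, support_feats, support_labels, query_feat):
        # support_feats: (n_support, dim) from a frozen image encoder
        # support_labels: (n_support,) integer class ids within the episode
        # query_feat: (dim,) embedding of the query image
        support = support_feats + self.label_emb(support_labels)
        seq = torch.cat([support, query_feat.unsqueeze(0)], dim=0)
        out = self.seq_model(seq.unsqueeze(0)).squeeze(0)  # attend across episode
        return self.head(out[-1])  # logits for the query token
```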
no code implementations • 13 Oct 2023 • Christopher Fifty, Jure Leskovec, Sebastian Thrun
In this paper, we adapt the concepts underpinning in-context learning to develop a new algorithm for few-shot molecular property prediction.
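Reusing the `InContextClassifier` sketch above, a molecular variant only needs a featurizer in place of the image encoder; the toy hashed fingerprint below is a placeholder assumption for illustration, not the paper's representation.

```python
import torch

def hashed_fingerprint(smiles: str, dim: int = 256) -> torch.Tensor:
    """Toy stand-in for a real molecular featurizer (e.g., a learned encoder)."""
    vec = torch.zeros(dim)
    for a, b in zip(smiles, smiles[1:]):  # character bigrams as crude substructures
        vec[hash(a + b) % dim] += 1.0
    return vec / (vec.norm() + 1e-8)

# Binary property prediction as a 2-way episode: active vs. inactive support molecules.
support = torch.stack([hashed_fingerprint(s) for s in ["CCO", "CCN", "c1ccccc1", "CC(=O)O"]])
labels = torch.tensor([1, 1, 0, 0])
query = hashed_fingerprint("CCCO")

model = InContextClassifier(dim=256, n_way=2)  # from the sketch above
logits = model(support, labels, query)
```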
1 code implementation • 4 Feb 2023 • Christopher Fifty, Joseph M. Paggi, Ehsan Amid, Jure Leskovec, Ron Dror
However, many important molecular properties depend on complex molecular characteristics -- such as the various 3D geometries a molecule may adopt or the types of chemical interactions it can form -- that are not explicitly encoded in the feature space and must be approximated from small amounts of data.
no code implementations • 15 Sep 2022 • Ehsan Amid, Rohan Anil, Christopher Fifty, Manfred K. Warmuth
In this work, we propose a novel approach for layerwise representation learning of a trained neural network.
2 code implementations • 13 Jul 2022 • Aurko Roy, Rohan Anil, Guangda Lai, Benjamin Lee, Jeffrey Zhao, Shuyuan Zhang, Shibo Wang, Ye Zhang, Shen Wu, Rigel Swavely, Tao Yu, Phuong Dao, Christopher Fifty, Zhifeng Chen, Yonghui Wu
Transformer models have recently emerged as one of the foundational model architectures in natural language processing, and as a result, there has been significant interest and investment in scaling these models.
Ranked #7 on Language Modelling on C4
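The flavor of the n-gram augmentation can be sketched as follows (simplified: N-Grammer forms bigrams over latent cluster ids, while this sketch hashes raw token-id bigrams; the table sizes and hash are illustrative assumptions):

```python
import torch
import torch.nn as nn

class NGramAugmentedEmbedding(nn.Module):
    """Sketch of n-gram-augmented token embeddings."""
    def __init__(self, vocab: int = 32000, dim: int = 512, ngram_buckets: int = 200_000):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab, dim)
        self.ngram_emb = nn.Embedding(ngram_buckets, dim)
        self.buckets = ngram_buckets

    def forward(self, ids: torch.Tensor) -> torch.Tensor:
        # ids: (batch, seq_len) integer token ids
        tok = self.tok_emb(ids)
        # Form bigram ids from adjacent tokens and hash into a fixed table.
        prev = torch.roll(ids, shifts=1, dims=1)
        prev[:, 0] = 0                                    # pad the first position
        bigram = (ids * 1_000_003 + prev) % self.buckets  # cheap multiplicative hash
        return tok + self.ngram_emb(bigram)
```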
no code implementations • 31 Jan 2022 • Ehsan Amid, Rohan Anil, Christopher Fifty, Manfred K. Warmuth
In this paper, we update the step-size scale and the gain variables with exponentiated gradient updates instead.
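A minimal sketch of an exponentiated-gradient update for a positive gain (the hypergradient signal and `meta_lr` below are generic illustrations, not the paper's exact algorithm): multiplying by an exponential keeps the step-size scale positive by construction, with no clipping needed.

```python
import torch

def exponentiated_gradient_step(gain, hypergrad, meta_lr=0.01):
    """Multiplicative update: the gain stays positive by construction."""
    return gain * torch.exp(-meta_lr * hypergrad)

# Toy usage: adapt a per-parameter step-size scale from successive gradients.
# A common hypergradient signal is -g_t . g_{t-1} (negative when steps overshoot).
g_prev, g_curr = torch.randn(10), torch.randn(10)
gain = torch.ones(10)
gain = exponentiated_gradient_step(gain, -(g_curr * g_prev))
```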
no code implementations • 14 Dec 2021 • Bowen Zhang, Jiahui Yu, Christopher Fifty, Wei Han, Andrew M. Dai, Ruoming Pang, Fei Sha
We term this approach Co-training Videos and Images for Action Recognition (CoVeR).
Ranked #6 on Action Classification on MiT (using extra training data)
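A co-training loop of that shape can be sketched as follows (the shared `backbone` and per-dataset heads are assumptions for illustration, not the CoVeR implementation): each step draws a batch from every dataset and backpropagates a combined loss through one shared model.

```python
import torch
import torch.nn.functional as F

def cotrain_step(backbone, heads, batches, optimizer):
    """One co-training step over multiple datasets sharing one backbone.
    heads:   dict mapping dataset name -> classification head
    batches: dict mapping dataset name -> (inputs, labels)"""
    optimizer.zero_grad()
    loss = 0.0
    for name, (x, y) in batches.items():
        feats = backbone(x)  # shared features for images and videos alike
        loss = loss + F.cross_entropy(heads[name](feats), y)
    loss.backward()          # gradients accumulate across all datasets
    optimizer.step()
    return loss.item()
```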
1 code implementation • NeurIPS 2021 • Christopher Fifty, Ehsan Amid, Zhe Zhao, Tianhe Yu, Rohan Anil, Chelsea Finn
Multi-task learning can leverage information learned by one task to benefit the training of other tasks.
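The inter-task affinity idea behind this paper's task groupings can be sketched as a lookahead step: update shared parameters on one task and measure how another task's loss moves (a simplified sketch; it assumes callables `loss_i` and `loss_j` that evaluate each task's loss on a given model).

```python
import copy
import torch

def inter_task_affinity(model, loss_i, loss_j, lr=0.1):
    """Affinity of task i onto task j: 1 - L_j(after step on i) / L_j(before).
    Positive values mean task i's update helps task j."""
    before = loss_j(model).item()
    grads = torch.autograd.grad(loss_i(model), model.parameters())
    lookahead = copy.deepcopy(model)
    with torch.no_grad():
        for p, q, g in zip(model.parameters(), lookahead.parameters(), grads):
            q.copy_(p - lr * g)  # one SGD step on task i only
    after = loss_j(lookahead).item()
    return 1.0 - after / before  # assumes before != 0
```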
no code implementations • 13 Aug 2020 • Yuyan Wang, Zhe Zhao, Bo Dai, Christopher Fifty, Dong Lin, Lichan Hong, Ed H. Chi
A delicate balance between multi-task generalization and multi-objective optimization is therefore needed to strike a better trade-off between efficiency and generalization.
7 code implementations • 19 Feb 2019 • Felix Wu, Tianyi Zhang, Amauri Holanda de Souza Jr., Christopher Fifty, Tao Yu, Kilian Q. Weinberger
Graph Convolutional Networks (GCNs) and their variants have attracted significant attention and have become the de facto methods for learning graph representations.
Ranked #3 on Text Classification on Ohsumed
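The simplification behind SGC is compact enough to sketch directly (a dense-adjacency PyTorch version for clarity; practical implementations use sparse ops): collapse K graph convolutions into a single precomputed propagation S^K X, then fit a plain linear classifier on the result.

```python
import torch
import torch.nn as nn

def sgc_features(adj: torch.Tensor, x: torch.Tensor, k: int = 2) -> torch.Tensor:
    """Precompute S^K X, where S is the degree-normalized adjacency with self-loops."""
    a_hat = adj + torch.eye(adj.shape[0])                  # add self-loops
    d_inv_sqrt = a_hat.sum(dim=1).pow(-0.5)
    s = d_inv_sqrt[:, None] * a_hat * d_inv_sqrt[None, :]  # D^-1/2 (A+I) D^-1/2
    for _ in range(k):
        x = s @ x                                          # K rounds of propagation
    return x

# After propagation, SGC is just a linear (logistic-regression-style) classifier:
# feats = sgc_features(adj, x, k=2); logits = nn.Linear(feats.shape[1], n_classes)(feats)
```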