1 code implementation • 17 May 2023 • Rohan Anil, Andrew M. Dai, Orhan Firat, Melvin Johnson, Dmitry Lepikhin, Alexandre Passos, Siamak Shakeri, Emanuel Taropa, Paige Bailey, Zhifeng Chen, Eric Chu, Jonathan H. Clark, Laurent El Shafey, Yanping Huang, Kathy Meier-Hellstern, Gaurav Mishra, Erica Moreira, Mark Omernick, Kevin Robinson, Sebastian Ruder, Yi Tay, Kefan Xiao, Yuanzhong Xu, Yujing Zhang, Gustavo Hernandez Abrego, Junwhan Ahn, Jacob Austin, Paul Barham, Jan Botha, James Bradbury, Siddhartha Brahma, Kevin Brooks, Michele Catasta, Yong Cheng, Colin Cherry, Christopher A. Choquette-Choo, Aakanksha Chowdhery, Clément Crepy, Shachi Dave, Mostafa Dehghani, Sunipa Dev, Jacob Devlin, Mark Díaz, Nan Du, Ethan Dyer, Vlad Feinberg, Fangxiaoyu Feng, Vlad Fienber, Markus Freitag, Xavier Garcia, Sebastian Gehrmann, Lucas Gonzalez, Guy Gur-Ari, Steven Hand, Hadi Hashemi, Le Hou, Joshua Howland, Andrea Hu, Jeffrey Hui, Jeremy Hurwitz, Michael Isard, Abe Ittycheriah, Matthew Jagielski, Wenhao Jia, Kathleen Kenealy, Maxim Krikun, Sneha Kudugunta, Chang Lan, Katherine Lee, Benjamin Lee, Eric Li, Music Li, Wei Li, Yaguang Li, Jian Li, Hyeontaek Lim, Hanzhao Lin, Zhongtao Liu, Frederick Liu, Marcello Maggioni, Aroma Mahendru, Joshua Maynez, Vedant Misra, Maysam Moussalem, Zachary Nado, John Nham, Eric Ni, Andrew Nystrom, Alicia Parrish, Marie Pellat, Martin Polacek, Alex Polozov, Reiner Pope, Siyuan Qiao, Emily Reif, Bryan Richter, Parker Riley, Alex Castro Ros, Aurko Roy, Brennan Saeta, Rajkumar Samuel, Renee Shelby, Ambrose Slone, Daniel Smilkov, David R. So, Daniel Sohn, Simon Tokumine, Dasha Valter, Vijay Vasudevan, Kiran Vodrahalli, Xuezhi Wang, Pidong Wang, ZiRui Wang, Tao Wang, John Wieting, Yuhuai Wu, Kelvin Xu, Yunhan Xu, Linting Xue, Pengcheng Yin, Jiahui Yu, Qiao Zhang, Steven Zheng, Ce Zheng, Weikang Zhou, Denny Zhou, Slav Petrov, Yonghui Wu
Through extensive evaluations on English and multilingual language, and reasoning tasks, we demonstrate that PaLM 2 has significantly improved quality on downstream tasks across different model sizes, while simultaneously exhibiting faster and more efficient inference compared to PaLM.
Ranked #1 on
Question Answering
on TriviaQA
(using extra training data)
2 code implementations • 13 Jul 2022 • Aurko Roy, Rohan Anil, Guangda Lai, Benjamin Lee, Jeffrey Zhao, Shuyuan Zhang, Shibo Wang, Ye Zhang, Shen Wu, Rigel Swavely, Tao, Yu, Phuong Dao, Christopher Fifty, Zhifeng Chen, Yonghui Wu
Transformer models have recently emerged as one of the foundational models in natural language processing, and as a byproduct, there is significant recent interest and investment in scaling these models.
Ranked #5 on
Language Modelling
on C4
2 code implementations • NAACL 2021 • Kalpesh Krishna, Aurko Roy, Mohit Iyyer
The task of long-form question answering (LFQA) involves retrieving documents relevant to a given question and using them to generate a paragraph-length answer.
Ranked #3 on
Question Answering
on KILT: ELI5
2 code implementations • 12 Mar 2020 • Aurko Roy, Mohammad Saffar, Ashish Vaswani, David Grangier
This work builds upon two lines of research: it combines the modeling flexibility of prior work on content-based sparse attention with the efficiency gains from approaches based on local, temporal sparse attention.
Ranked #5 on
Image Generation
on ImageNet 64x64
(Bits per dim metric)
no code implementations • ACL 2019 • Aurko Roy, David Grangier
We compare with MT-based approaches on paraphrase identification, generation, and training augmentation.
no code implementations • ICLR 2019 • Aurko Roy, Ashish Vaswani, Niki Parmar, Arvind Neelakantan
Deep neural networks with discrete latent variables offer the promise of better symbolic reasoning, and learning abstractions that are more useful to new tasks.
7 code implementations • ICLR 2019 • David Berthelot, Colin Raffel, Aurko Roy, Ian Goodfellow
Autoencoders provide a powerful framework for learning compressed representations by encoding all of the information needed to reconstruct a data point in a latent code.
2 code implementations • 28 May 2018 • Aurko Roy, Ashish Vaswani, Arvind Neelakantan, Niki Parmar
Deep neural networks with discrete latent variables offer the promise of better symbolic reasoning, and learning abstractions that are more useful to new tasks.
no code implementations • ICML 2018 • Łukasz Kaiser, Aurko Roy, Ashish Vaswani, Niki Parmar, Samy Bengio, Jakob Uszkoreit, Noam Shazeer
Finally, we evaluate our model end-to-end on the task of neural machine translation, where it is an order of magnitude faster at decoding than comparable autoregressive models.
no code implementations • ICLR 2018 • Jacob Buckman, Aurko Roy, Colin Raffel, Ian Goodfellow
It is well known that it is possible to construct "adversarial examples" for neural networks: inputs which are misclassified by the network yet indistinguishable from true data.
10 code implementations • 27 Dec 2017 • Tom B. Brown, Dandelion Mané, Aurko Roy, Martín Abadi, Justin Gilmer
We present a method to create universal, robust, targeted adversarial image patches in the real world.
no code implementations • NeurIPS 2017 • Aurko Roy, Huan Xu, Sebastian Pokutta
We study reinforcement learning under model misspecification, where we do not have access to the true environment but only to a reasonably close approximation to it.
2 code implementations • 9 Mar 2017 • Łukasz Kaiser, Ofir Nachum, Aurko Roy, Samy Bengio
We present a large-scale life-long memory module for use in deep learning.
no code implementations • NeurIPS 2016 • Aurko Roy, Sebastian Pokutta
We also prove that our algorithm returns an $O(\log{n})$-approximate hierarchical clustering for a generalization of this cost function also studied in [arXiv:1510. 05043].
13 code implementations • 3 Oct 2016 • Nicolas Papernot, Fartash Faghri, Nicholas Carlini, Ian Goodfellow, Reuben Feinman, Alexey Kurakin, Cihang Xie, Yash Sharma, Tom Brown, Aurko Roy, Alexander Matyasko, Vahid Behzadan, Karen Hambardzumyan, Zhishuai Zhang, Yi-Lin Juang, Zhi Li, Ryan Sheatsley, Abhibhav Garg, Jonathan Uesato, Willi Gierke, Yinpeng Dong, David Berthelot, Paul Hendricks, Jonas Rauber, Rujun Long, Patrick McDaniel
An adversarial example library for constructing attacks, building defenses, and benchmarking both