Search Results for author: Aurko Roy

Found 15 papers, 9 papers with code

PaLM 2 Technical Report

1 code implementation 17 May 2023 Rohan Anil, Andrew M. Dai, Orhan Firat, Melvin Johnson, Dmitry Lepikhin, Alexandre Passos, Siamak Shakeri, Emanuel Taropa, Paige Bailey, Zhifeng Chen, Eric Chu, Jonathan H. Clark, Laurent El Shafey, Yanping Huang, Kathy Meier-Hellstern, Gaurav Mishra, Erica Moreira, Mark Omernick, Kevin Robinson, Sebastian Ruder, Yi Tay, Kefan Xiao, Yuanzhong Xu, Yujing Zhang, Gustavo Hernandez Abrego, Junwhan Ahn, Jacob Austin, Paul Barham, Jan Botha, James Bradbury, Siddhartha Brahma, Kevin Brooks, Michele Catasta, Yong Cheng, Colin Cherry, Christopher A. Choquette-Choo, Aakanksha Chowdhery, Clément Crepy, Shachi Dave, Mostafa Dehghani, Sunipa Dev, Jacob Devlin, Mark Díaz, Nan Du, Ethan Dyer, Vlad Feinberg, Fangxiaoyu Feng, Vlad Fienber, Markus Freitag, Xavier Garcia, Sebastian Gehrmann, Lucas Gonzalez, Guy Gur-Ari, Steven Hand, Hadi Hashemi, Le Hou, Joshua Howland, Andrea Hu, Jeffrey Hui, Jeremy Hurwitz, Michael Isard, Abe Ittycheriah, Matthew Jagielski, Wenhao Jia, Kathleen Kenealy, Maxim Krikun, Sneha Kudugunta, Chang Lan, Katherine Lee, Benjamin Lee, Eric Li, Music Li, Wei Li, Yaguang Li, Jian Li, Hyeontaek Lim, Hanzhao Lin, Zhongtao Liu, Frederick Liu, Marcello Maggioni, Aroma Mahendru, Joshua Maynez, Vedant Misra, Maysam Moussalem, Zachary Nado, John Nham, Eric Ni, Andrew Nystrom, Alicia Parrish, Marie Pellat, Martin Polacek, Alex Polozov, Reiner Pope, Siyuan Qiao, Emily Reif, Bryan Richter, Parker Riley, Alex Castro Ros, Aurko Roy, Brennan Saeta, Rajkumar Samuel, Renee Shelby, Ambrose Slone, Daniel Smilkov, David R. So, Daniel Sohn, Simon Tokumine, Dasha Valter, Vijay Vasudevan, Kiran Vodrahalli, Xuezhi Wang, Pidong Wang, ZiRui Wang, Tao Wang, John Wieting, Yuhuai Wu, Kelvin Xu, Yunhan Xu, Linting Xue, Pengcheng Yin, Jiahui Yu, Qiao Zhang, Steven Zheng, Ce Zheng, Weikang Zhou, Denny Zhou, Slav Petrov, Yonghui Wu

Through extensive evaluations on English and multilingual language, and reasoning tasks, we demonstrate that PaLM 2 has significantly improved quality on downstream tasks across different model sizes, while simultaneously exhibiting faster and more efficient inference compared to PaLM.

Ranked #1 on Question Answering on TriviaQA (using extra training data)

Language Modelling, Question Answering

N-Grammer: Augmenting Transformers with latent n-grams

2 code implementations 13 Jul 2022 Aurko Roy, Rohan Anil, Guangda Lai, Benjamin Lee, Jeffrey Zhao, Shuyuan Zhang, Shibo Wang, Ye Zhang, Shen Wu, Rigel Swavely, Tao Yu, Phuong Dao, Christopher Fifty, Zhifeng Chen, Yonghui Wu

Transformer models have recently emerged as one of the foundational models in natural language processing, and as a byproduct, there is significant recent interest and investment in scaling these models.

Common Sense Reasoning, Coreference Resolution, +5
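As a rough illustration of the latent n-gram idea named in the title above, the sketch below augments ordinary token embeddings with embeddings looked up from hashed bi-gram ids. This is not the paper's implementation (which derives n-grams from a learned latent representation rather than surface tokens); sizes and the hash constant are illustrative assumptions.

```python
import numpy as np

# Hedged sketch: concatenate unigram embeddings with embeddings of hashed
# bi-gram ids before the Transformer stack. All sizes are illustrative.
VOCAB, NGRAM_BUCKETS, D_UNI, D_NGRAM = 1_000, 5_000, 64, 16
rng = np.random.default_rng(0)
uni_emb = rng.normal(size=(VOCAB, D_UNI))
ngram_emb = rng.normal(size=(NGRAM_BUCKETS, D_NGRAM))

def embed_with_ngrams(token_ids):
    """token_ids: 1-D int array -> [seq_len, D_UNI + D_NGRAM] embeddings."""
    uni = uni_emb[token_ids]
    prev = np.concatenate([[0], token_ids[:-1]])          # shifted tokens
    bigram_ids = (prev * 1_000_003 + token_ids) % NGRAM_BUCKETS
    return np.concatenate([uni, ngram_emb[bigram_ids]], axis=-1)

print(embed_with_ngrams(np.array([5, 17, 42])).shape)      # (3, 80)
```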

Hurdles to Progress in Long-form Question Answering

2 code implementations NAACL 2021 Kalpesh Krishna, Aurko Roy, Mohit Iyyer

The task of long-form question answering (LFQA) involves retrieving documents relevant to a given question and using them to generate a paragraph-length answer.

Long Form Question Answering, Open-Domain Dialog, +2
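The generic retrieve-then-generate pipeline described above can be sketched as follows; the toy corpus, TF-IDF retriever, and placeholder generator are stand-ins, not the systems evaluated in the paper.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy corpus and question for a retrieve-then-generate LFQA sketch.
corpus = [
    "The mitochondrion is the powerhouse of the cell.",
    "Transformers use self-attention to mix information across tokens.",
    "Sparse attention restricts which token pairs are compared.",
]
question = "How do transformers mix information across a sequence?"

vectorizer = TfidfVectorizer().fit(corpus + [question])
scores = cosine_similarity(vectorizer.transform([question]),
                           vectorizer.transform(corpus))[0]
evidence = [corpus[i] for i in scores.argsort()[::-1][:2]]  # top-2 passages

def generate_answer(question, evidence):
    # Placeholder for a seq2seq model conditioned on question + evidence.
    return f"(paragraph-length answer grounded in {len(evidence)} passages)"

print(generate_answer(question, evidence))
```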

Efficient Content-Based Sparse Attention with Routing Transformers

2 code implementations 12 Mar 2020 Aurko Roy, Mohammad Saffar, Ashish Vaswani, David Grangier

This work builds upon two lines of research: it combines the modeling flexibility of prior work on content-based sparse attention with the efficiency gains from approaches based on local, temporal sparse attention.

Ranked #5 on Image Generation on ImageNet 64x64 (Bits per dim metric)

Image Generation, Language Modelling
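A toy sketch of the content-based routing idea behind this paper: queries and keys are assigned to shared centroids, and each query attends only to keys in its own cluster. This simplifies the actual method, which learns centroids with online k-means and combines routing with local attention.

```python
import numpy as np

# Simplified cluster-routed attention; not the paper's exact algorithm.
rng = np.random.default_rng(0)
seq, d, n_clusters = 16, 8, 4
q, k, v = (rng.normal(size=(seq, d)) for _ in range(3))
centroids = rng.normal(size=(n_clusters, d))

q_assign = np.argmax(q @ centroids.T, axis=-1)   # cluster id per query
k_assign = np.argmax(k @ centroids.T, axis=-1)   # cluster id per key

out = np.zeros_like(v)
for c in range(n_clusters):
    qi, ki = np.where(q_assign == c)[0], np.where(k_assign == c)[0]
    if len(qi) == 0 or len(ki) == 0:
        continue
    logits = q[qi] @ k[ki].T / np.sqrt(d)
    weights = np.exp(logits - logits.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    out[qi] = weights @ v[ki]                    # attend within the cluster only
print(out.shape)  # (16, 8)
```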

Unsupervised Paraphrasing without Translation

no code implementations ACL 2019 Aurko Roy, David Grangier

We compare with MT-based approaches on paraphrase identification, generation, and training augmentation.

Machine Translation, Paraphrase Identification, +1

Towards a better understanding of Vector Quantized Autoencoders

no code implementations ICLR 2019 Aurko Roy, Ashish Vaswani, Niki Parmar, Arvind Neelakantan

Deep neural networks with discrete latent variables offer the promise of better symbolic reasoning, and learning abstractions that are more useful to new tasks.

Knowledge Distillation, Machine Translation, +1
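Both vector-quantized autoencoder entries in this list rest on the same discrete bottleneck, sketched minimally below: encoder outputs are snapped to their nearest codebook vector. Codebook size and dimensions are illustrative; training details (EM updates, straight-through gradients, commitment losses) are omitted.

```python
import numpy as np

# Minimal vector-quantization bottleneck: nearest-codebook lookup.
rng = np.random.default_rng(0)
codebook = rng.normal(size=(512, 64))            # 512 discrete codes, dim 64

def quantize(z_e):
    """z_e: [batch, 64] encoder outputs -> (discrete code ids, quantized vectors)."""
    dists = ((z_e[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    ids = dists.argmin(axis=1)
    return ids, codebook[ids]

ids, z_q = quantize(rng.normal(size=(4, 64)))
print(ids.shape, z_q.shape)                      # (4,) (4, 64)
```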

Understanding and Improving Interpolation in Autoencoders via an Adversarial Regularizer

7 code implementations ICLR 2019 David Berthelot, Colin Raffel, Aurko Roy, Ian Goodfellow

Autoencoders provide a powerful framework for learning compressed representations by encoding all of the information needed to reconstruct a data point in a latent code.
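A bare-bones illustration of the interpolation this paper regularizes: two latent codes are mixed and the mixture is decoded. The adversarial critic that scores how realistic decoded mixtures look is omitted, and the encoder/decoder below are trivial stand-ins rather than learned networks.

```python
import numpy as np

# Latent interpolation sketch with stand-in encode/decode functions.
def encode(x):
    return x.reshape(8, 4).mean(axis=1)          # stand-in "encoder" to 8 dims

def decode(z):
    return np.repeat(z, 4)                       # stand-in "decoder" back to 32 dims

x1, x2 = np.random.rand(32), np.random.rand(32)
for alpha in (0.0, 0.25, 0.5, 0.75, 1.0):
    z_mix = alpha * encode(x1) + (1 - alpha) * encode(x2)
    x_mix = decode(z_mix)                        # the outputs ACAI regularizes
print(x_mix.shape)                               # (32,)
```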

Theory and Experiments on Vector Quantized Autoencoders

2 code implementations 28 May 2018 Aurko Roy, Ashish Vaswani, Arvind Neelakantan, Niki Parmar

Deep neural networks with discrete latent variables offer the promise of better symbolic reasoning, and learning abstractions that are more useful to new tasks.

Image Generation, Knowledge Distillation, +2

Fast Decoding in Sequence Models using Discrete Latent Variables

no code implementations ICML 2018 Łukasz Kaiser, Aurko Roy, Ashish Vaswani, Niki Parmar, Samy Bengio, Jakob Uszkoreit, Noam Shazeer

Finally, we evaluate our model end-to-end on the task of neural machine translation, where it is an order of magnitude faster at decoding than comparable autoregressive models.

Machine Translation, Translation

Thermometer Encoding: One Hot Way To Resist Adversarial Examples

no code implementations ICLR 2018 Jacob Buckman, Aurko Roy, Colin Raffel, Ian Goodfellow

It is well known that it is possible to construct "adversarial examples" for neural networks: inputs which are misclassified by the network yet indistinguishable from true data.
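A small sketch of the encoding named in the title, under one plausible convention: each input value in [0, 1] is quantized into k levels and represented by a cumulative binary vector, which removes the smooth perturbation directions that gradient-based attacks exploit. The level count and thresholds here are illustrative.

```python
import numpy as np

# Thermometer encoding sketch: level 2 of 8 -> [1, 1, 1, 0, 0, 0, 0, 0].
def thermometer_encode(x, k=8):
    """x: array of floats in [0, 1] -> same shape with an extra length-k axis."""
    levels = np.floor(x * k).clip(0, k - 1).astype(int)
    return (np.arange(k) <= levels[..., None]).astype(np.float32)

print(thermometer_encode(np.array([0.05, 0.4, 0.93])))
```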

Adversarial Patch

10 code implementations 27 Dec 2017 Tom B. Brown, Dandelion Mané, Aurko Roy, Martín Abadi, Justin Gilmer

We present a method to create universal, robust, targeted adversarial image patches in the real world.
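The sketch below shows only the patch application step: pasting a patch onto an image at a random location. In the paper the patch itself is optimized, in expectation over images, locations, scales, and rotations, to force a chosen target class; that optimization loop is not shown here.

```python
import numpy as np

# Apply a (would-be trainable) patch to an image at a random position.
rng = np.random.default_rng(0)

def apply_patch(image, patch):
    h, w, _ = image.shape
    ph, pw, _ = patch.shape
    top = rng.integers(0, h - ph + 1)
    left = rng.integers(0, w - pw + 1)
    out = image.copy()
    out[top:top + ph, left:left + pw] = patch    # overwrite pixels with the patch
    return out

image = rng.random((224, 224, 3))
patch = rng.random((50, 50, 3))                  # would be optimized in practice
print(apply_patch(image, patch).shape)           # (224, 224, 3)
```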

Reinforcement Learning under Model Mismatch

no code implementations NeurIPS 2017 Aurko Roy, Huan Xu, Sebastian Pokutta

We study reinforcement learning under model misspecification, where we do not have access to the true environment but only to a reasonably close approximation to it.

Q-Learning, Reinforcement Learning, +1
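As a simplified illustration of the robust setting described above (not the paper's algorithm), the following performs a pessimistic Bellman backup against a small, explicit set of candidate transition models.

```python
import numpy as np

# Robust value iteration over an uncertainty set of transition tensors
# P[s, a, s']: each (s, a) is backed up against the worst-case model.
rng = np.random.default_rng(0)
n_states, n_actions, gamma = 4, 2, 0.9
R = rng.random((n_states, n_actions))
models = [rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
          for _ in range(3)]                          # three candidate models

Q = np.zeros((n_states, n_actions))
for _ in range(200):
    V = Q.max(axis=1)                                 # greedy state values
    worst = np.min([P @ V for P in models], axis=0)   # pessimistic expectation
    Q = R + gamma * worst                             # robust Bellman backup
print(Q.round(2))
```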

Hierarchical Clustering via Spreading Metrics

no code implementations NeurIPS 2016 Aurko Roy, Sebastian Pokutta

We also prove that our algorithm returns an $O(\log{n})$-approximate hierarchical clustering for a generalization of this cost function also studied in [arXiv:1510.05043].

Clustering, Graph Partitioning
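For context (summarizing arXiv:1510.05043 rather than this listing): the cost function in question charges each edge $\{i,j\}$ of a weighted graph the size of the smallest cluster in the hierarchy $T$ that still contains both endpoints, $\mathrm{cost}_G(T) = \sum_{\{i,j\} \in E} w_{ij}\,|\mathrm{leaves}(T[i \vee j])|$, where $T[i \vee j]$ is the subtree rooted at the least common ancestor of leaves $i$ and $j$; the generalization mentioned in the snippet replaces the cluster size with a function of it.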
