Search Results for author: James Bradbury

Found 16 papers, 10 papers with code

Scaling Language Models: Methods, Analysis & Insights from Training Gopher

no code implementations NA 2021 Jack W. Rae, Sebastian Borgeaud, Trevor Cai, Katie Millican, Jordan Hoffmann, Francis Song, John Aslanides, Sarah Henderson, Roman Ring, Susannah Young, Eliza Rutherford, Tom Hennigan, Jacob Menick, Albin Cassirer, Richard Powell, George van den Driessche, Lisa Anne Hendricks, Maribeth Rauh, Po-Sen Huang, Amelia Glaese, Johannes Welbl, Sumanth Dathathri, Saffron Huang, Jonathan Uesato, John Mellor, Irina Higgins, Antonia Creswell, Nat McAleese, Amy Wu, Erich Elsen, Siddhant Jayakumar, Elena Buchatskaya, David Budden, Esme Sutherland, Karen Simonyan, Michela Paganini, Laurent SIfre, Lena Martens, Xiang Lorraine Li, Adhiguna Kuncoro, Aida Nematzadeh, Elena Gribovskaya, Domenic Donato, Angeliki Lazaridou, Arthur Mensch, Jean-Baptiste Lespiau, Maria Tsimpoukelli, Nikolai Grigorev, Doug Fritz, Thibault Sottiaux, Mantas Pajarskas, Toby Pohlen, Zhitao Gong, Daniel Toyama, Cyprien de Masson d'Autume, Yujia Li, Tayfun Terzi, Vladimir Mikulik, Igor Babuschkin, Aidan Clark, Diego de Las Casas, Aurelia Guy, Chris Jones, James Bradbury, Matthew Johnson, Blake Hechtman, Laura Weidinger, Iason Gabriel, William Isaac, Ed Lockhart, Simon Osindero, Laura Rimell, Chris Dyer, Oriol Vinyals, Kareem Ayoub, Jeff Stanway, Lorrayne Bennett, Demis Hassabis, Koray Kavukcuoglu, Geoffrey Irving

Language modelling provides a step towards intelligent communication systems by harnessing large repositories of written human knowledge to better predict and understand the world.

Fact Checking Language Modelling +3

Exploring the limits of Concurrency in ML Training on Google TPUs

no code implementations7 Nov 2020 Sameer Kumar, James Bradbury, Cliff Young, Yu Emma Wang, Anselm Levskaya, Blake Hechtman, Dehao Chen, HyoukJoong Lee, Mehmet Deveci, Naveen Kumar, Pankaj Kanwar, Shibo Wang, Skye Wanderman-Milne, Steve Lacy, Tao Wang, Tayo Oguntebi, Yazhou Zu, Yuanzhong Xu, Andy Swing

Recent results in language understanding using neural networks have required training hardware of unprecedentedscale, with thousands of chips cooperating on a single training run.

A High-Quality Multilingual Dataset for Structured Documentation Translation

1 code implementation WS 2019 Kazuma Hashimoto, Raffaella Buschiazzo, James Bradbury, Teresa Marshall, Richard Socher, Caiming Xiong

We build and evaluate translation models for seven target languages from English, with several different copy mechanisms and an XML-constrained beam search.

Translation

Block-diagonal Hessian-free Optimization for Training Neural Networks

no code implementations ICLR 2018 Huishuai Zhang, Caiming Xiong, James Bradbury, Richard Socher

Second-order methods for neural network optimization have several advantages over methods based on first-order gradient descent, including better scaling to large mini-batch sizes and fewer updates needed for convergence.

Second-order methods

Towards Neural Machine Translation with Latent Tree Attention

no code implementations WS 2017 James Bradbury, Richard Socher

Building models that take advantage of the hierarchical structure of language without a priori annotation is a longstanding goal in natural language processing.

Machine Translation reinforcement-learning +1

Quasi-Recurrent Neural Networks

8 code implementations5 Nov 2016 James Bradbury, Stephen Merity, Caiming Xiong, Richard Socher

Recurrent neural networks are a powerful tool for modeling sequential data, but the dependence of each timestep's computation on the previous timestep's output limits parallelism and makes RNNs unwieldy for very long sequences.

Language Modelling Machine Translation +3

Pointer Sentinel Mixture Models

6 code implementations26 Sep 2016 Stephen Merity, Caiming Xiong, James Bradbury, Richard Socher

Recent neural network sequence models with softmax classifiers have achieved their best language modeling performance only with very large hidden states and large vocabularies.

Language Modelling

Cannot find the paper you are looking for? You can Submit a new open access paper.