Search Results for author: DeLesley Hutchins

Found 3 papers, 3 papers with code

Memorizing Transformers

3 code implementations • ICLR 2022 • Yuhuai Wu, Markus N. Rabe, DeLesley Hutchins, Christian Szegedy

Language models typically need to be trained or finetuned in order to acquire new knowledge, which involves updating their weights.

Language Modelling • Math

Block-Recurrent Transformers

3 code implementations • 11 Mar 2022 • DeLesley Hutchins, Imanol Schlag, Yuhuai Wu, Ethan Dyer, Behnam Neyshabur

The recurrent cell is merely a transformer layer: it uses self-attention and cross-attention to efficiently compute a recurrent function over a large set of state vectors and tokens (a rough sketch follows below).

Language Modelling
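
The snippet above describes the block-recurrent cell as a transformer layer that combines self-attention over the tokens of a block with cross-attention against a set of recurrent state vectors. The PyTorch sketch below illustrates that idea under stated assumptions; the class name, dimensions, block interface, and the simple additive state update are illustrative choices, not the authors' implementation.

```python
import torch
import torch.nn as nn

class BlockRecurrentSketch(nn.Module):
    """Minimal sketch (assumed, not the paper's code) of a block-recurrent layer:
    tokens attend to themselves and to carried state vectors; the state vectors
    then attend to the tokens, yielding a recurrent function applied block by block."""

    def __init__(self, d_model=512, num_heads=8, num_state=32):
        super().__init__()
        self.state0 = nn.Parameter(torch.zeros(num_state, d_model))  # initial recurrent state
        self.token_self_attn = nn.MultiheadAttention(d_model, num_heads, batch_first=True)
        self.token_cross_attn = nn.MultiheadAttention(d_model, num_heads, batch_first=True)
        self.state_cross_attn = nn.MultiheadAttention(d_model, num_heads, batch_first=True)
        self.mlp = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                                 nn.Linear(4 * d_model, d_model))
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, blocks):
        """blocks: list of (batch, block_len, d_model) tensors, i.e. one long
        sequence split into fixed-size blocks."""
        batch = blocks[0].shape[0]
        state = self.state0.unsqueeze(0).expand(batch, -1, -1)
        outputs = []
        for x in blocks:
            # tokens attend to tokens (self-attention) and to the carried state (cross-attention)
            h, _ = self.token_self_attn(x, x, x)
            c, _ = self.token_cross_attn(x, state, state)
            x = self.norm1(x + h + c)
            x = self.norm2(x + self.mlp(x))
            # state attends to this block's tokens, producing the next recurrent state
            s, _ = self.state_cross_attn(state, x, x)
            state = state + s
            outputs.append(x)
        return torch.cat(outputs, dim=1), state
```

Because the state update runs once per block rather than once per token, the recurrence is cheap relative to the attention over tokens; that trade-off is the design point the snippet alludes to.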

Deep Learning with Dynamic Computation Graphs

2 code implementations • 7 Feb 2017 • Moshe Looks, Marcello Herreshoff, DeLesley Hutchins, Peter Norvig

However, since the computation graph has a different shape and size for every input, networks that compute over graph structures do not directly support batched training or inference.
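
The snippet above points out that when the computation graph mirrors the structure of each input, its shape differs per example, which blocks straightforward batching. The toy PyTorch example below makes that concrete with a recursive tree network; it is illustrative only and is not the dynamic batching technique or TensorFlow Fold library the paper introduces to address exactly this limitation.

```python
import torch
import torch.nn as nn

class TreeNet(nn.Module):
    """Toy recursive network (hypothetical example): the computation graph
    is rebuilt to match the shape of every input tree."""

    def __init__(self, d=64, vocab=1000):
        super().__init__()
        self.embed = nn.Embedding(vocab, d)
        self.combine = nn.Linear(2 * d, d)  # merges two child vectors into a parent vector

    def forward(self, tree):
        # A tree is either a leaf token id (int) or a (left_subtree, right_subtree) pair.
        if isinstance(tree, int):
            return self.embed(torch.tensor([tree]))
        left, right = tree
        children = torch.cat([self.forward(left), self.forward(right)], dim=-1)
        return torch.tanh(self.combine(children))

net = TreeNet()
examples = [(1, (2, 3)), ((4, 5), ((6, 7), 8))]  # two inputs with different tree shapes
outputs = [net(t) for t in examples]             # evaluated one example at a time: no batching across inputs
```

Each example produces a differently shaped graph, so the only straightforward option is the per-example loop shown in the last line; batching such heterogeneous graphs is the problem the paper's dynamic batching technique targets.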
