The prevalent approach to sequence to sequence learning maps an input sequence to a variable length output sequence via recurrent neural networks.
Such applications demand prediction models with small storage and computational complexity that do not compromise significantly on accuracy.
This paper develops a novel tree-based algorithm, called Bonsai, for efficient prediction on IoT devices – such as those based on the Arduino Uno board having an 8 bit ATmega328P microcontroller operating at 16 MHz with no native floating point support, 2 KB RAM and 32 KB read-only flash.
We present a neural encoder-decoder model to convert images into presentational markup based on a scalable coarse-to-fine attention mechanism.
We introduce a novel theoretical analysis of recurrent networks based on Gersgorin's circle theorem that illuminates several modeling and optimization issues and improves our understanding of the LSTM cell.
#8 best model for Language Modelling on Text8
As a case study, we implement a new system, Certigrad, for optimizing over stochastic computation graphs, and we generate a formal (i. e. machine-checkable) proof that the gradients sampled by the system are unbiased estimates of the true mathematical gradients.