Deep Equilibrium Models

NeurIPS 2019 · Shaojie Bai, J. Zico Kolter, Vladlen Koltun

We present a new approach to modeling sequential data: the deep equilibrium model (DEQ). Motivated by an observation that the hidden layers of many existing deep sequence models converge towards some fixed point, we propose the DEQ approach that directly finds these equilibrium points via root-finding...
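To make the idea of an equilibrium point concrete, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical weight-tied layer `layer(z, x) = tanh(W z + U x + b)` and uses plain fixed-point iteration in place of a proper root-finder; the names `layer`, `find_equilibrium`, `W`, `U`, and `b` are illustrative assumptions. `W` is rescaled so the layer is a contraction, which guarantees that this naive iteration converges.

```python
import numpy as np

def layer(z, x, W, U, b):
    # Hypothetical weight-tied layer: the same parameters are applied at every "depth".
    return np.tanh(W @ z + U @ x + b)

def find_equilibrium(x, W, U, b, tol=1e-8, max_iter=500):
    # Naive fixed-point iteration z <- layer(z, x); a stand-in for a real root-finder
    # applied to g(z) = layer(z, x) - z.
    z = np.zeros(W.shape[0])
    for _ in range(max_iter):
        z_next = layer(z, x, W, U, b)
        if np.linalg.norm(z_next - z) < tol:
            return z_next
        z = z_next
    return z

rng = np.random.default_rng(0)
d, n = 8, 4                           # hidden size and input size (arbitrary)
W = rng.standard_normal((d, d))
W *= 0.5 / np.linalg.norm(W, 2)       # assumption: spectral norm 0.5, so the layer is a contraction
U = rng.standard_normal((d, n))
b = rng.standard_normal(d)
x = rng.standard_normal(n)

z_star = find_equilibrium(x, W, U, b)
# At an equilibrium, applying the layer once more leaves the state unchanged.
residual = np.linalg.norm(layer(z_star, x, W, U, b) - z_star)
print(residual)
```

The printed residual should be close to zero, meaning that stacking further copies of the layer would no longer change the hidden state. That is the fixed point the abstract refers to.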

Results from the Paper


Task                Dataset                     Model                                     Metric name       Metric value  Global rank
Language Modelling  Penn Treebank (Word Level)  DEQ-TrellisNet                            Test perplexity   57.1          #24
Language Modelling  Penn Treebank (Word Level)  DEQ-TrellisNet                            Params            24M           #1
Language Modelling  WikiText-103                DEQ-TrellisNet                            Test perplexity   29.0          #17
Language Modelling  WikiText-103                DEQ-TrellisNet                            Number of params  180M          #1
Language Modelling  WikiText-103                DEQ-Transformer (small)                   Test perplexity   32.4          #22
Language Modelling  WikiText-103                DEQ-Transformer (small)                   Number of params  138M          #1
Language Modelling  WikiText-103                DEQ-Transformer (medium, adaptive embed)  Test perplexity   23.2          #14
Language Modelling  WikiText-103                DEQ-Transformer (medium, adaptive embed)  Number of params  110M          #1