Declarative nets that are equilibrium models

Implicit layers are computational modules that output the solution to some problem depending on the input and the layer parameters. The deep equilibrium model (DEQ) outputs a solution to a fixed point equation. Deep declarative networks (DDNs), by contrast, solve an optimisation problem in their forward pass, which is arguably a more intuitive and interpretable problem than finding a fixed point. We show that using a kernelised generalised linear model (kGLM) as the inner problem in a DDN yields a large class of commonly used DEQ architectures, with a closed-form expression for the hidden layer parameters in terms of the kernel. The activation functions have interpretations in terms of the derivative of the log partition function and the Bayesian prior or kernel regulariser. Building on existing literature, we interpret DEQs as fine-tuned, unrolled classical algorithms, giving an intuitive justification for why DEQ models are sensible. We use our theoretical result to devise an initialisation scheme for DEQs that allows them to solve kGLMs in their forward pass at initialisation. We show empirically that this initialisation scheme improves training stability and performance over random initialisation.
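
To make the link between the two views concrete, here is a minimal NumPy sketch. It is a toy illustration only, not the paper's kGLM construction: the inner problem is assumed to be a ridge-regularised least-squares objective, and the names `ddn_layer`, `deq_layer`, `A` and `lam` are hypothetical. The "declarative" layer returns the minimiser directly, while the "equilibrium" layer iterates an affine map (an unrolled gradient-descent step) to its fixed point. Under these assumptions the two outputs should agree.

```python
import numpy as np

def inner_grad(z, A, x, lam):
    # Gradient of the assumed inner objective
    # f(z) = 0.5 * ||A z - x||^2 + 0.5 * lam * ||z||^2
    return A.T @ (A @ z - x) + lam * z

def ddn_layer(x, A, lam):
    # Declarative view: output the minimiser of f, here in closed form.
    d = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(d), A.T @ x)

def deq_layer(x, A, lam, n_iter=2000):
    # Equilibrium view: iterate z <- g(z, x) until a fixed point is reached.
    # g is one gradient-descent step on f, so its fixed point is the minimiser.
    eta = 1.0 / (np.linalg.norm(A, 2) ** 2 + lam)  # step size 1 / Lipschitz constant
    z = np.zeros(A.shape[1])
    for _ in range(n_iter):
        z = z - eta * inner_grad(z, A, x, lam)
    return z

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 4))   # hypothetical layer parameters
x = rng.standard_normal(8)        # hypothetical layer input
lam = 0.5                         # hypothetical regularisation strength

z_opt = ddn_layer(x, A, lam)
z_fixed = deq_layer(x, A, lam)
print(np.max(np.abs(z_opt - z_fixed)))  # gap between the two outputs
```

The update inside `deq_layer` is affine in `z`, which is the simplest case of a classical iterative solver read as a fixed-point layer. The abstract's result concerns the richer kGLM setting, where the non-linear activations arise from the log partition function and the regulariser.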
