Search Results for author: Diego Granziol

Found 18 papers, 5 papers with code

A Practical PAC-Bayes Generalisation Bound for Deep Learning

no code implementations • 29 Sep 2021 • Diego Granziol, Mingtian Zhang, Nicholas Baskerville

Under a PAC-Bayesian framework, we derive an implementation-efficient, parameterisation-invariant metric to measure the difference between our true and empirical risk.
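
For reference, bounds of this type build on the standard McAllester-form PAC-Bayes inequality, shown below in its generic form; the paper's contribution is a parameterisation-invariant, implementation-efficient instance of this idea, not this exact expression.

```latex
% Generic McAllester-style PAC-Bayes bound (standard form; not the
% paper's specific metric). With probability at least 1 - \delta over an
% i.i.d. sample of size n, simultaneously for all posteriors \rho:
\mathbb{E}_{\theta \sim \rho}\big[L(\theta)\big]
  \;\le\; \mathbb{E}_{\theta \sim \rho}\big[\hat{L}(\theta)\big]
  \;+\; \sqrt{\frac{\mathrm{KL}(\rho \,\|\, \pi) + \ln\!\big(2\sqrt{n}/\delta\big)}{2n}}
```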

Appearance of Random Matrix Theory in Deep Learning

1 code implementation • 12 Feb 2021 • Nicholas P Baskerville, Diego Granziol, Jonathan P Keating

We further investigate the importance of the true loss surface in neural networks and find, in contrast to previous work, that the exponential hardness of locating the global minimum has practical consequences for achieving state of the art performance.

Flatness is a False Friend

no code implementations • 1 Jan 2021 • Diego Granziol

Hessian based measures of flatness, such as the trace, Frobenius and spectral norms, have been argued, used and shown to relate to generalisation.
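
All three of these measures can be probed without ever forming the Hessian explicitly. A minimal sketch (ours, not the paper's code) of Hutchinson's estimator for the Hessian trace, using only PyTorch Hessian-vector products:

```python
import torch

def hessian_trace(loss, params, n_samples=50):
    """Hutchinson estimate of tr(H), where H is the Hessian of `loss`
    w.r.t. `params`, using Hessian-vector products only."""
    grads = torch.autograd.grad(loss, params, create_graph=True)
    estimates = []
    for _ in range(n_samples):
        # Rademacher probes: E[v v^T] = I, hence E[v^T H v] = tr(H).
        vs = [torch.randint_like(p, 2) * 2.0 - 1.0 for p in params]
        hv = torch.autograd.grad(
            grads, params, grad_outputs=vs, retain_graph=True
        )
        estimates.append(sum((v * h).sum() for v, h in zip(vs, hv)))
    return torch.stack(estimates).mean()
```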

A Random Matrix Theory Approach to Damping in Deep Learning

no code implementations • 15 Nov 2020 • Diego Granziol, Nicholas Baskerville

We conjecture that the inherent difference in generalisation between adaptive and non-adaptive gradient methods in deep learning stems from the increased estimation noise in the flattest directions of the true loss surface.
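
For context, damping enters the generic second-order update in the standard Levenberg-Marquardt fashion below; the paper's question is how the coefficient should be set, not this formula itself.

```latex
% Generic damped second-order update; \delta is the damping coefficient.
\theta_{t+1} \;=\; \theta_t \;-\; \alpha\,\big(H + \delta I\big)^{-1} \nabla L(\theta_t)
% Large \delta recovers (rescaled) gradient descent; small \delta takes
% large steps along flat eigendirections, precisely where gradient-noise
% effects are conjectured to dominate.
```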

Learning Rates as a Function of Batch Size: A Random Matrix Theory Approach to Neural Network Training

1 code implementation • 16 Jun 2020 • Diego Granziol, Stefan Zohren, Stephen Roberts

Whilst the linear scaling for stochastic gradient descent has been derived under more restrictive conditions, which we generalise, the square root scaling rule for adaptive optimisers is, to our knowledge, completely novel.

Second-order methods
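
The two scaling rules above translate into a simple rule of thumb when changing batch size. A toy sketch (function name and interface are ours, purely illustrative):

```python
def scale_learning_rate(base_lr, base_batch, new_batch, optimiser="sgd"):
    """Illustrative scaling heuristics: linear scaling for SGD, the
    square-root rule for adaptive optimisers."""
    ratio = new_batch / base_batch
    if optimiser == "sgd":
        return base_lr * ratio          # linear: lr grows with batch size
    return base_lr * ratio ** 0.5       # square root: lr grows with sqrt(batch size)

# e.g. SGD tuned at lr=0.1 for batch 128, moving to batch 512:
# scale_learning_rate(0.1, 128, 512)            -> 0.4
# scale_learning_rate(0.001, 128, 512, "adam")  -> 0.002
```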

Flatness is a False Friend

no code implementations • 16 Jun 2020 • Diego Granziol

Hessian based measures of flatness, such as the trace, Frobenius and spectral norms, have been argued, used and shown to relate to generalisation.

Beyond Random Matrix Theory for Deep Networks

no code implementations • 13 Jun 2020 • Diego Granziol

We investigate whether the Wigner semi-circle and Marcenko-Pastur distributions, often used for deep neural network theoretical analysis, match empirically observed spectral densities.
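
The two reference densities in question have the standard closed forms below (conventions for the scale σ² and aspect ratio q vary across the literature):

```latex
% Wigner semi-circle law (eigenvalue density of a scaled Wigner matrix):
\rho_{\mathrm{SC}}(\lambda) \;=\; \frac{\sqrt{4\sigma^2 - \lambda^2}}{2\pi\sigma^2},
  \qquad |\lambda| \le 2\sigma
% Marcenko-Pastur law (sample-covariance-type spectra, aspect ratio q \le 1):
\rho_{\mathrm{MP}}(\lambda) \;=\; \frac{\sqrt{(\lambda_+ - \lambda)(\lambda - \lambda_-)}}{2\pi\sigma^2 q\,\lambda},
  \qquad \lambda_\pm = \sigma^2\big(1 \pm \sqrt{q}\,\big)^2
```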

Iterative Averaging in the Quest for Best Test Error

no code implementations • 2 Mar 2020 • Diego Granziol, Xingchen Wan, Samuel Albanie, Stephen Roberts

We analyse and explain the increased generalisation performance of iterate averaging using a Gaussian process perturbation model between the true and batch risk surface on a high-dimensional quadratic.

Image Classification
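
A minimal sketch of the tail-averaging scheme analysed above (an SWA-style running mean of weights; the loop structure and names are ours, not the paper's code):

```python
import copy

def train_with_iterate_averaging(model, optimiser, loader, loss_fn,
                                 epochs, avg_start):
    """Tail averaging: keep a running mean of the weights visited after
    `avg_start` epochs. Batch-norm statistics would need recomputing for
    the averaged model in practice."""
    avg_model, n_avg = copy.deepcopy(model), 0
    for epoch in range(epochs):
        for x, y in loader:
            optimiser.zero_grad()
            loss_fn(model(x), y).backward()
            optimiser.step()
        if epoch >= avg_start:
            n_avg += 1
            for p_avg, p in zip(avg_model.parameters(), model.parameters()):
                # running mean: m_k = m_{k-1} + (w_k - m_{k-1}) / k
                p_avg.data += (p.data - p_avg.data) / n_avg
    return avg_model
```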

Towards understanding the true loss surface of deep neural networks using random matrix theory and iterative spectral methods

no code implementations • ICLR 2020 • Diego Granziol, Timur Garipov, Dmitry Vetrov, Stefan Zohren, Stephen Roberts, Andrew Gordon Wilson

This approach is an order of magnitude faster than state-of-the-art methods for spectral visualization, and can be generically used to investigate the spectral properties of matrices in deep learning.
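
The workhorse behind this style of fast spectral visualisation is stochastic Lanczos quadrature. A single-probe numpy sketch, assuming only a matrix-vector product (for a Hessian, a Hessian-vector product); a practical tool would average many probes, re-orthogonalise, and handle Lanczos breakdown:

```python
import numpy as np

def slq_spectral_density(matvec, dim, steps=80, seed=0):
    """Run `steps` of Lanczos on a matrix available only through
    `matvec`, then read off Ritz values and Gauss-quadrature weights
    as a discrete surrogate for the spectral density."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(dim)
    v /= np.linalg.norm(v)
    v_prev, beta = np.zeros(dim), 0.0
    alphas, betas = [], []
    for _ in range(steps):
        w = matvec(v) - beta * v_prev
        alpha = v @ w
        w -= alpha * v
        beta = np.linalg.norm(w)        # no breakdown handling (beta ~ 0)
        alphas.append(alpha)
        betas.append(beta)
        v_prev, v = v, w / beta
    # Eigendecompose the tridiagonal Lanczos matrix T.
    T = np.diag(alphas) + np.diag(betas[:-1], 1) + np.diag(betas[:-1], -1)
    ritz_vals, ritz_vecs = np.linalg.eigh(T)
    weights = ritz_vecs[0] ** 2         # (e_1^T u_i)^2 quadrature weights
    return ritz_vals, weights
```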

Deep Curvature Suite

1 code implementation • 20 Dec 2019 • Diego Granziol, Xingchen Wan, Timur Garipov

We present the MLRG Deep Curvature suite, a PyTorch-based, open-source package for analysis and visualisation of neural network curvature and the loss landscape.

Misconceptions

A Maximum Entropy approach to Massive Graph Spectra

no code implementations • 19 Dec 2019 • Diego Granziol, Robin Ru, Stefan Zohren, Xiaowen Dong, Michael Osborne, Stephen Roberts

Graph spectral techniques for measuring graph similarity, or for learning the cluster number, require kernel smoothing.

Graph Similarity
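
For contrast with the maximum-entropy route, a small numpy/networkx sketch of the kernel-smoothing step referred to above; it needs a full eigendecomposition, which is exactly what becomes infeasible on massive graphs:

```python
import numpy as np
import networkx as nx

def smoothed_laplacian_density(graph, grid, bandwidth=0.05):
    """Kernel-smoothed spectral density of the normalised Laplacian.
    Exact eigendecomposition: fine for small graphs, O(n^3) in general."""
    lam = np.linalg.eigvalsh(
        nx.normalized_laplacian_matrix(graph).toarray()
    )
    # Place a Gaussian bump of width `bandwidth` on each eigenvalue.
    diffs = (grid[:, None] - lam[None, :]) / bandwidth
    density = np.exp(-0.5 * diffs**2).sum(axis=1)
    return density / (len(lam) * bandwidth * np.sqrt(2 * np.pi))

# e.g. smoothed_laplacian_density(nx.karate_club_graph(),
#                                 np.linspace(0, 2, 400))
```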

Entropic Spectral Learning for Large-Scale Graphs

no code implementations • 18 Apr 2018 • Diego Granziol, Binxin Ru, Stefan Zohren, Xiaowen Dong, Michael Osborne, Stephen Roberts

Graph spectra have been successfully used to classify network types, compute the similarity between graphs, and determine the number of communities in a network.

Community Detection
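
The entropic alternative replaces kernel smoothing with a maximum-entropy density fitted to estimated spectral moments; the standard maxent form, shown for context:

```latex
% Given estimated moments \mu_k = \int \lambda^k p(\lambda)\,d\lambda of
% the spectral density (obtainable stochastically, \mu_k \approx
% \mathrm{tr}(A^k)/n), the maximum-entropy density consistent with the
% first m moments is
p(\lambda) \;=\; \exp\Big(-\sum_{k=0}^{m} \alpha_k \lambda^k\Big),
% with Lagrange multipliers \alpha_k fitted so that p matches each \mu_k.
```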

Fast Information-theoretic Bayesian Optimisation

1 code implementation • ICML 2018 • Binxin Ru, Mark McLeod, Diego Granziol, Michael A. Osborne

Information-theoretic Bayesian optimisation techniques have demonstrated state-of-the-art performance in tackling important global optimisation problems.

Bayesian Optimisation
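
For context, these methods belong to the family of acquisition functions that maximise mutual information with the optimum; the generic form is below (the paper's specific, faster approximation differs):

```latex
% Evaluate where the observation y is most informative about the optimal
% value y^*, i.e. maximise the mutual information
\alpha(x) \;=\; I\big(y;\, y^* \mid \mathcal{D}, x\big)
  \;=\; H\big[p(y \mid \mathcal{D}, x)\big]
  \;-\; \mathbb{E}_{y^*}\Big[H\big[p(y \mid \mathcal{D}, x, y^*)\big]\Big]
```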

Entropic Determinants

no code implementations • 8 Sep 2017 • Diego Granziol, Stephen Roberts

The ability of many powerful machine learning algorithms to deal with large data sets without compromise is often hampered by computationally expensive linear algebra tasks, of which calculating the log determinant is a canonical example.
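
The spectral identity that makes entropic (maximum-entropy) approaches to this problem possible:

```latex
% For symmetric positive definite A \in \mathbb{R}^{n \times n} with
% empirical spectral density p(\lambda):
\log\det A \;=\; \mathrm{tr}\big(\log A\big)
  \;=\; n \int \log\lambda \; p(\lambda)\, d\lambda
% so any estimate of p(\lambda), e.g. a maximum-entropy fit to a few
% stochastically estimated moments, yields the log-determinant without
% a full eigendecomposition.
```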

Entropic Trace Estimates for Log Determinants

1 code implementation • 24 Apr 2017 • Jack Fitzsimons, Diego Granziol, Kurt Cutajar, Michael Osborne, Maurizio Filippone, Stephen Roberts

The scalable calculation of matrix determinants has been a bottleneck to the widespread application of many machine learning methods such as determinantal point processes, Gaussian processes, generalised Markov random fields, graph models and many others.

Gaussian Processes • Point Processes
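
A self-contained numpy sketch of the stochastic-trace idea such estimators build on, using a truncated Taylor expansion of the logarithm in place of the paper's entropic density estimate (all names are ours):

```python
import numpy as np

def logdet_hutchinson(A, order=30, n_probes=30, seed=0):
    """Estimate log det A = tr(log A) by expanding
    tr(log A) = -sum_k tr((I - A)^k)/k and estimating each trace with
    Hutchinson probes. Assumes SPD A with eigenvalues in (0, 2); in
    practice one rescales A to satisfy this."""
    n = A.shape[0]
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_probes):
        v = rng.choice([-1.0, 1.0], size=n)    # Rademacher probe
        w = v.copy()
        for k in range(1, order + 1):
            w = w - A @ w                      # w = (I - A)^k v
            total -= (v @ w) / k               # accumulates -v^T (I-A)^k v / k
    return total / n_probes

# Sanity check against the exact value:
# A = np.diag([0.5, 1.0, 1.5])
# np.log(np.linalg.det(A))  ≈  logdet_hutchinson(A)
```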
