1 code implementation • 26 Apr 2024 • Stephen Zhao, Rob Brekelmans, Alireza Makhzani, Roger Grosse
Numerous capability and safety techniques of Large Language Models (LLMs), including RLHF, automated red-teaming, prompt engineering, and infilling, can be cast as sampling from an unnormalized target distribution defined by a given reward or potential function over the full sequence.
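As a toy illustration of that framing (not this paper's sampler, which is more sophisticated), the sketch below draws sequences from a stand-in base model and reweights them toward a target proportional to p0(x)·exp(r(x)/β) via self-normalized importance sampling. The vocabulary, base model, and reward are hypothetical placeholders.

```python
# Hedged sketch: self-normalized importance sampling from a target proportional
# to  p0(sequence) * exp(reward(sequence) / beta),  with a toy base "language
# model" over a tiny vocabulary. All names here are illustrative placeholders.
import numpy as np

rng = np.random.default_rng(0)
VOCAB = ["a", "b", "c", "<eos>"]

def toy_base_logits(prefix):
    # Stand-in for an LLM's next-token logits; fixed here for illustration.
    return np.array([1.0, 0.5, 0.2, 0.3])

def sample_from_base(max_len=8):
    seq = []
    for _ in range(max_len):
        logits = toy_base_logits(seq)
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        tok = rng.choice(len(VOCAB), p=probs)
        seq.append(VOCAB[tok])
        if VOCAB[tok] == "<eos>":
            break
    return seq

def toy_reward(seq):
    # Placeholder potential over the full sequence: count "a" tokens.
    return float(seq.count("a"))

# Draw N sequences from the base model, then reweight so the normalized weights
# target  p0(x) * exp(r(x) / beta);  the weight of each draw is exp(r(x)/beta).
N, beta = 1000, 1.0
samples = [sample_from_base() for _ in range(N)]
log_w = np.array([toy_reward(s) / beta for s in samples])
weights = np.exp(log_w - log_w.max())
weights /= weights.sum()
idx = rng.choice(N, p=weights)          # one (approximate) sample from the target
print("resampled sequence:", samples[idx])
```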
2 code implementations • 5 Feb 2024 • Wu Lin, Felix Dangel, Runa Eschenhagen, Juhan Bae, Richard E. Turner, Alireza Makhzani
Adaptive gradient optimizers like Adam(W) are the default training algorithms for many deep learning architectures, such as transformers.
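For context, the update these optimizers perform can be written in the usual form (a standard statement with generic notation, not taken from this work):

```latex
% Standard Adam(W) update on parameters \theta_t with gradient g_t.
\begin{aligned}
m_t &= \beta_1 m_{t-1} + (1-\beta_1)\, g_t, &
v_t &= \beta_2 v_{t-1} + (1-\beta_2)\, g_t^2, \\
\hat m_t &= \frac{m_t}{1-\beta_1^t}, &
\hat v_t &= \frac{v_t}{1-\beta_2^t}, \\
\theta_t &= \theta_{t-1} - \eta \left( \frac{\hat m_t}{\sqrt{\hat v_t} + \epsilon} + \lambda\, \theta_{t-1} \right)
&& \text{(decoupled weight decay } \lambda \text{ gives AdamW).}
\end{aligned}
```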
2 code implementations • 9 Dec 2023 • Wu Lin, Felix Dangel, Runa Eschenhagen, Kirill Neklyudov, Agustinus Kristiadi, Richard E. Turner, Alireza Makhzani
Second-order methods such as KFAC can be useful for neural net training.
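For reference, the Kronecker-factored approximation that gives KFAC its name can be stated as follows (standard form, generic notation):

```latex
% For a layer with weight W, input activations a, and backpropagated gradients g
% at the pre-activations, KFAC approximates the Fisher block as
F_W \;\approx\; A \otimes G, \qquad A = \mathbb{E}[a a^\top], \quad G = \mathbb{E}[g g^\top],
% so the preconditioned gradient can be formed from the small factored inverses:
(A \otimes G)^{-1} \operatorname{vec}(\nabla_W \mathcal{L}) \;=\; \operatorname{vec}\!\big( G^{-1} (\nabla_W \mathcal{L})\, A^{-1} \big).
```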
1 code implementation • 16 Oct 2023 • Kirill Neklyudov, Rob Brekelmans, Alexander Tong, Lazar Atanackovic, Qiang Liu, Alireza Makhzani
The dynamical formulation of optimal transport can be extended through various choices of the underlying geometry (kinetic energy) and the regularization of density paths (potential energy).
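The dynamical formulation referenced here is the Benamou-Brenier problem (standard statement, generic notation):

```latex
% Benamou--Brenier dynamical form of optimal transport, whose kinetic and
% potential energy terms are the objects being generalized:
W_2^2(\mu_0, \mu_1) \;=\; \min_{\rho_t,\, v_t} \int_0^1 \!\! \int \|v_t(x)\|^2 \, \rho_t(x)\, dx\, dt
\quad \text{s.t.} \quad \partial_t \rho_t + \nabla \cdot (\rho_t v_t) = 0, \;\; \rho_0 = \mu_0, \;\; \rho_1 = \mu_1.
```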
1 code implementation • 16 May 2023 • Daniel Severo, James Townsend, Ashish Khisti, Alireza Makhzani
We present a one-shot method for compressing large labeled graphs called Random Edge Coding.
1 code implementation • ICLR 2022 • Rob Brekelmans, Sicong Huang, Marzyeh Ghassemi, Greg Ver Steeg, Roger Grosse, Alireza Makhzani
Since accurate estimation of mutual information (MI) without density information requires a sample size exponential in the true MI, we assume either a single marginal or the full joint density is known.
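For reference, the quantity being estimated is (standard definition):

```latex
% Mutual information between X and Y:
I(X;Y) \;=\; \mathbb{E}_{p(x,y)}\!\left[ \log \frac{p(x,y)}{p(x)\, p(y)} \right],
% so access to a marginal or to the full joint density lets an estimator evaluate
% part of the log-density ratio directly rather than learning it from samples alone.
```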
2 code implementations • 19 Jan 2023 • Juan Carrasquilla, Mohamed Hibat-Allah, Estelle Inack, Alireza Makhzani, Kirill Neklyudov, Graham W. Taylor, Giacomo Torlai
Binary neural networks, i.e., neural networks whose parameters and activations are constrained to only two possible values, offer a compelling avenue for deploying deep learning models on energy- and memory-limited devices.
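As background (not the training scheme used in this work), a common recipe for such constraints is to binarize weights in the forward pass and pass gradients straight through to latent real-valued weights in the backward pass. A minimal numpy sketch on a toy regression problem, with illustrative shapes:

```python
# Hedged illustration of "binarize forward, straight-through backward" training
# of binary weights on a toy linear regression; all sizes are illustrative.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(256, 8))
true_w = rng.choice([-1.0, 1.0], size=(8,))
y = X @ true_w

w_real = rng.normal(scale=0.1, size=(8,))    # latent real-valued weights
lr = 0.05
for step in range(200):
    w_bin = np.sign(w_real)                  # forward: weights constrained to {-1, +1}
    w_bin[w_bin == 0] = 1.0
    pred = X @ w_bin
    grad_pred = 2.0 * (pred - y) / len(y)    # d MSE / d pred
    grad_w_bin = X.T @ grad_pred             # gradient w.r.t. the binary weights
    w_real -= lr * grad_w_bin                # straight-through: apply it to the real weights

print("recovered signs:", np.sign(w_real), "target:", true_w)
```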
1 code implementation • 13 Oct 2022 • Kirill Neklyudov, Rob Brekelmans, Daniel Severo, Alireza Makhzani
Learning the continuous dynamics of a system from snapshots of its temporal marginals is a problem which appears throughout natural sciences and machine learning, including in quantum systems, single-cell biological data, and generative modeling.
1 code implementation • NeurIPS 2021 • Kuan-Chieh Wang, Yan Fu, Ke Li, Ashish Khisti, Richard Zemel, Alireza Makhzani
In this work, we provide a probabilistic interpretation of model inversion attacks, and formulate a variational objective that accounts for both diversity and accuracy.
1 code implementation • 15 Jul 2021 • Daniel Severo, James Townsend, Ashish Khisti, Alireza Makhzani, Karen Ullrich
Current methods which compress multisets at an optimal rate have computational complexity that scales linearly with alphabet size, making them too slow to be practical in many real-world settings.
1 code implementation • ICLR Workshop Neural_Compression 2021 • Yangjun Ruan, Karen Ullrich, Daniel Severo, James Townsend, Ashish Khisti, Arnaud Doucet, Alireza Makhzani, Chris J. Maddison
Naively applied, our schemes would require more initial bits than the standard bits-back coder, but we show how to drastically reduce this additional cost with couplings in the latent space.
no code implementations • NeurIPS Workshop DL-IG 2020 • Rob Brekelmans, Frank Nielsen, Alireza Makhzani, Aram Galstyan, Greg Ver Steeg
The exponential family is well known in machine learning and statistical physics as the maximum entropy distribution subject to a set of observed constraints, while the geometric mixture path is common in MCMC methods such as annealed importance sampling.
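Concretely, the geometric mixture path referred to here interpolates between a proposal π0 and target π1 as follows (standard annealed importance sampling notation):

```latex
% Geometric annealing path, equivalently an exponential family whose sufficient
% statistic is the log-likelihood ratio:
\pi_\beta(x) \;\propto\; \pi_0(x)^{1-\beta}\, \pi_1(x)^{\beta}
\;=\; \pi_0(x)\, \exp\!\Big( \beta \log \tfrac{\pi_1(x)}{\pi_0(x)} \Big), \qquad \beta \in [0, 1].
```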
2 code implementations • ICML 2020 • Sicong Huang, Alireza Makhzani, Yanshuai Cao, Roger Grosse
The field of deep generative modeling has succeeded in producing astonishingly realistic-seeming images and audio, but quantitative evaluation remains a challenge.
no code implementations • ICLR 2019 • Alireza Makhzani
In this paper, we describe the "implicit autoencoder" (IAE), a generative autoencoder in which both the generative path and the recognition path are parametrized by implicit distributions.
11 code implementations • 16 Aug 2017 • Oriol Vinyals, Timo Ewalds, Sergey Bartunov, Petko Georgiev, Alexander Sasha Vezhnevets, Michelle Yeo, Alireza Makhzani, Heinrich Küttler, John Agapiou, Julian Schrittwieser, John Quan, Stephen Gaffney, Stig Petersen, Karen Simonyan, Tom Schaul, Hado van Hasselt, David Silver, Timothy Lillicrap, Kevin Calderone, Paul Keet, Anthony Brunasso, David Lawrence, Anders Ekermo, Jacob Repp, Rodney Tsing
Finally, we present initial baseline results for canonical deep reinforcement learning agents applied to the StarCraft II domain.
Ranked #1 on StarCraft II on MoveToBeacon
1 code implementation • NeurIPS 2017 • Alireza Makhzani, Brendan Frey
In this paper, we describe the "PixelGAN autoencoder", a generative autoencoder in which the generative path is a convolutional autoregressive neural network on pixels (PixelCNN) that is conditioned on a latent code, and the recognition path uses a generative adversarial network (GAN) to impose a prior distribution on the latent code.
Ranked #10 on Unsupervised Image Classification on MNIST
29 code implementations • 18 Nov 2015 • Alireza Makhzani, Jonathon Shlens, Navdeep Jaitly, Ian Goodfellow, Brendan Frey
In this paper, we propose the "adversarial autoencoder" (AAE), which is a probabilistic autoencoder that uses the recently proposed generative adversarial networks (GAN) to perform variational inference by matching the aggregated posterior of the hidden code vector of the autoencoder with an arbitrary prior distribution.
Ranked #7 on Unsupervised MNIST on MNIST
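A minimal sketch of the two-phase training described above, assuming PyTorch and stand-in random data; layer sizes, hyperparameters, and the Gaussian prior are illustrative, not the paper's exact MNIST setup:

```python
# Hedged sketch of adversarial autoencoder training: a reconstruction phase plus
# a regularization phase in which a discriminator on the latent code pushes the
# aggregated posterior toward the imposed prior p(z) = N(0, I).
import torch
import torch.nn as nn

x_dim, z_dim = 784, 8
enc = nn.Sequential(nn.Linear(x_dim, 256), nn.ReLU(), nn.Linear(256, z_dim))
dec = nn.Sequential(nn.Linear(z_dim, 256), nn.ReLU(), nn.Linear(256, x_dim))
disc = nn.Sequential(nn.Linear(z_dim, 64), nn.ReLU(), nn.Linear(64, 1))

opt_ae = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)
opt_d = torch.optim.Adam(disc.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

x = torch.rand(64, x_dim)                     # stand-in for a minibatch of images

for step in range(100):
    # --- reconstruction phase ---
    recon_loss = ((dec(enc(x)) - x) ** 2).mean()
    opt_ae.zero_grad()
    recon_loss.backward()
    opt_ae.step()

    # --- regularization phase: discriminator tells prior samples from codes ---
    z = enc(x).detach()
    z_prior = torch.randn_like(z)             # samples from the imposed prior
    d_loss = bce(disc(z_prior), torch.ones(z.size(0), 1)) + \
             bce(disc(z), torch.zeros(z.size(0), 1))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # --- the encoder (generator) tries to fool the discriminator ---
    g_loss = bce(disc(enc(x)), torch.ones(x.size(0), 1))
    opt_ae.zero_grad()
    g_loss.backward()
    opt_ae.step()
```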
1 code implementation • NeurIPS 2015 • Alireza Makhzani, Brendan Frey
In this paper, we propose a winner-take-all method for learning hierarchical sparse representations in an unsupervised fashion.
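A minimal numpy sketch of a winner-take-all (lifetime sparsity) step of the kind this approach relies on; shapes and the sparsity rate are illustrative:

```python
# Hedged sketch: for each hidden unit, keep only its largest activations across
# the minibatch and zero out the rest before decoding.
import numpy as np

def lifetime_sparsity(h, rate=0.05):
    """h: (batch, hidden) activations; keep the top `rate` fraction per hidden unit."""
    k = max(1, int(rate * h.shape[0]))
    # per-column threshold = k-th largest activation of that hidden unit
    thresh = np.partition(h, -k, axis=0)[-k, :]
    return np.where(h >= thresh, h, 0.0)

rng = np.random.default_rng(0)
h = np.maximum(rng.normal(size=(128, 16)), 0.0)   # ReLU activations from an encoder
h_sparse = lifetime_sparsity(h, rate=0.05)
print("winners per hidden unit:", (h_sparse != 0).sum(axis=0))
```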
3 code implementations • 19 Dec 2013 • Alireza Makhzani, Brendan Frey
Recently, it has been observed that when representations are learnt in a way that encourages sparsity, improved performance is obtained on classification tasks.
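A minimal numpy sketch of the k-sparse selection this line of work builds on (keep only the k largest hidden activations per example); shapes, k, and the tied-weight layout are illustrative:

```python
# Hedged sketch of a k-sparse hidden layer with tied linear encoder/decoder.
import numpy as np

def k_sparse(h, k=5):
    """h: (batch, hidden); zero all but the k largest entries in each row."""
    thresh = np.partition(h, -k, axis=1)[:, -k][:, None]
    return np.where(h >= thresh, h, 0.0)

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(784, 64))        # tied weights: encode with W, decode with W.T
x = rng.random((32, 784))                        # stand-in inputs
h = k_sparse(x @ W, k=5)                         # sparse codes with 5 active units each
x_hat = h @ W.T                                  # linear reconstruction
print("active units per example:", (h != 0).sum(axis=1))
```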