Search Results for author: Alexander A. Alemi

Found 29 papers, 12 papers with code

Text Segmentation based on Semantic Word Embeddings

1 code implementation 18 Mar 2015 Alexander A. Alemi, Paul Ginsparg

We explore the use of semantic word embeddings in text segmentation algorithms, including the C99 segmentation algorithm and new algorithms inspired by the distributed word vector representation.

Segmentation Text Segmentation +1

Canonical Sectors and Evolution of Firms in the US Stock Markets

1 code implementation 20 Mar 2015 Lorien X. Hayden, Ricky Chachra, Alexander A. Alemi, Paul H. Ginsparg, James P. Sethna

A classification of companies into sectors of the economy is important for macroeconomic analysis and for investments into the sector-specific financial indices and exchange traded funds (ETFs).

Classification

Deep Variational Information Bottleneck

9 code implementations 1 Dec 2016 Alexander A. Alemi, Ian Fischer, Joshua V. Dillon, Kevin Murphy

We present a variational approximation to the information bottleneck of Tishby et al. (1999).

Adversarial Attack

Improved generator objectives for GANs

no code implementations 8 Dec 2016 Ben Poole, Alexander A. Alemi, Jascha Sohl-Dickstein, Anelia Angelova

We present a framework to understand GAN training as alternating density ratio estimation and approximate divergence minimization.

Density Ratio Estimation

Jeffrey's prior sampling of deep sigmoidal networks

no code implementations 25 May 2017 Lorien X. Hayden, Alexander A. Alemi, Paul H. Ginsparg, James P. Sethna

Neural networks have been shown to have a remarkable ability to uncover low-dimensional structure in data: the space of possible reconstructed images forms a reduced model manifold in image space.

Denoising

Fixing a Broken ELBO

1 code implementation ICML 2018 Alexander A. Alemi, Ben Poole, Ian Fischer, Joshua V. Dillon, Rif A. Saurous, Kevin Murphy

Recent work in unsupervised representation learning has focused on learning deep directed latent-variable models.

Representation Learning

GILBO: One Metric to Measure Them All

1 code implementation NeurIPS 2018 Alexander A. Alemi, Ian Fischer

We propose a simple, tractable lower bound on the mutual information contained in the joint generative density of any latent variable generative model: the GILBO (Generative Information Lower BOund).

Uncertainty in the Variational Information Bottleneck

no code implementations 2 Jul 2018 Alexander A. Alemi, Ian Fischer, Joshua V. Dillon

We present a simple case study, demonstrating that Variational Information Bottleneck (VIB) can improve a network's classification calibration as well as its ability to detect out-of-distribution data.

General Classification

TherML: Thermodynamics of Machine Learning

no code implementations 11 Jul 2018 Alexander A. Alemi, Ian Fischer

In this work we offer a framework for reasoning about a wide class of existing objectives in machine learning.

BIG-bench Machine Learning

TherML: The Thermodynamics of Machine Learning

no code implementations 27 Sep 2018 Alexander A. Alemi, Ian Fischer

In this work we offer an information-theoretic framework for representation learning that connects with a wide class of existing objectives in machine learning.

BIG-bench Machine Learning Representation Learning

WAIC, but Why? Generative Ensembles for Robust Anomaly Detection

1 code implementation 2 Oct 2018 Hyunsun Choi, Eric Jang, Alexander A. Alemi

Machine learning models encounter Out-of-Distribution (OoD) errors when the data seen at test time are generated from a different stochastic generator than the one used to generate the training data.

Anomaly Detection Out of Distribution (OOD) Detection

$\beta$-VAEs can retain label information even at high compression

no code implementations 6 Dec 2018 Emily Fertig, Aryan Arbabi, Alexander A. Alemi

In this paper, we investigate the degree to which the encoding of a $\beta$-VAE captures label information across multiple architectures on Binary Static MNIST and Omniglot.

Vocal Bursts Intensity Prediction

On the Use of ArXiv as a Dataset

1 code implementation 30 Apr 2019 Colin B. Clement, Matthew Bierbaum, Kevin P. O'Keeffe, Alexander A. Alemi

We use this pipeline to extract and analyze a 6.7 million edge citation graph, with an 11 billion word corpus of full-text research articles.

Author Attribution Benchmarking +9

On Variational Bounds of Mutual Information

3 code implementations 16 May 2019 Ben Poole, Sherjil Ozair, Aaron van den Oord, Alexander A. Alemi, George Tucker

Estimating and optimizing Mutual Information (MI) is core to many problems in machine learning; however, bounding MI in high dimensions is challenging.

Representation Learning

On Predictive Information Sub-optimality of RNNs

no code implementations 25 Sep 2019 Zhe Dong, Deniz Oktay, Ben Poole, Alexander A. Alemi

Certain biological neurons demonstrate a remarkable capability to optimally compress the history of sensory inputs while being maximally informative about the future.

Information Plane

On Predictive Information in RNNs

no code implementations 21 Oct 2019 Zhe Dong, Deniz Oktay, Ben Poole, Alexander A. Alemi

Certain biological neurons demonstrate a remarkable capability to optimally compress the history of sensory inputs while being maximally informative about the future.

Information Plane

Variational Predictive Information Bottleneck

no code implementations AABI Symposium 2019 Alexander A. Alemi

In classic papers, Zellner demonstrated that Bayesian inference could be derived as the solution to an information theoretic functional.

Bayesian Inference

Information in Infinite Ensembles of Infinitely-Wide Neural Networks

1 code implementation AABI Symposium 2019 Ravid Shwartz-Ziv, Alexander A. Alemi

In this preliminary work, we study the generalization properties of infinite ensembles of infinitely-wide neural networks.

CEB Improves Model Robustness

1 code implementation 13 Feb 2020 Ian Fischer, Alexander A. Alemi

We demonstrate that the Conditional Entropy Bottleneck (CEB) can improve model robustness.

Adversarial Robustness Data Augmentation

Density of States Estimation for Out-of-Distribution Detection

no code implementations 16 Jun 2020 Warren R. Morningstar, Cusuh Ham, Andrew G. Gallagher, Balaji Lakshminarayanan, Alexander A. Alemi, Joshua V. Dillon

Drawing on the statistical physics notion of "density of states," the DoSE decision rule avoids direct comparison of model probabilities, and instead utilizes the "probability of the model probability," or indeed the frequency of any reasonable statistic.

Out-of-Distribution Detection Out of Distribution (OOD) Detection +1

PAC$^m$-Bayes: Narrowing the Empirical Risk Gap in the Misspecified Bayesian Regime

no code implementations 19 Oct 2020 Warren R. Morningstar, Alexander A. Alemi, Joshua V. Dillon

The Bayesian posterior minimizes the "inferential risk" which itself bounds the "predictive risk".

Does Knowledge Distillation Really Work?

2 code implementations NeurIPS 2021 Samuel Stanton, Pavel Izmailov, Polina Kirichenko, Alexander A. Alemi, Andrew Gordon Wilson

Knowledge distillation is a popular technique for training a small student network to emulate a larger teacher model, such as an ensemble of networks.

Knowledge Distillation

Bayesian Imitation Learning for End-to-End Mobile Manipulation

no code implementations 15 Feb 2022 Yuqing Du, Daniel Ho, Alexander A. Alemi, Eric Jang, Mohi Khansari

In this work we investigate and demonstrate benefits of a Bayesian approach to imitation learning from multiple sensor inputs, as applied to the task of opening office doors with a mobile manipulator.

Imitation Learning

Weighted Ensemble Self-Supervised Learning

no code implementations 18 Nov 2022 Yangjun Ruan, Saurabh Singh, Warren Morningstar, Alexander A. Alemi, Sergey Ioffe, Ian Fischer, Joshua V. Dillon

Ensembling has proven to be a powerful technique for boosting model performance, uncertainty estimation, and robustness in supervised learning.

Self-Supervised Learning

Variational Prediction

no code implementations 14 Jul 2023 Alexander A. Alemi, Ben Poole

In this paper, we present variational prediction, a technique for directly learning a variational approximation to the posterior predictive distribution using a variational bound.

Bayesian Inference

Speed Limits for Deep Learning

no code implementations 27 Jul 2023 Inbar Seroussi, Alexander A. Alemi, Moritz Helias, Zohar Ringel

State-of-the-art neural networks require extreme computational power to train.
