Understanding which inductive biases could be helpful for the unsupervised learning of object-centric representations of natural scenes is challenging.
Image super-resolution (SR) techniques are used to generate a high-resolution image from a low-resolution image.
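As a point of reference for what SR methods improve upon, the simplest way to produce a high-resolution grid from a low-resolution one is non-learned interpolation. The sketch below (names and the nearest-neighbour choice are illustrative, not from the paper) shows the naive baseline that learned SR models aim to beat:

```python
import numpy as np

def upscale_nearest(img, factor):
    """Nearest-neighbour upscaling: each low-resolution pixel becomes a
    factor x factor block. A non-learned baseline, not an SR method itself."""
    return np.kron(img, np.ones((factor, factor), dtype=img.dtype))

lr = np.array([[1, 2],
               [3, 4]])
hr = upscale_nearest(lr, 2)
print(hr)
# [[1 1 2 2]
#  [1 1 2 2]
#  [3 3 4 4]
#  [3 3 4 4]]
```

Learned SR replaces this fixed interpolation with a model that predicts plausible high-frequency detail.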
We assess the explanation plausibility in terms of identification and localization, by comparing model-selected with dermatologist-selected explanations, and gradient-weighted class-activation maps with dermatologist explanation maps.
In this work we extend a message passing neural network designed specifically for predicting properties of molecules and materials with a calibrated probabilistic predictive distribution.
By training 240 representations and over 10,000 reinforcement learning (RL) policies on a simulated robotic setup, we evaluate to what extent different properties of pretrained VAE-based representations affect the OOD generalization of downstream agents.
The idea behind object-centric representation learning is that natural scenes are better modeled as compositions of objects and their relations than as distributed representations.
Learning data representations that are useful for various downstream tasks is a cornerstone of artificial intelligence.
Empirically, for the training of both continuous and discrete generative models, the proposed method yields superior variance reduction, resulting in an SNR for IWAE that increases with $K$ without relying on the reparameterization trick.
Learning meaningful representations that disentangle the underlying structure of the data generating process is considered to be of key importance in machine learning.
This paper introduces novel results for the score function gradient estimator of the importance weighted variational bound (IWAE).
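For background, the importance weighted bound that the score function estimator targets can be estimated by Monte Carlo; the toy Gaussian model and variable names below are illustrative sketches, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def iwae_bound(log_p_joint, log_q, x, K=64):
    """Monte Carlo estimate of the K-sample importance weighted bound
    L_K = E_q[ log (1/K) sum_k p(x, z_k) / q(z_k | x) ]."""
    z = rng.normal(loc=1.0, scale=1.0, size=K)  # K samples from q(z|x) = N(1, 1)
    log_w = log_p_joint(x, z) - log_q(z)        # log importance weights
    # log-mean-exp over the K weights, computed stably
    return np.logaddexp.reduce(log_w) - np.log(K)

# Toy model: p(z) = N(0, 1), p(x|z) = N(z, 1), proposal q(z|x) = N(1, 1)
log_p_joint = lambda x, z: -0.5 * (z**2 + (x - z)**2) - np.log(2 * np.pi)
log_q = lambda z: -0.5 * (z - 1.0)**2 - 0.5 * np.log(2 * np.pi)

print(iwae_bound(log_p_joint, log_q, x=1.0, K=1024))
```

Increasing K tightens the bound toward log p(x); the paper's contribution concerns the signal-to-noise ratio of gradients of this quantity.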
Normalizing flows and variational autoencoders are powerful generative models that can represent complicated density functions.
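For context, the density a normalizing flow represents follows from the change-of-variables formula; with an invertible map $f$ and base density $p_z$ (standard notation, not specific to this paper):

```latex
\log p_x(x) = \log p_z\bigl(f(x)\bigr)
            + \log \left| \det \frac{\partial f(x)}{\partial x} \right|
```

The flow's expressiveness comes from composing many such invertible maps while keeping the Jacobian determinant tractable.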
In this paper we close the performance gap by constructing VAE models that can effectively utilize a deep hierarchy of stochastic variables and model complex covariance structures.
We believe our proposed architecture can be used for many real-life information extraction tasks where word classification cannot be applied due to a lack of the required word-level labels.
Recording atomic-resolution transmission electron microscopy (TEM) images is becoming increasingly routine.
There have been multiple attempts with variational auto-encoders (VAE) to learn powerful global representations of complex data using a combination of latent stochastic variables and an autoregressive model over the dimensions of the data.
Humans possess an ability to abstractly reason about objects and their interactions, an ability not shared with state-of-the-art deep learning models.
We achieve state-of-the-art results on the bAbI textual question-answering dataset with the recurrent relational network, consistently solving 20/20 tasks.
This paper takes a step towards temporal reasoning in a dynamically changing video, not in the pixel space that constitutes its frames, but in a latent space that describes the non-linear dynamics of the objects in its world.
We describe a recurrent neural network model that can capture long range context and compare it to a baseline logistic regression model corresponding to the current CloudScan production system.
Deep generative models trained with large amounts of unlabelled data have proven to be powerful within the domain of unsupervised learning.
Most existing Neural Machine Translation models use groups of characters or whole words as their unit of input and output.
Our approach extends the framework of (generalized) approximate message passing, which assumes zero-mean i.i.d. entries of the measurement matrix, to a general class of random matrix ensembles.
The auxiliary variables leave the generative model unchanged but make the variational distribution more expressive.
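Concretely, the auxiliary-variable construction can be sketched as follows (standard formulation of the auxiliary variational bound; notation is illustrative): the joint variational distribution factorizes as $q(a, z \mid x) = q(a \mid x)\, q(z \mid a, x)$, so the implied marginal $q(z \mid x) = \int q(z \mid a, x)\, q(a \mid x)\, da$ is a mixture and hence more flexible, while the bound becomes

```latex
\log p(x) \;\ge\; \mathbb{E}_{q(a, z \mid x)}
\left[ \log \frac{p(x, z)\, p(a \mid x, z)}{q(a \mid x)\, q(z \mid a, x)} \right]
```

where $p(a \mid x, z)$ is an auxiliary model term that leaves the marginal generative model $p(x, z)$ unchanged.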
We present an autoencoder that leverages learned representations to better measure similarities in data space.
We investigate different down-sampling factors (the ratio of pixels in the input and output) for the SPN and show that the RNN-SPN model is able to down-sample the input images without deteriorating performance.
In this work, we address the problem of solving a series of underdetermined linear inverse problems subject to a sparsity constraint.
We are interested in solving the multiple measurement vector (MMV) problem for instances where the underlying sparsity pattern exhibits spatio-temporal structure, motivated by the electroencephalogram (EEG) source localization problem.
Applying traditional collaborative filtering to digital publishing is challenging because user data is very sparse due to the high volume of documents relative to the number of users.
Recurrent neural networks are a generalization of feedforward neural networks that naturally handles sequential data.
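The recurrence that makes this possible can be sketched in a few lines; the plain (Elman-style) cell below uses illustrative shapes and weights, not those of any model in the text:

```python
import numpy as np

rng = np.random.default_rng(0)

def rnn_forward(x_seq, W_x, W_h, b):
    """Plain recurrence: h_t = tanh(W_x x_t + W_h h_{t-1} + b).
    The same weights are reused at every time step, so the network
    handles sequences of arbitrary length."""
    h = np.zeros(W_h.shape[0])
    hs = []
    for x_t in x_seq:
        h = np.tanh(W_x @ x_t + W_h @ h + b)
        hs.append(h)
    return np.stack(hs)

# A length-5 sequence of 3-d inputs mapped to 4-d hidden states
x_seq = rng.normal(size=(5, 3))
W_x = rng.normal(size=(4, 3)) * 0.1
W_h = rng.normal(size=(4, 4)) * 0.1
b = np.zeros(4)
hs = rnn_forward(x_seq, W_x, W_h, b)
print(hs.shape)  # (5, 4)
```

The hidden state carries information forward through time, which is what lets such models capture long-range context.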
The future predictive performance of a Bayesian model can be estimated using Bayesian cross-validation.
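For reference, the standard leave-one-out form of this estimate (textbook definition, not the paper's specific estimator) sums the log predictive density of each observation under the posterior fitted to the remaining data:

```latex
\mathrm{elpd}_{\mathrm{loo}} = \sum_{i=1}^{n} \log p(y_i \mid y_{-i}),
\qquad
p(y_i \mid y_{-i}) = \int p(y_i \mid \theta)\, p(\theta \mid y_{-i})\, d\theta
```

Computing each $p(\theta \mid y_{-i})$ exactly requires refitting the model $n$ times, which motivates approximate corrections.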
A perturbative expansion is made of the exact but intractable correction, which can be applied to the model's partition function and other moments of interest.
In this paper we present a novel approach to learn directed acyclic graphs (DAG) and factor models within the same framework while also allowing for model comparison between them.