Search Results for author: Ruslan R. Salakhutdinov

Found 22 papers, 1 paper with code

Using Deep Belief Nets to Learn Covariance Kernels for Gaussian Processes

no code implementations • NeurIPS 2007 • Geoffrey E. Hinton, Ruslan R. Salakhutdinov

We show how to use unlabeled data and a deep belief net (DBN) to learn a good covariance kernel for a Gaussian process.

Gaussian Processes General Classification +1

Evaluating probabilities under high-dimensional latent variable models

no code implementations • NeurIPS 2008 • Iain Murray, Ruslan R. Salakhutdinov

We present a simple new Monte Carlo algorithm for evaluating probabilities of observations in complex latent variable models, such as Deep Belief Networks.

Vocal Bursts Intensity Prediction

Replicated Softmax: an Undirected Topic Model

no code implementations • NeurIPS 2009 • Geoffrey E. Hinton, Ruslan R. Salakhutdinov

Each member of the family models the probability distribution of documents of a specific length as a product of topic-specific distributions rather than as a mixture, and this gives much better generalization than Latent Dirichlet Allocation for modeling the log probabilities of held-out documents.

Learning in Markov Random Fields using Tempered Transitions

no code implementations • NeurIPS 2009 • Ruslan R. Salakhutdinov

Markov random fields (MRFs), or undirected graphical models, provide a powerful framework for modeling complex dependencies among random variables.

Object Recognition

Modelling Relational Data using Bayesian Clustered Tensor Factorization

no code implementations • NeurIPS 2009 • Ilya Sutskever, Joshua B. Tenenbaum, Ruslan R. Salakhutdinov

We consider the problem of learning probabilistic models for complex relational structures between various types of objects.

Clustering

Practical Large-Scale Optimization for Max-norm Regularization

no code implementations • NeurIPS 2010 • Jason D. Lee, Ben Recht, Nathan Srebro, Joel Tropp, Ruslan R. Salakhutdinov

The max-norm was proposed as a convex matrix regularizer by Srebro et al. (2004) and was shown to be empirically superior to the trace-norm for collaborative filtering problems.

Clustering Collaborative Filtering

Collaborative Filtering in a Non-Uniform World: Learning with the Weighted Trace Norm

no code implementations • NeurIPS 2010 • Nathan Srebro, Ruslan R. Salakhutdinov

We show that matrix completion with trace-norm regularization can be significantly hurt when entries of the matrix are sampled non-uniformly, but that a properly weighted version of the trace-norm regularizer works well with non-uniform sampling.

Collaborative Filtering Matrix Completion

Learning to Learn with Compound HD Models

no code implementations • NeurIPS 2011 • Antonio Torralba, Joshua B. Tenenbaum, Ruslan R. Salakhutdinov

We introduce HD (or "Hierarchical-Deep") models, a new compositional learning architecture that integrates deep learning models with structured hierarchical Bayesian models.

Novel Concepts Object Recognition

Learning with the weighted trace-norm under arbitrary sampling distributions

no code implementations • NeurIPS 2011 • Rina Foygel, Ohad Shamir, Nati Srebro, Ruslan R. Salakhutdinov

We provide rigorous guarantees on learning with the weighted trace-norm under arbitrary sampling distributions.

Improving neural networks by preventing co-adaptation of feature detectors

11 code implementations • 3 Jul 2012 • Geoffrey E. Hinton, Nitish Srivastava, Alex Krizhevsky, Ilya Sutskever, Ruslan R. Salakhutdinov

When a large feedforward neural network is trained on a small training set, it typically performs poorly on held-out test data.

Ranked #205 on Image Classification on CIFAR-10

Image Classification Object Recognition

Matrix reconstruction with the local max norm

no code implementations • NeurIPS 2012 • Rina Foygel, Nathan Srebro, Ruslan R. Salakhutdinov

We introduce a new family of matrix norms, the "local max" norms, generalizing existing methods such as the max norm, the trace norm (nuclear norm), and the weighted or smoothed weighted trace norms, which have been extensively used in the literature as regularizers for matrix reconstruction problems.

Hamming Distance Metric Learning

no code implementations • NeurIPS 2012 • Mohammad Norouzi, David J. Fleet, Ruslan R. Salakhutdinov

Motivated by large-scale multimedia applications, we propose to learn mappings from high-dimensional data to binary codes that preserve semantic similarity.

General Classification Metric Learning +3

Cardinality Restricted Boltzmann Machines

no code implementations • NeurIPS 2012 • Kevin Swersky, Ilya Sutskever, Daniel Tarlow, Richard S. Zemel, Ruslan R. Salakhutdinov, Ryan P. Adams

The Restricted Boltzmann Machine (RBM) is a popular density model that is also good for extracting features.

A Better Way to Pretrain Deep Boltzmann Machines

no code implementations • NeurIPS 2012 • Geoffrey E. Hinton, Ruslan R. Salakhutdinov

We describe how the pre-training algorithm for Deep Boltzmann Machines (DBMs) is related to the pre-training algorithm for Deep Belief Networks, and we show that, under certain conditions, the pre-training procedure improves the variational lower bound of a two-hidden-layer DBM.

Multimodal Learning with Deep Boltzmann Machines

no code implementations • NeurIPS 2012 • Nitish Srivastava, Ruslan R. Salakhutdinov

Our experimental results on bi-modal data consisting of images and text show that the Multimodal DBM can learn a good generative model of the joint space of image and text inputs that is useful for information retrieval from both unimodal and multimodal queries.

Information Retrieval Retrieval +2

Modeling Documents with Deep Boltzmann Machines

no code implementations • 26 Sep 2013 • Nitish Srivastava, Ruslan R. Salakhutdinov, Geoffrey E. Hinton

We introduce a Deep Boltzmann Machine model suitable for modeling and extracting latent semantic representations from a large unstructured collection of documents.

Document Classification General Classification +1

One-shot learning by inverting a compositional causal process

no code implementations • NeurIPS 2013 • Brenden M. Lake, Ruslan R. Salakhutdinov, Josh Tenenbaum

People can learn a new visual class from just one example, yet machine learning algorithms typically require hundreds or thousands of examples to tackle the same problems.

General Classification One-Shot Learning

Discriminative Transfer Learning with Tree-based Priors

no code implementations • NeurIPS 2013 • Nitish Srivastava, Ruslan R. Salakhutdinov

The tree structure can be used to impose a generative prior over classification parameters.

Ranked #183 on Image Classification on CIFAR-100

Classification General Classification +2

Annealing between distributions by averaging moments

no code implementations • NeurIPS 2013 • Roger B. Grosse, Chris J. Maddison, Ruslan R. Salakhutdinov

Many powerful Monte Carlo techniques for estimating partition functions, such as annealed importance sampling (AIS), are based on sampling from a sequence of intermediate distributions which interpolate between a tractable initial distribution and an intractable target distribution.

Learning Stochastic Feedforward Neural Networks

no code implementations • NeurIPS 2013 • Yichuan Tang, Ruslan R. Salakhutdinov

As regressors, MLPs model the conditional distribution of the predictor variables Y given the input variables X.

General Classification Structured Prediction

How Many Samples are Needed to Estimate a Convolutional Neural Network?

no code implementations • NeurIPS 2018 • Simon S. Du, Yining Wang, Xiyu Zhai, Sivaraman Balakrishnan, Ruslan R. Salakhutdinov, Aarti Singh

We show that for an $m$-dimensional convolutional filter with linear activation acting on a $d$-dimensional input, the sample complexity of achieving population prediction error of $\epsilon$ is $\widetilde{O}(m/\epsilon^2)$, whereas the sample-complexity for its FNN counterpart is lower bounded by $\Omega(d/\epsilon^2)$ samples.

LEMMA

GLoMo: Unsupervised Learning of Transferable Relational Graphs

no code implementations • NeurIPS 2018 • Zhilin Yang, Jake Zhao, Bhuwan Dhingra, Kaiming He, William W. Cohen, Ruslan R. Salakhutdinov, Yann LeCun

We also show that the learned graphs are generic enough to be transferred to different embeddings on which the graphs have not been trained (including GloVe embeddings, ELMo embeddings, and task-specific RNN hidden units), or embedding-free units such as image pixels.

Image Classification Natural Language Inference +4
