no code implementations • NeurIPS 2007 • Geoffrey E. Hinton, Ruslan R. Salakhutdinov
We show how to use unlabeled data and a deep belief net (DBN) to learn a good covariance kernel for a Gaussian process.
no code implementations • NeurIPS 2008 • Iain Murray, Ruslan R. Salakhutdinov
We present a simple new Monte Carlo algorithm for evaluating probabilities of observations in complex latent variable models, such as Deep Belief Networks.
no code implementations • NeurIPS 2009 • Geoffrey E. Hinton, Ruslan R. Salakhutdinov
Each member of the family models the probability distribution of documents of a specific length as a product of topic-specific distributions rather than as a mixture and this gives much better generalization than Latent Dirichlet Allocation for modeling the log probabilities of held-out documents.
no code implementations • NeurIPS 2009 • Ruslan R. Salakhutdinov
Markov random fields (MRFs), or undirected graphical models, provide a powerful framework for modeling complex dependencies among random variables.
no code implementations • NeurIPS 2009 • Ilya Sutskever, Joshua B. Tenenbaum, Ruslan R. Salakhutdinov
We consider the problem of learning probabilistic models for complex relational structures between various types of objects.
no code implementations • NeurIPS 2010 • Jason D. Lee, Ben Recht, Nathan Srebro, Joel Tropp, Ruslan R. Salakhutdinov
The max-norm was proposed as a convex matrix regularizer by Srebro et al (2004) and was shown to be empirically superior to the trace-norm for collaborative filtering problems.
no code implementations • NeurIPS 2010 • Nathan Srebro, Ruslan R. Salakhutdinov
We show that matrix completion with trace-norm regularization can be significantly hurt when entries of the matrix are sampled non-uniformly, but that a properly weighted version of the trace-norm regularizer works well with non-uniform sampling.
no code implementations • NeurIPS 2011 • Antonio Torralba, Joshua B. Tenenbaum, Ruslan R. Salakhutdinov
We introduce HD (or ``Hierarchical-Deep'') models, a new compositional learning architecture that integrates deep learning models with structured hierarchical Bayesian models.
no code implementations • NeurIPS 2011 • Rina Foygel, Ohad Shamir, Nati Srebro, Ruslan R. Salakhutdinov
We provide rigorous guarantees on learning with the weighted trace-norm under arbitrary sampling distributions.
11 code implementations • 3 Jul 2012 • Geoffrey E. Hinton, Nitish Srivastava, Alex Krizhevsky, Ilya Sutskever, Ruslan R. Salakhutdinov
When a large feedforward neural network is trained on a small training set, it typically performs poorly on held-out test data.
Ranked #205 on Image Classification on CIFAR-10
no code implementations • NeurIPS 2012 • Rina Foygel, Nathan Srebro, Ruslan R. Salakhutdinov
We introduce a new family of matrix norms, the ''local max'' norms, generalizing existing methods such as the max norm, the trace norm (nuclear norm), and the weighted or smoothed weighted trace norms, which have been extensively used in the literature as regularizers for matrix reconstruction problems.
no code implementations • NeurIPS 2012 • Mohammad Norouzi, David J. Fleet, Ruslan R. Salakhutdinov
Motivated by large-scale multimedia applications we propose to learn mappings from high-dimensional data to binary codes that preserve semantic similarity.
no code implementations • NeurIPS 2012 • Kevin Swersky, Ilya Sutskever, Daniel Tarlow, Richard S. Zemel, Ruslan R. Salakhutdinov, Ryan P. Adams
The Restricted Boltzmann Machine (RBM) is a popular density model that is also good for extracting features.
no code implementations • NeurIPS 2012 • Geoffrey E. Hinton, Ruslan R. Salakhutdinov
We describe how the pre-training algorithm for Deep Boltzmann Machines (DBMs) is related to the pre-training algorithm for Deep Belief Networks and we show that under certain conditions, the pre-training procedure improves the variational lower bound of a two-hidden-layer DBM.
no code implementations • NeurIPS 2012 • Nitish Srivastava, Ruslan R. Salakhutdinov
Our experimental results on bi-modal data consisting of images and text show that the Multimodal DBM can learn a good generative model of the joint space of image and text inputs that is useful for information retrieval from both unimodal and multimodal queries.
no code implementations • 26 Sep 2013 • Nitish Srivastava, Ruslan R. Salakhutdinov, Geoffrey E. Hinton
We introduce a Deep Boltzmann Machine model suitable for modeling and extracting latent semantic representations from a large unstructured collection of documents.
no code implementations • NeurIPS 2013 • Brenden M. Lake, Ruslan R. Salakhutdinov, Josh Tenenbaum
People can learn a new visual class from just one example, yet machine learning algorithms typically require hundreds or thousands of examples to tackle the same problems.
no code implementations • NeurIPS 2013 • Nitish Srivastava, Ruslan R. Salakhutdinov
The tree structure can be used to impose a generative prior over classification parameters.
Ranked #183 on Image Classification on CIFAR-100
no code implementations • NeurIPS 2013 • Roger B. Grosse, Chris J. Maddison, Ruslan R. Salakhutdinov
Many powerful Monte Carlo techniques for estimating partition functions, such as annealed importance sampling (AIS), are based on sampling from a sequence of intermediate distributions which interpolate between a tractable initial distribution and an intractable target distribution.
no code implementations • NeurIPS 2013 • Yichuan Tang, Ruslan R. Salakhutdinov
As regressors, MLPs model the conditional distribution of the predictor variables Y given the input variables X.
no code implementations • NeurIPS 2018 • Simon S. Du, Yining Wang, Xiyu Zhai, Sivaraman Balakrishnan, Ruslan R. Salakhutdinov, Aarti Singh
We show that for an $m$-dimensional convolutional filter with linear activation acting on a $d$-dimensional input, the sample complexity of achieving population prediction error of $\epsilon$ is $\widetilde{O(m/\epsilon^2)$, whereas the sample-complexity for its FNN counterpart is lower bounded by $\Omega(d/\epsilon^2)$ samples.
no code implementations • NeurIPS 2018 • Zhilin Yang, Jake Zhao, Bhuwan Dhingra, Kaiming He, William W. Cohen, Ruslan R. Salakhutdinov, Yann Lecun
We also show that the learned graphs are generic enough to be transferred to different embeddings on which the graphs have not been trained (including GloVe embeddings, ELMo embeddings, and task-specific RNN hidden units), or embedding-free units such as image pixels.