Search Results for author: David J. Fleet

Found 39 papers, 19 papers with code

A Personalized Video-Based Hand Taxonomy: Application for Individuals with Spinal Cord Injury

no code implementations • 26 Mar 2024 • Mehdy Dousty, David J. Fleet, José Zariffa

A comprehensive evaluation of function in home and community settings requires a hand grasp taxonomy for individuals with impaired hand function.

Zero-Shot Metric Depth with a Field-of-View Conditioned Diffusion Model

no code implementations • 20 Dec 2023 • Saurabh Saxena, Junhwa Hur, Charles Herrmann, Deqing Sun, David J. Fleet

In contrast, we advocate a generic, task-agnostic diffusion model, with several advancements such as log-scale depth parameterization to enable joint modeling of indoor and outdoor scenes, conditioning on the field-of-view (FOV) to handle scale ambiguity and synthetically augmenting FOV during training to generalize beyond the limited camera intrinsics in training datasets.
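The log-scale depth parameterization mentioned above can be illustrated with a minimal sketch. The depth range below (`d_min`, `d_max`) is a hypothetical choice for illustration, not the paper's:

```python
import numpy as np

def depth_to_log_scale(depth, d_min=0.5, d_max=80.0):
    """Map metric depth to [-1, 1] on a log scale (toy ranges, not the paper's)."""
    log_d = np.log(np.clip(depth, d_min, d_max))
    return 2.0 * (log_d - np.log(d_min)) / (np.log(d_max) - np.log(d_min)) - 1.0
```

A log-scale mapping of this kind allocates comparable resolution to nearby indoor depths and distant outdoor depths, which is what makes joint indoor/outdoor modeling plausible.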

Denoising Monocular Depth Estimation

Synthetic Data from Diffusion Models Improves ImageNet Classification

no code implementations • 17 Apr 2023 • Shekoofeh Azizi, Simon Kornblith, Chitwan Saharia, Mohammad Norouzi, David J. Fleet

Deep generative models are becoming increasingly powerful, now generating diverse high fidelity photo-realistic samples given text prompts.

Classification Data Augmentation

Monocular Depth Estimation using Diffusion Models

no code implementations • 28 Feb 2023 • Saurabh Saxena, Abhishek Kar, Mohammad Norouzi, David J. Fleet

To cope with the limited availability of data for supervised training, we leverage pre-training on self-supervised image-to-image translation tasks.

Ranked #22 on Monocular Depth Estimation on NYU-Depth V2 (using extra training data)

Denoising Image-to-Image Translation +3

RobustNeRF: Ignoring Distractors with Robust Losses

1 code implementation • CVPR 2023 • Sara Sabour, Suhani Vora, Daniel Duckworth, Ivan Krasin, David J. Fleet, Andrea Tagliasacchi

To cope with distractors, we advocate a form of robust estimation for NeRF training, modeling distractors in training data as outliers of an optimization problem.
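Robust estimation of this kind can be sketched with a simple trimmed loss that discards the largest per-pixel residuals as presumed distractors. This is an illustrative stand-in, not the paper's exact robust estimator:

```python
import numpy as np

def trimmed_photometric_loss(residuals, inlier_frac=0.8):
    """Mean squared error over the smallest `inlier_frac` of residuals.

    Pixels with the largest residuals are treated as outliers (distractors)
    and excluded from the objective. `inlier_frac` is a toy hyperparameter.
    """
    r = np.sort(np.abs(np.asarray(residuals)).ravel())
    k = max(1, int(inlier_frac * r.size))
    return float(np.mean(r[:k] ** 2))
```

Because the largest residuals are dropped, a handful of transient objects (people, shadows) cannot dominate the reconstruction objective the way they would under a plain mean squared error.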

Imagen Editor and EditBench: Advancing and Evaluating Text-Guided Image Inpainting

no code implementations • CVPR 2023 • Su Wang, Chitwan Saharia, Ceslee Montgomery, Jordi Pont-Tuset, Shai Noy, Stefano Pellegrini, Yasumasa Onoe, Sarah Laszlo, David J. Fleet, Radu Soricut, Jason Baldridge, Mohammad Norouzi, Peter Anderson, William Chan

Through extensive human evaluation on EditBench, we find that object-masking during training leads to across-the-board improvements in text-image alignment -- such that Imagen Editor is preferred over DALL-E 2 and Stable Diffusion -- and, as a cohort, these models are better at object-rendering than text-rendering, and handle material/color/size attributes better than count/shape attributes.

Image Inpainting Object +1

Gaussian-Bernoulli RBMs Without Tears

1 code implementation • 19 Oct 2022 • Renjie Liao, Simon Kornblith, Mengye Ren, David J. Fleet, Geoffrey Hinton

We revisit the challenging problem of training Gaussian-Bernoulli restricted Boltzmann machines (GRBMs), introducing two innovations.

A Unified Sequence Interface for Vision Tasks

1 code implementation • 15 Jun 2022 • Ting Chen, Saurabh Saxena, Lala Li, Tsung-Yi Lin, David J. Fleet, Geoffrey Hinton

Despite that, by formulating the output of each task as a sequence of discrete tokens with a unified interface, we show that one can train a neural network with a single model architecture and loss function on all these tasks, with no task-specific customization.

Image Captioning Instance Segmentation +2

Residual Multiplicative Filter Networks for Multiscale Reconstruction

1 code implementation • 1 Jun 2022 • Shayan Shekarforoush, David B. Lindell, David J. Fleet, Marcus A. Brubaker

Coordinate networks like Multiplicative Filter Networks (MFNs) and BACON offer some control over the frequency spectrum used to represent continuous signals such as images or 3D volumes.

Palette: Image-to-Image Diffusion Models

5 code implementations • 10 Nov 2021 • Chitwan Saharia, William Chan, Huiwen Chang, Chris A. Lee, Jonathan Ho, Tim Salimans, David J. Fleet, Mohammad Norouzi

We expect this standardized evaluation protocol to play a role in advancing image-to-image translation research.

Colorization Denoising +5

Cascaded Diffusion Models for High Fidelity Image Generation

no code implementations • 30 May 2021 • Jonathan Ho, Chitwan Saharia, William Chan, David J. Fleet, Mohammad Norouzi, Tim Salimans

We show that cascaded diffusion models are capable of generating high fidelity images on the class-conditional ImageNet generation benchmark, without any assistance from auxiliary image classifiers to boost sample quality.

Data Augmentation Image Generation +2

Bridging the Gap Between Adversarial Robustness and Optimization Bias

1 code implementation • 17 Feb 2021 • Fartash Faghri, Sven Gowal, Cristina Vasconcelos, David J. Fleet, Fabian Pedregosa, Nicolas Le Roux

We demonstrate that the choice of optimizer, neural network architecture, and regularizer significantly affect the adversarial robustness of linear neural networks, providing guarantees without the need for adversarial training.

Adversarial Robustness

A Study of Gradient Variance in Deep Learning

1 code implementation • 9 Jul 2020 • Fartash Faghri, David Duvenaud, David J. Fleet, Jimmy Ba

We introduce a method, Gradient Clustering, to minimize the variance of average mini-batch gradient with stratified sampling.
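Why stratified sampling over gradient clusters reduces mini-batch variance can be seen on a toy one-dimensional example. The two synthetic clusters below are an assumption for illustration, not the paper's procedure:

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy per-example "gradients" forming two well-separated clusters.
grads = np.concatenate([rng.normal(-5, 0.1, 500), rng.normal(5, 0.1, 500)])

def iid_estimate(batch=10, trials=2000):
    """Variance of the mini-batch mean under uniform i.i.d. sampling."""
    return np.var([grads[rng.integers(0, 1000, batch)].mean() for _ in range(trials)])

def stratified_estimate(batch=10, trials=2000):
    """Variance when half the batch is drawn from each cluster (stratum)."""
    return np.var([
        np.concatenate([grads[rng.integers(0, 500, batch // 2)],
                        grads[rng.integers(500, 1000, batch // 2)]]).mean()
        for _ in range(trials)
    ])
```

With i.i.d. sampling the between-cluster spread dominates the mini-batch variance; fixing the per-stratum counts removes that component, leaving only the small within-cluster variance.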

Exemplar VAE: Linking Generative Models, Nearest Neighbor Retrieval, and Data Augmentation

1 code implementation • NeurIPS 2020 • Sajad Norouzi, David J. Fleet, Mohammad Norouzi

We introduce Exemplar VAEs, a family of generative models that bridge the gap between parametric and non-parametric, exemplar based generative models.

Data Augmentation Density Estimation +3

SentenceMIM: A Latent Variable Language Model

1 code implementation • 18 Feb 2020 • Micha Livne, Kevin Swersky, David J. Fleet

MIM learning encourages high mutual information between observations and latent variables, and is robust against posterior collapse.

Ranked #1 on Question Answering on YahooCQA (using extra training data)

Language Modelling Question Answering +1

MIM: Mutual Information Machine

1 code implementation • 8 Oct 2019 • Micha Livne, Kevin Swersky, David J. Fleet

Experiments show that MIM learns representations with high mutual information, consistent encoding and decoding distributions, effective latent clustering, and data log likelihood comparable to VAE, while avoiding posterior collapse.

Clustering Decoder

High Mutual Information in Representation Learning with Symmetric Variational Inference

no code implementations • 4 Oct 2019 • Micha Livne, Kevin Swersky, David J. Fleet

We introduce the Mutual Information Machine (MIM), a novel formulation of representation learning, using a joint distribution over the observations and latent state in an encoder/decoder framework.

Decoder Representation Learning +2

Walking on Thin Air: Environment-Free Physics-based Markerless Motion Capture

no code implementations • 4 Dec 2018 • Micha Livne, Leonid Sigal, Marcus A. Brubaker, David J. Fleet

To our knowledge, this is the first approach to take physics into account without explicit {\em a priori} knowledge of the environment or body dimensions.

Markerless Motion Capture

TzK Flow - Conditional Generative Model

no code implementations • 5 Nov 2018 • Micha Livne, David J. Fleet

Unlike autoencoders, the bottleneck does not limit model expressiveness, similar to other flow-based models.

Transductive Log Opinion Pool of Gaussian Process Experts

no code implementations • 24 Nov 2015 • Yanshuai Cao, David J. Fleet

We introduce a framework for analyzing transductive combination of Gaussian process (GP) experts, where independently trained GP experts are combined in a way that depends on test point location, in order to scale GPs to big data.

Adversarial Manipulation of Deep Representations

2 code implementations • 16 Nov 2015 • Sara Sabour, Yanshuai Cao, Fartash Faghri, David J. Fleet

We show that the representation of an image in a deep neural network (DNN) can be manipulated to mimic those of other natural images, with only minor, imperceptible perturbations to the original image.
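The core optimization can be sketched with gradient descent on a toy linear feature extractor standing in for the DNN; the matrix `W`, the step size, and the iteration count below are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(16, 8))       # toy linear "feature extractor" (stand-in for a DNN)
x_src, x_tgt = rng.normal(size=(2, 16))

x = x_src.copy()
before = float(np.sum((W.T @ x - W.T @ x_tgt) ** 2))
for _ in range(500):
    # Gradient of the feature-matching objective || W^T x - W^T x_tgt ||^2 w.r.t. x.
    grad = 2.0 * W @ (W.T @ x - W.T @ x_tgt)
    x -= 0.01 * grad
after = float(np.sum((W.T @ x - W.T @ x_tgt) ** 2))
```

In the paper's setting the perturbation is additionally kept imperceptibly small in image space; the sketch above only shows the representation-matching part of the objective.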

Efficient non-greedy optimization of decision trees

no code implementations • NeurIPS 2015 • Mohammad Norouzi, Maxwell D. Collins, Matthew Johnson, David J. Fleet, Pushmeet Kohli

In this paper, we present an algorithm for optimizing the split functions at all levels of the tree jointly with the leaf parameters, based on a global objective.

Structured Prediction

CO2 Forest: Improved Random Forest by Continuous Optimization of Oblique Splits

no code implementations • 19 Jun 2015 • Mohammad Norouzi, Maxwell D. Collins, David J. Fleet, Pushmeet Kohli

We develop a convex-concave upper bound on the classification loss for a one-level decision tree, and optimize the bound by stochastic gradient descent at each internal node of the tree.

General Classification Multi-class Classification

Building Proteins in a Day: Efficient 3D Molecular Reconstruction

no code implementations • CVPR 2015 • Marcus A. Brubaker, Ali Punjani, David J. Fleet

A new framework for estimation is introduced which relies on modern stochastic optimization techniques to scale to large datasets.

3D Reconstruction Stochastic Optimization

Generalized Product of Experts for Automatic and Principled Fusion of Gaussian Process Predictions

no code implementations • 28 Oct 2014 • Yanshuai Cao, David J. Fleet

In this work, we propose a generalized product of experts (gPoE) framework for combining the predictions of multiple probabilistic models.
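The standard product-of-experts fusion of Gaussian predictions, with per-expert weights as in gPoE, can be written compactly (a generic sketch of the precision-weighted combination, not the paper's full framework for choosing the weights):

```python
import numpy as np

def gpoe_combine(means, variances, weights):
    """Fuse independent Gaussian expert predictions.

    Each expert i contributes precision weights[i] / variances[i]; the fused
    prediction is the precision-weighted mean with the summed precision.
    """
    means, variances, weights = map(np.asarray, (means, variances, weights))
    precisions = weights / variances
    var = 1.0 / precisions.sum()
    mu = var * (precisions * means).sum()
    return float(mu), float(var)
```

Setting a weight to zero removes an expert entirely, so input-dependent weights let the combination down-weight experts that are unreliable near a given test point.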

Gaussian Processes

Efficient Optimization for Sparse Gaussian Process Regression

no code implementations • NeurIPS 2013 • Yanshuai Cao, Marcus A. Brubaker, David J. Fleet, Aaron Hertzmann

We propose an efficient optimization algorithm for selecting a subset of training data to induce sparsity for Gaussian process regression.

Fast Exact Search in Hamming Space with Multi-Index Hashing

2 code implementations • 11 Jul 2013 • Mohammad Norouzi, Ali Punjani, David J. Fleet

There is growing interest in representing image data and feature descriptors using compact binary codes for fast near neighbor search.
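The pigeonhole idea underlying multi-index hashing can be checked directly: if two binary codes are within Hamming radius r, then after splitting them into m disjoint substrings at least one substring pair differs by at most r // m, so candidates can be found by exact lookups in m small hash tables. A minimal sketch of the two ingredients:

```python
def hamming(a, b):
    """Hamming distance between two integer-encoded binary codes."""
    return bin(a ^ b).count("1")

def split_code(code, m=4, bits=64):
    """Split a `bits`-bit code into m disjoint substrings (low bits first)."""
    w = bits // m
    mask = (1 << w) - 1
    return [(code >> (i * w)) & mask for i in range(m)]
```

Searching each substring table within radius r // m is exponentially cheaper than searching the full code within radius r, which is what makes exact search in Hamming space fast.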

Cartesian K-Means

1 code implementation • CVPR 2013 • Mohammad Norouzi, David J. Fleet

A fundamental limitation of quantization techniques like the k-means clustering algorithm is the storage and runtime cost associated with the large numbers of clusters required to keep quantization errors small and model fidelity high.
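Compositional codes of the kind Cartesian k-means (and product quantization) use sidestep that cost: with m subspaces of k centers each, one represents k^m effective centers while storing only m*k of them. A toy sketch with arbitrary sizes (m=2 subspaces, k=4 centers, random codebooks):

```python
import numpy as np

rng = np.random.default_rng(1)
# Two 2-D subspace codebooks of 4 centers each: 16 effective centers, 8 stored.
codebooks = [rng.normal(size=(4, 2)) for _ in range(2)]

def encode(x):
    """Quantize each subspace of x independently to its nearest center."""
    return tuple(int(np.argmin(((x[2 * i:2 * i + 2] - C) ** 2).sum(1)))
                 for i, C in enumerate(codebooks))

def decode(code):
    """Reconstruct the vector from the per-subspace center indices."""
    return np.concatenate([codebooks[i][c] for i, c in enumerate(code)])
```

Cartesian k-means additionally learns a rotation so the subspace decomposition fits the data; the sketch above omits that and shows only the compositional encoding.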

Clustering Object Recognition +2

Hamming Distance Metric Learning

no code implementations • NeurIPS 2012 • Mohammad Norouzi, David J. Fleet, Ruslan R. Salakhutdinov

Motivated by large-scale multimedia applications we propose to learn mappings from high-dimensional data to binary codes that preserve semantic similarity.

General Classification Metric Learning +3
