Search Results for author: Gregory Shakhnarovich

Found 29 papers, 12 papers with code

Transcrib3D: 3D Referring Expression Resolution through Large Language Models

no code implementations 30 Apr 2024 Jiading Fang, Xiangshan Tan, Shengjie Lin, Igor Vasiljevic, Vitor Guizilini, Hongyuan Mei, Rares Ambrus, Gregory Shakhnarovich, Matthew R Walter

We introduce Transcrib3D, an approach that brings together 3D detection methods and the emergent reasoning capabilities of large language models (LLMs).

Referring Expression

6-DoF Stability Field via Diffusion Models

no code implementations 26 Oct 2023 Takuma Yoneda, Tianchong Jiang, Gregory Shakhnarovich, Matthew R. Walter

A core capability for robot manipulation is reasoning over where and how to stably place objects in cluttered environments.

3D Pose Estimation motion prediction +2

NeRFuser: Large-Scale Scene Representation by NeRF Fusion

1 code implementation 22 May 2023 Jiading Fang, Shengjie Lin, Igor Vasiljevic, Vitor Guizilini, Rares Ambrus, Adrien Gaidon, Gregory Shakhnarovich, Matthew R. Walter

A practical benefit of implicit visual representations like Neural Radiance Fields (NeRFs) is their memory efficiency: large scenes can be efficiently stored and shared as small neural nets instead of collections of images.

Classification Confidence Estimation with Test-Time Data-Augmentation

no code implementations 30 Jun 2020 Yuval Bahat, Gregory Shakhnarovich

This suggests the task of detecting errors, which we tackle in this paper for the case of visual classification.

Classification Data Augmentation +1

Analysis of diversity-accuracy tradeoff in image captioning

2 code implementations 27 Feb 2020 Ruotian Luo, Gregory Shakhnarovich

We investigate the effect of different model architectures, training objectives, hyperparameter settings and decoding procedures on the diversity of automatically generated image captions.

Image Captioning reinforcement-learning +1

DIODE: A Dense Indoor and Outdoor DEpth Dataset

2 code implementations 1 Aug 2019 Igor Vasiljevic, Nick Kolkin, Shanyi Zhang, Ruotian Luo, Haochen Wang, Falcon Z. Dai, Andrea F. Daniele, Mohammadreza Mostajabi, Steven Basart, Matthew R. Walter, Gregory Shakhnarovich

We introduce DIODE, a dataset that contains thousands of diverse high resolution color images with accurate, dense, long-range depth measurements.

Natural and Adversarial Error Detection using Invariance to Image Transformations

no code implementations 1 Feb 2019 Yuval Bahat, Michal Irani, Gregory Shakhnarovich

Our approach is based on the observation that correctly classified images tend to exhibit robust and consistent classifications under certain image transformations (e.g., horizontal flip, small image translation, etc.).


Regularizing Deep Networks by Modeling and Predicting Label Structure

no code implementations CVPR 2018 Mohammadreza Mostajabi, Michael Maire, Gregory Shakhnarovich

Our technique is applicable when the ground-truth labels themselves exhibit internal structure; we derive a regularizer by learning an autoencoder over the set of annotations.

Semantic Segmentation

Confidence from Invariance to Image Transformations

1 code implementation 2 Apr 2018 Yuval Bahat, Gregory Shakhnarovich

We develop a technique for automatically detecting the classification errors of a pre-trained visual classifier.

General Classification Novelty Detection

Discriminability objective for training descriptive captions

1 code implementation CVPR 2018 Ruotian Luo, Brian Price, Scott Cohen, Gregory Shakhnarovich

One property that remains lacking in image captions generated by contemporary methods is discriminability: being able to tell two images apart given the caption for one of them.

Caption Generation Descriptive +1

Semantic speech retrieval with a visually grounded model of untranscribed speech

2 code implementations 5 Oct 2017 Herman Kamper, Gregory Shakhnarovich, Karen Livescu

We introduce a newly collected data set of human semantic relevance judgements and an associated task, semantic speech retrieval, where the goal is to search for spoken utterances that are semantically relevant to a given text query.

Language Acquisition Retrieval

Training Deep Networks to be Spatially Sensitive

no code implementations ICCV 2017 Nicholas Kolkin, Gregory Shakhnarovich, Eli Shechtman

In many computer vision tasks, for example saliency prediction or semantic segmentation, the desired output is a foreground map that predicts pixels where some criterion is satisfied.

Saliency Prediction Semantic Segmentation

Visually grounded learning of keyword prediction from untranscribed speech

1 code implementation 23 Mar 2017 Herman Kamper, Shane Settle, Gregory Shakhnarovich, Karen Livescu

In this setting of images paired with untranscribed spoken captions, we consider whether computer vision systems can be used to obtain textual labels for the speech.

Language Acquisition TAG

Comprehension-guided referring expressions

no code implementations CVPR 2017 Ruotian Luo, Gregory Shakhnarovich

Second, we use the comprehension module in a generate-and-rerank pipeline, which chooses from candidate expressions generated by a model according to their performance on the comprehension task.

Referring Expression Referring expression generation

Diverse Sampling for Self-Supervised Learning of Semantic Segmentation

no code implementations 6 Dec 2016 Mohammadreza Mostajabi, Nicholas Kolkin, Gregory Shakhnarovich

We propose an approach for learning category-level semantic segmentation purely from image-level classification tags indicating presence of categories.

Classification General Classification +3

Examining the Impact of Blur on Recognition by Convolutional Networks

no code implementations 17 Nov 2016 Igor Vasiljevic, Ayan Chakrabarti, Gregory Shakhnarovich

We investigate the extent to which this degradation is due to the mismatch between training and input image statistics.

Lexicon-Free Fingerspelling Recognition from Video: Data, Models, and Signer Adaptation

no code implementations 26 Sep 2016 Taehwan Kim, Jonathan Keane, Weiran Wang, Hao Tang, Jason Riggle, Gregory Shakhnarovich, Diane Brentari, Karen Livescu

Recognizing fingerspelling is challenging for a number of reasons: It involves quick, small motions that are often highly coarticulated; it exhibits significant variation between signers; and there has been a dearth of continuous fingerspelling data collected.

FractalNet: Ultra-Deep Neural Networks without Residuals

4 code implementations 24 May 2016 Gustav Larsson, Michael Maire, Gregory Shakhnarovich

We introduce a design strategy for neural network macro-architecture based on self-similarity.

Image Classification

Learning Representations for Automatic Colorization

3 code implementations 22 Mar 2016 Gustav Larsson, Michael Maire, Gregory Shakhnarovich

This intermediate output can be used to automatically generate a color image, or further manipulated prior to image formation.

Colorization Image Colorization +1

Part Discovery from Partial Correspondence

no code implementations CVPR 2013 Subhransu Maji, Gregory Shakhnarovich

We study the problem of part discovery when partial correspondence between instances of a category is available.

object-detection Object Detection +1

Image Segmentation by Cascaded Region Agglomeration

no code implementations CVPR 2013 Zhile Ren, Gregory Shakhnarovich

We propose a hierarchical segmentation algorithm that starts with a very fine oversegmentation and gradually merges regions using a cascade of boundary classifiers.

Image Segmentation Segmentation +1

Sparse Coding for Learning Interpretable Spatio-Temporal Primitives

no code implementations NeurIPS 2010 Taehwan Kim, Gregory Shakhnarovich, Raquel Urtasun

Sparse coding has recently become a popular approach in computer vision to learn dictionaries of natural images.
