Search Results for author: Justin Johnson

This interaction field guides the sampling of an object-conditioned human motion diffusion model, so as to encourage plausible contacts and affordance semantics.

Motion Synthesis valid


Learning to Predict Scene-Level Implicit 3D from Posed RGBD Data

no code implementations • CVPR 2023 • Nilesh Kulkarni, Linyi Jin, Justin Johnson, David F. Fouhey

We introduce a method that can learn to predict scene-level implicit functions for 3D reconstruction from posed RGBD data.

3D Reconstruction


Scalable 3D Captioning with Pretrained Models

1 code implementation • NeurIPS 2023 • Tiange Luo, Chris Rockwell, Honglak Lee, Justin Johnson

We introduce Cap3D, an automatic approach for generating descriptive text for 3D objects.

Descriptive Image Captioning +2

180


Hyperbolic Image-Text Representations

1 code implementation • 18 Apr 2023 • Karan Desai, Maximilian Nickel, Tanmay Rajpurohit, Justin Johnson, Ramakrishna Vedantam

Visual and linguistic concepts naturally organize themselves in a hierarchy, where a textual concept "dog" entails all images that contain dogs.

Image Classification Retrieval +1

110


Text2Room: Extracting Textured 3D Meshes from 2D Text-to-Image Models

1 code implementation • ICCV 2023 • Lukas Höllein, Ang Cao, Andrew Owens, Justin Johnson, Matthias Nießner

We present Text2Room, a method for generating room-scale textured 3D meshes from a given text prompt as input.

Text to 3D

969


Learning Visual Representations via Language-Guided Sampling

1 code implementation • CVPR 2023 • Mohamed El Banani, Karan Desai, Justin Johnson

Our approach diverges from image-based contrastive learning by sampling view pairs using language similarity instead of hand-crafted augmentations or learned clusters.

Contrastive Learning Representation Learning

139


Text-To-4D Dynamic Scene Generation

no code implementations • 26 Jan 2023 • Uriel Singer, Shelly Sheynin, Adam Polyak, Oron Ashual, Iurii Makarov, Filippos Kokkinos, Naman Goyal, Andrea Vedaldi, Devi Parikh, Justin Johnson, Yaniv Taigman

We present MAV3D (Make-A-Video3D), a method for generating three-dimensional dynamic scenes from text descriptions.

Scene Generation


HexPlane: A Fast Representation for Dynamic Scenes

1 code implementation • CVPR 2023 • Ang Cao, Justin Johnson

HexPlane is a simple and effective solution for representing 4D volumes, and we hope it can broadly contribute to modeling spacetime for dynamic 3D scenes.

Novel View Synthesis

225


Multiview Compressive Coding for 3D Reconstruction

1 code implementation • CVPR 2023 • Chao-yuan Wu, Justin Johnson, Jitendra Malik, Christoph Feichtenhofer, Georgia Gkioxari

We introduce a simple framework that operates on 3D points of single objects or whole scenes coupled with category-agnostic large-scale training from diverse RGB-D videos.

Ranked #2 on Single-View 3D Reconstruction on Common Objects in 3D

3D Reconstruction Self-Supervised Learning +1

601


Neural Shape Compiler: A Unified Framework for Transforming between Text, Point Cloud, and Program

no code implementations • 25 Dec 2022 • Tiange Luo, Honglak Lee, Justin Johnson

On Text2Shape, ShapeGlot, ABO, Genre, and Program Synthetic datasets, Neural Shape Compiler shows strengths in $\textit{Text} \Longrightarrow \textit{Point Cloud}$, $\textit{Point Cloud} \Longrightarrow \textit{Text}$, $\textit{Point Cloud} \Longrightarrow \textit{Program}$, and Point Cloud Completion tasks.

Point Cloud Completion


Self-Supervised Correspondence Estimation via Multiview Registration

1 code implementation • 6 Dec 2022 • Mohamed El Banani, Ignacio Rocco, David Novotny, Andrea Vedaldi, Natalia Neverova, Justin Johnson, Benjamin Graham

To address this, we propose a self-supervised approach for correspondence estimation that learns from multiview consistency in short RGB-D video sequences.


RGB no more: Minimally-decoded JPEG Vision Transformers

1 code implementation • CVPR 2023 • Jeongsoo Park, Justin Johnson

However, these RGB images are commonly encoded in JPEG before saving to disk; decoding them imposes an unavoidable overhead for RGB networks.

Data Augmentation


The 8-Point Algorithm as an Inductive Bias for Relative Pose Prediction by ViTs

no code implementations • 18 Aug 2022 • Chris Rockwell, Justin Johnson, David F. Fouhey

We present a simple baseline for directly estimating the relative pose (rotation and translation, including scale) between two images.

Inductive Bias Pose Prediction +1


Omni3D: A Large Benchmark and Model for 3D Object Detection in the Wild

1 code implementation • CVPR 2023 • Garrick Brazil, Abhinav Kumar, Julian Straub, Nikhila Ravi, Justin Johnson, Georgia Gkioxari

In 3D, existing benchmarks are small in size and approaches specialize in few object categories and specific domains, e.g. urban driving scenes.

Ranked #8 on 3D Object Detection From Monocular Images on KITTI-360

3D Object Detection 3D Object Detection From Monocular Images +2

661


FWD: Real-time Novel View Synthesis with Forward Warping and Depth

1 code implementation • CVPR 2022 • Ang Cao, Chris Rockwell, Justin Johnson

Novel view synthesis (NVS) is a challenging task requiring systems to generate photorealistic images of scenes from new viewpoints, where both quality and speed are important for applications.

Novel View Synthesis


Learning 3D Object Shape and Layout without 3D Supervision

no code implementations • CVPR 2022 • Georgia Gkioxari, Nikhila Ravi, Justin Johnson

A 3D scene consists of a set of objects, each with a shape and a layout giving their position in space.

Object


What's Behind the Couch? Directed Ray Distance Functions (DRDF) for 3D Scene Reconstruction

no code implementations • 8 Dec 2021 • Nilesh Kulkarni, Justin Johnson, David F. Fouhey

We present an approach for full 3D scene reconstruction from a single unseen image.

3D Reconstruction 3D Scene Reconstruction


Recognizing Scenes from Novel Viewpoints

no code implementations • 2 Dec 2021 • Shengyi Qian, Alexander Kirillov, Nikhila Ravi, Devendra Singh Chaplot, Justin Johnson, David F. Fouhey, Georgia Gkioxari

Humans can perceive scenes in 3D from a handful of 2D views.

Scene Recognition


StyleMesh: Style Transfer for Indoor 3D Scene Reconstructions

1 code implementation • CVPR 2022 • Lukas Höllein, Justin Johnson, Matthias Nießner

Style transfer typically operates on 2D images, making stylization of a mesh challenging.

Style Transfer

130


RedCaps: web-curated image-text data created by the people, for the people

1 code implementation • 22 Nov 2021 • Karan Desai, Gaurav Kaul, Zubin Aysola, Justin Johnson

We introduce RedCaps -- a large-scale dataset of 12M image-text pairs collected from Reddit.


PixelSynth: Generating a 3D-Consistent Experience from a Single Image

1 code implementation • ICCV 2021 • Chris Rockwell, David F. Fouhey, Justin Johnson

Recent advancements in differentiable rendering and 3D reasoning have driven exciting results in novel view synthesis from a single image.

Novel View Synthesis

111


Inverting and Understanding Object Detectors

1 code implementation • 26 Jun 2021 • Ang Cao, Justin Johnson

As a core problem in computer vision, the performance of object detection has improved drastically in the past few years.

Object object-detection +1


Bootstrap Your Own Correspondences

no code implementations • ICCV 2021 • Mohamed El Banani, Justin Johnson

Our approach combines classic ideas from point cloud registration with more recent representation learning approaches.

Point Cloud Registration Representation Learning


Rethinking "Batch" in BatchNorm

1 code implementation • 17 May 2021 • Yuxin Wu, Justin Johnson

BatchNorm is a critical building block in modern convolutional neural networks.

28,608


UnsupervisedR&R: Unsupervised Point Cloud Registration via Differentiable Rendering

1 code implementation • CVPR 2021 • Mohamed El Banani, Luya Gao, Justin Johnson

Aligning partial views of a scene into a single whole is essential to understanding one's environment and is a key component of numerous robotics tasks such as SLAM and SfM.

Point Cloud Registration

134


CASTing Your Model: Learning to Localize Improves Self-Supervised Representations

no code implementations • CVPR 2021 • Ramprasaath R. Selvaraju, Karan Desai, Justin Johnson, Nikhil Naik

Recent advances in self-supervised learning (SSL) have largely closed the gap with supervised ImageNet pretraining.

Self-Supervised Learning Visual Grounding


Accelerating 3D Deep Learning with PyTorch3D

3 code implementations • 16 Jul 2020 • Nikhila Ravi, Jeremy Reizenstein, David Novotny, Taylor Gordon, Wan-Yen Lo, Justin Johnson, Georgia Gkioxari

We address these challenges by introducing PyTorch3D, a library of modular, efficient, and differentiable operators for 3D deep learning.

Autonomous Vehicles

8,262


VirTex: Learning Visual Representations from Textual Annotations

3 code implementations • CVPR 2021 • Karan Desai, Justin Johnson

The de-facto approach to many vision tasks is to start from pretrained visual representations, typically learned via supervised training on ImageNet.

Ranked #1 on Object Detection on COCO test-dev (Hardware Burden metric)

General Classification Image Captioning +5

555


SynSin: End-to-end View Synthesis from a Single Image

3 code implementations • CVPR 2020 • Olivia Wiles, Georgia Gkioxari, Richard Szeliski, Justin Johnson

Single image view synthesis allows for the generation of new views of a scene given a single input image.

Novel View Synthesis

8,262


Temporal Reasoning via Audio Question Answering

1 code implementation • 21 Nov 2019 • Haytham M. Fayek, Justin Johnson

In this paper, we use the task of Audio Question Answering (AQA) to study the temporal reasoning abilities of machine learning models.

Ranked #1 on Audio Question Answering on DAQA

Audio Question Answering Question Answering +3


PHYRE: A New Benchmark for Physical Reasoning

2 code implementations • NeurIPS 2019 • Anton Bakhtin, Laurens van der Maaten, Justin Johnson, Laura Gustafson, Ross Girshick

The benchmark is designed to encourage the development of learning algorithms that are sample-efficient and generalize well across puzzles.

Ranked #3 on Visual Reasoning on PHYRE-1B-Within

Visual Reasoning

425


Mesh R-CNN

6 code implementations • ICCV 2019 • Georgia Gkioxari, Jitendra Malik, Justin Johnson

We propose a system that detects objects in real-world images and produces a triangle mesh giving the full 3D shape of each detected object.

Ranked #1 on 3D Shape Modeling on Pix3D S2

3D Shape Modeling

8,262


On Network Design Spaces for Visual Recognition

4 code implementations • ICCV 2019 • Ilija Radosavovic, Justin Johnson, Saining Xie, Wan-Yen Lo, Piotr Dollár

Compared to current methodologies of comparing point and curve estimates of model families, distribution estimates paint a more complete picture of the entire design landscape.

Neural Architecture Search

2,109


HiDDeN: Hiding Data With Deep Networks

6 code implementations • ECCV 2018 • Jiren Zhu, Russell Kaplan, Justin Johnson, Li Fei-Fei

We show that these encodings are competitive with existing data hiding algorithms, and further that they can be made robust to noise: our models learn to reconstruct hidden information in an encoded image despite the presence of Gaussian blurring, pixel-wise dropout, cropping, and JPEG compression.

292


Image Generation from Scene Graphs

4 code implementations • CVPR 2018 • Justin Johnson, Agrim Gupta, Li Fei-Fei

To overcome this limitation we propose a method for generating images from scene graphs, enabling explicitly reasoning about objects and their relationships.

Ranked #4 on Layout-to-Image Generation on Visual Genome 64x64

Image Generation from Scene Graphs Layout-to-Image Generation

1,286


DDRprog: A CLEVR Differentiable Dynamic Reasoning Programmer

no code implementations • ICLR 2018 • Joseph Suarez, Justin Johnson, Fei-Fei Li

We present a novel Dynamic Differentiable Reasoning (DDR) framework for jointly learning branching programs and the functions composing them; this resolves a significant nondifferentiability inhibiting recent dynamic architectures.

Ranked #8 on Visual Question Answering (VQA) on CLEVR

Question Answering Visual Question Answering


Social GAN: Socially Acceptable Trajectories with Generative Adversarial Networks

7 code implementations • CVPR 2018 • Agrim Gupta, Justin Johnson, Li Fei-Fei, Silvio Savarese, Alexandre Alahi

Understanding human motion behavior is critical for autonomous moving platforms (like self-driving cars and social robots) if they are to navigate human-centric environments.

Ranked #4 on Trajectory Prediction on ETH

Collision Avoidance Motion Forecasting +4

790


Inferring and Executing Programs for Visual Reasoning

5 code implementations • ICCV 2017 • Justin Johnson, Bharath Hariharan, Laurens van der Maaten, Judy Hoffman, Li Fei-Fei, C. Lawrence Zitnick, Ross Girshick

Existing methods for visual reasoning attempt to directly map inputs to outputs using black-box architectures without explicitly modeling the underlying reasoning processes.

Ranked #5 on Visual Question Answering (VQA) on CLEVR-Humans

Visual Question Answering (VQA) Visual Reasoning

792


Characterizing and Improving Stability in Neural Style Transfer

no code implementations • ICCV 2017 • Agrim Gupta, Justin Johnson, Alexandre Alahi, Li Fei-Fei

Recent progress in style transfer on images has focused on improving the quality of stylized images and speed of methods.

Optical Flow Estimation Style Transfer +1


CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning

5 code implementations • CVPR 2017 • Justin Johnson, Bharath Hariharan, Laurens van der Maaten, Li Fei-Fei, C. Lawrence Zitnick, Ross Girshick

When building artificial intelligence systems that can reason and answer questions about visual data, we need diagnostic tests to analyze our progress and discover shortcomings.

Question Answering Visual Question Answering +1

296


A Hierarchical Approach for Generating Descriptive Image Paragraphs

3 code implementations • CVPR 2017 • Jonathan Krause, Justin Johnson, Ranjay Krishna, Li Fei-Fei

Recent progress on image captioning has made it possible to generate novel sentences describing images in natural language, but compressing an image into a single sentence can describe visual content in only coarse detail.

Ranked #7 on Image Paragraph Captioning on Image Paragraph Captioning

Dense Captioning Descriptive +3


Perceptual Losses for Real-Time Style Transfer and Super-Resolution

80 code implementations • 27 Mar 2016 • Justin Johnson, Alexandre Alahi, Li Fei-Fei

We consider image transformation problems, where an input image is transformed into an output image.

Ranked #4 on Nuclear Segmentation on Cell17

Image Super-Resolution Nuclear Segmentation +2

11,839


Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations

1 code implementation • 23 Feb 2016 • Ranjay Krishna, Yuke Zhu, Oliver Groth, Justin Johnson, Kenji Hata, Joshua Kravitz, Stephanie Chen, Yannis Kalantidis, Li-Jia Li, David A. Shamma, Michael S. Bernstein, Fei-Fei Li

Despite progress in perceptual tasks such as image classification, computers still perform poorly on cognitive tasks such as image description and question answering.

Image Classification Question Answering

204


DenseCap: Fully Convolutional Localization Networks for Dense Captioning

1 code implementation • CVPR 2016 • Justin Johnson, Andrej Karpathy, Li Fei-Fei

We introduce the dense captioning task, which requires a computer vision system to both localize and describe salient regions in images in natural language.

Ranked #3 on Object Detection on Visual Genome

Dense Captioning Image Captioning +4

1,562


Love Thy Neighbors: Image Annotation by Exploiting Image Metadata

no code implementations • ICCV 2015 • Justin Johnson, Lamberto Ballan, Fei-Fei Li

Some images that are difficult to recognize on their own may become more clear in the context of a neighborhood of related images with similar social-network metadata.


Visualizing and Understanding Recurrent Networks

3 code implementations • 5 Jun 2015 • Andrej Karpathy, Justin Johnson, Li Fei-Fei

Recurrent Neural Networks (RNNs), and specifically a variant with Long Short-Term Memory (LSTM), are enjoying renewed interest as a result of successful applications in a wide range of machine learning problems that involve sequential data.


Image Retrieval Using Scene Graphs

no code implementations • CVPR 2015 • Justin Johnson, Ranjay Krishna, Michael Stark, Li-Jia Li, David Shamma, Michael Bernstein, Li Fei-Fei

We introduce a novel dataset of 5,000 human-generated scene graphs grounded to images and use this dataset to evaluate our method for image retrieval.

Image Retrieval Object Localization +1

