Search Results for author: David F. Fouhey

Found 32 papers, 13 papers with code

LLM-Grounder: Open-Vocabulary 3D Visual Grounding with Large Language Model as an Agent

1 code implementation 21 Sep 2023 Jianing Yang, Xuweiyi Chen, Shengyi Qian, Nikhil Madaan, Madhavan Iyengar, David F. Fouhey, Joyce Chai

While existing approaches often rely on extensive labeled data or exhibit limitations in handling complex language queries, we propose LLM-Grounder, a novel zero-shot, open-vocabulary, Large Language Model (LLM)-based 3D visual grounding pipeline.

Language Modelling, Large Language Model +3

Learning to Predict Scene-Level Implicit 3D from Posed RGBD Data

no code implementations CVPR 2023 Nilesh Kulkarni, Linyi Jin, Justin Johnson, David F. Fouhey

We introduce a method that can learn to predict scene-level implicit functions for 3D reconstruction from posed RGBD data.

3D Reconstruction

Understanding 3D Object Interaction from a Single Image

1 code implementation ICCV 2023 Shengyi Qian, David F. Fouhey

Humans can easily understand a single image as depicting multiple potential objects permitting interaction.

Object

MOVES: Manipulated Objects in Video Enable Segmentation

no code implementations CVPR 2023 Richard E. L. Higgins, David F. Fouhey

We present a method that uses manipulation to learn to understand the objects people hold as well as hand-object contact.

Object, Optical Flow Estimation

Large-Scale Spatial Cross-Calibration of Hinode/SOT-SP and SDO/HMI

no code implementations 29 Sep 2022 David F. Fouhey, Richard E. L. Higgins, Spiro K. Antiochos, Graham Barnes, Marc L. DeRosa, J. Todd Hoeksema, K. D. Leka, Yang Liu, Peter W. Schuck, Tamas I. Gombosi

Second, analysis of over 12,000 scans shows that the pointing information is often incorrect by dozens of arcseconds with a strong bias.

The 8-Point Algorithm as an Inductive Bias for Relative Pose Prediction by ViTs

no code implementations 18 Aug 2022 Chris Rockwell, Justin Johnson, David F. Fouhey

We present a simple baseline for directly estimating the relative pose (rotation and translation, including scale) between two images.

Inductive Bias, Pose Prediction +1

PlaneFormers: From Sparse View Planes to 3D Reconstruction

1 code implementation 8 Aug 2022 Samir Agarwala, Linyi Jin, Chris Rockwell, David F. Fouhey

We present an approach for the planar surface reconstruction of a scene from images with limited overlap.

3D Reconstruction, Surface Reconstruction

Sound Localization by Self-Supervised Time Delay Estimation

1 code implementation 26 Apr 2022 Ziyang Chen, David F. Fouhey, Andrew Owens

We adapt the contrastive random walk of Jabri et al. to learn a cycle-consistent representation from unlabeled stereo sounds, resulting in a model that performs on par with supervised methods on "in the wild" internet recordings.

Contrastive Learning, Visual Tracking

Understanding 3D Object Articulation in Internet Videos

no code implementations CVPR 2022 Shengyi Qian, Linyi Jin, Chris Rockwell, Siyi Chen, David F. Fouhey

We propose to investigate detecting and characterizing the 3D planar articulation of objects from ordinary videos.

Object

SynthIA: A Synthetic Inversion Approximation for the Stokes Vector Fusing SDO and Hinode into a Virtual Observatory

no code implementations 27 Aug 2021 Richard E. L. Higgins, David F. Fouhey, Spiro K. Antiochos, Graham Barnes, Mark C. M. Cheung, J. Todd Hoeksema, K. D. Leka, Yang Liu, Peter W. Schuck, Tamas I. Gombosi

Both NASA's Solar Dynamics Observatory (SDO) and the JAXA/NASA Hinode mission include spectropolarimetric instruments designed to measure the photospheric magnetic field.

PixelSynth: Generating a 3D-Consistent Experience from a Single Image

1 code implementation ICCV 2021 Chris Rockwell, David F. Fouhey, Justin Johnson

Recent advancements in differentiable rendering and 3D reasoning have driven exciting results in novel view synthesis from a single image.

Novel View Synthesis

Collision Replay: What Does Bumping Into Things Tell You About Scene Geometry?

no code implementations 3 May 2021 Alexander Raistrick, Nilesh Kulkarni, David F. Fouhey

At the heart of our approach is the idea of collision replay, where we use examples of a collision to provide supervision for observations at a past frame.

Fast and Accurate Emulation of the SDO/HMI Stokes Inversion with Uncertainty Quantification

1 code implementation 31 Mar 2021 Richard E. L. Higgins, David F. Fouhey, Dichang Zhang, Spiro K. Antiochos, Graham Barnes, J. Todd Hoeksema, K. D. Leka, Yang Liu, Peter W. Schuck, Tamas I. Gombosi

The Helioseismic and Magnetic Imager (HMI) onboard NASA's Solar Dynamics Observatory (SDO) produces estimates of the photospheric magnetic field which are a critical input to many space weather modelling and forecasting systems.

Uncertainty Quantification

Planar Surface Reconstruction from Sparse Views

1 code implementation ICCV 2021 Linyi Jin, Shengyi Qian, Andrew Owens, David F. Fouhey

The paper studies planar surface reconstruction of indoor scenes from two views with unknown camera poses.

Surface Reconstruction

Full-Body Awareness from Partial Observations

no code implementations ECCV 2020 Chris Rockwell, David F. Fouhey

There has been great progress in human 3D mesh recovery and great interest in learning about the world from consumer video data.

Human Mesh Recovery

Associative3D: Volumetric Reconstruction from Sparse Views

1 code implementation ECCV 2020 Shengyi Qian, Linyi Jin, David F. Fouhey

This information is then jointly reasoned over to produce the most likely explanation of the scene.

3D Volumetric Reconstruction

Understanding Human Hands in Contact at Internet Scale

1 code implementation CVPR 2020 Dandan Shan, Jiaqi Geng, Michelle Shu, David F. Fouhey

Hands are the central means by which humans manipulate their world, and being able to reliably extract hand state information from Internet videos of humans using their hands has the potential to pave the way to systems that can learn from petabytes of video data.

Novel Object Viewpoint Estimation through Reconstruction Alignment

1 code implementation CVPR 2020 Mohamed El Banani, Jason J. Corso, David F. Fouhey

Our key insight is that although we do not have an explicit 3D model or a predefined canonical pose, we can still learn to estimate the object's shape in the viewer's frame and then use an image to provide our reference model or canonical pose.

Image-to-Image Translation, Object +1

Articulation-aware Canonical Surface Mapping

1 code implementation CVPR 2020 Nilesh Kulkarni, Abhinav Gupta, David F. Fouhey, Shubham Tulsiani

We tackle the tasks of: 1) predicting a Canonical Surface Mapping (CSM) that indicates the mapping from 2D pixels to corresponding points on a canonical template shape, and 2) inferring the articulation and pose of the template corresponding to the input image.

From Lifestyle Vlogs to Everyday Interactions

no code implementations CVPR 2018 David F. Fouhey, Wei-cheng Kuo, Alexei A. Efros, Jitendra Malik

A major stumbling block to progress in understanding basic human interactions, such as getting out of bed or opening a refrigerator, is lack of good training data.

Future prediction

From Images to 3D Shape Attributes

no code implementations 20 Dec 2016 David F. Fouhey, Abhinav Gupta, Andrew Zisserman

Our first objective is to infer these 3D shape attributes from a single image.

3D Shape Attributes

no code implementations CVPR 2016 David F. Fouhey, Abhinav Gupta, Andrew Zisserman

In this paper we investigate 3D attributes as a means to understand the shape of an object in a single image.

Object

Learning a Predictable and Generative Vector Representation for Objects

2 code implementations 29 Mar 2016 Rohit Girdhar, David F. Fouhey, Mikel Rodriguez, Abhinav Gupta

The network consists of two components: (a) an autoencoder that ensures the representation is generative; and (b) a convolutional network that ensures the representation is predictable.

Retrieval

In Defense of the Direct Perception of Affordances

no code implementations 5 May 2015 David F. Fouhey, Xiaolong Wang, Abhinav Gupta

The field of functional recognition or affordance estimation from images has seen a revival in recent years.

Designing Deep Networks for Surface Normal Estimation

no code implementations CVPR 2015 Xiaolong Wang, David F. Fouhey, Abhinav Gupta

We show that incorporating several constraints (man-made, Manhattan world) and meaningful intermediate representations (room layout, edge labels) in the architecture leads to state-of-the-art performance on surface normal estimation.

Scene Understanding, Surface Normal Estimation

Predicting Object Dynamics in Scenes

no code implementations CVPR 2014 David F. Fouhey, C. L. Zitnick

Given a static scene, a human can trivially enumerate the myriad of things that can happen next and characterize the relative likelihood of each.

Attribute, Object
