no code implementations • 5 Mar 2024 • Chris Rockwell, Nilesh Kulkarni, Linyi Jin, Jeong Joon Park, Justin Johnson, David F. Fouhey
Estimating relative camera poses between images has been a central problem in computer vision.
1 code implementation • 21 Sep 2023 • Jianing Yang, Xuweiyi Chen, Shengyi Qian, Nikhil Madaan, Madhavan Iyengar, David F. Fouhey, Joyce Chai
While existing approaches often rely on extensive labeled data or exhibit limitations in handling complex language queries, we propose LLM-Grounder, a novel zero-shot, open-vocabulary, Large Language Model (LLM)-based 3D visual grounding pipeline.
no code implementations • CVPR 2023 • Nilesh Kulkarni, Linyi Jin, Justin Johnson, David F. Fouhey
We introduce a method that can learn to predict scene-level implicit functions for 3D reconstruction from posed RGBD data.
1 code implementation • ICCV 2023 • Shengyi Qian, David F. Fouhey
Humans can easily understand a single image as depicting multiple potential objects that permit interaction.
no code implementations • CVPR 2023 • Richard E. L. Higgins, David F. Fouhey
We present a method that uses manipulation to learn to understand the objects people hold, as well as hand-object contact.
1 code implementation • CVPR 2023 • Linyi Jin, Jianming Zhang, Yannick Hold-Geoffroy, Oliver Wang, Kevin Matzen, Matthew Sticha, David F. Fouhey
We propose perspective fields as a representation that models the local perspective properties of an image.
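Perspective fields describe local perspective properties, such as an up-vector and latitude, at every pixel. As a rough illustration of what the latitude component captures, the sketch below computes the angle between a pixel's viewing ray and the horizontal plane under a simple pinhole camera model; the function name, pitch-only rotation, and axis conventions are illustrative assumptions, not details from the paper.

```python
import math

def pixel_latitude(u, v, fx, fy, cx, cy, pitch=0.0):
    """Latitude of a pixel's viewing ray under a pinhole model (toy sketch).

    (fx, fy) are focal lengths, (cx, cy) the principal point, and `pitch`
    the camera's tilt in radians. Axis conventions here are assumptions.
    """
    # Back-project the pixel to a viewing ray in camera coordinates.
    x = (u - cx) / fx
    y = (v - cy) / fy
    # Rotate the ray by the camera pitch so "up" aligns with the world y-axis.
    cp, sp = math.cos(pitch), math.sin(pitch)
    ry = cp * y - sp * 1.0
    rz = sp * y + cp * 1.0
    norm = math.sqrt(x * x + ry * ry + rz * rz)
    # Latitude: signed angle between the ray and the world horizontal plane
    # (image y points down, hence the negation).
    return math.asin(-ry / norm)
```

At the principal point of an untilted camera the latitude is zero, and tilting the camera shifts that pixel's latitude by the pitch angle.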
no code implementations • 29 Sep 2022 • David F. Fouhey, Richard E. L. Higgins, Spiro K. Antiochos, Graham Barnes, Marc L. DeRosa, J. Todd Hoeksema, K. D. Leka, Yang Liu, Peter W. Schuck, Tamas I. Gombosi
Second, analysis of over 12,000 scans shows that the pointing information is often incorrect by dozens of arcseconds, with a strong bias.
no code implementations • 18 Aug 2022 • Chris Rockwell, Justin Johnson, David F. Fouhey
We present a simple baseline for directly estimating the relative pose (rotation and translation, including scale) between two images.
1 code implementation • 8 Aug 2022 • Samir Agarwala, Linyi Jin, Chris Rockwell, David F. Fouhey
We present an approach for the planar surface reconstruction of a scene from images with limited overlap.
1 code implementation • 26 Apr 2022 • Ziyang Chen, David F. Fouhey, Andrew Owens
We adapt the contrastive random walk of Jabri et al. to learn a cycle-consistent representation from unlabeled stereo sounds, resulting in a model that performs on par with supervised methods on "in the wild" internet recordings.
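The contrastive random walk trains representations by walking forward and then backward between two sets of nodes and asking the round trip to return to its start. A minimal sketch of that cycle-consistency objective on raw similarity matrices follows; the function names and the use of plain lists are illustrative, and a real implementation would learn the similarities end to end.

```python
import math

def softmax_rows(m):
    """Row-wise softmax: turn similarities into transition probabilities."""
    out = []
    for row in m:
        mx = max(row)
        exps = [math.exp(x - mx) for x in row]
        s = sum(exps)
        out.append([e / s for e in exps])
    return out

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def cycle_loss(sim_fwd, sim_bwd):
    """Cycle-consistency loss for a forward-then-backward random walk.

    `sim_fwd`/`sim_bwd` hold pairwise similarities between two node sets.
    A round trip should return each node to itself, so the target
    transition matrix is the identity; the loss is the mean negative
    log-likelihood of the diagonal.
    """
    p_fwd = softmax_rows(sim_fwd)
    p_bwd = softmax_rows(sim_bwd)
    p_cycle = matmul(p_fwd, p_bwd)
    n = len(p_cycle)
    return -sum(math.log(p_cycle[i][i]) for i in range(n)) / n
```

Sharply diagonal similarities (confident, consistent matches) drive the loss toward zero, while uninformative similarities leave it near log of the number of nodes.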
no code implementations • CVPR 2022 • Shengyi Qian, Linyi Jin, Chris Rockwell, Siyi Chen, David F. Fouhey
We propose to investigate detecting and characterizing the 3D planar articulation of objects from ordinary videos.
no code implementations • 8 Dec 2021 • Nilesh Kulkarni, Justin Johnson, David F. Fouhey
We present an approach for full 3D scene reconstruction from a single unseen image.
no code implementations • 2 Dec 2021 • Shengyi Qian, Alexander Kirillov, Nikhila Ravi, Devendra Singh Chaplot, Justin Johnson, David F. Fouhey, Georgia Gkioxari
Humans can perceive scenes in 3D from a handful of 2D views.
no code implementations • 27 Aug 2021 • Richard E. L. Higgins, David F. Fouhey, Spiro K. Antiochos, Graham Barnes, Mark C. M. Cheung, J. Todd Hoeksema, K. D. Leka, Yang Liu, Peter W. Schuck, Tamas I. Gombosi
Both NASA's Solar Dynamics Observatory (SDO) and the JAXA/NASA Hinode mission include spectropolarimetric instruments designed to measure the photospheric magnetic field.
1 code implementation • ICCV 2021 • Chris Rockwell, David F. Fouhey, Justin Johnson
Recent advancements in differentiable rendering and 3D reasoning have driven exciting results in novel view synthesis from a single image.
no code implementations • 3 May 2021 • Alexander Raistrick, Nilesh Kulkarni, David F. Fouhey
At the heart of our approach is the idea of collision replay, where we use examples of a collision to provide supervision for observations at a past frame.
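The idea of collision replay is that a single observed collision can supervise many earlier observations: each past frame can be labeled with how long until the collision occurred. The sketch below shows that labeling scheme in its simplest form; the function name, the fixed horizon, and the (observation, steps-to-collision) pair format are illustrative assumptions, not the paper's actual training setup.

```python
def collision_replay_labels(trajectory, collision_step, horizon=5):
    """Turn one collision event into supervision for preceding frames.

    `trajectory` is a sequence of observations and `collision_step` the
    index at which a collision occurred. Each of the `horizon` preceding
    frames (and the collision frame itself) is paired with its
    steps-until-collision, yielding training targets "for free".
    """
    labels = []
    start = max(0, collision_step - horizon)
    for t in range(start, collision_step + 1):
        labels.append((trajectory[t], collision_step - t))
    return labels
```

One bump thus produces a whole set of labeled examples, which is what makes the supervision cheap to collect.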
1 code implementation • 31 Mar 2021 • Richard E. L. Higgins, David F. Fouhey, Dichang Zhang, Spiro K. Antiochos, Graham Barnes, J. Todd Hoeksema, K. D. Leka, Yang Liu, Peter W. Schuck, Tamas I. Gombosi
The Helioseismic and Magnetic Imager (HMI) onboard NASA's Solar Dynamics Observatory (SDO) produces estimates of the photospheric magnetic field which are a critical input to many space weather modelling and forecasting systems.
1 code implementation • ICCV 2021 • Linyi Jin, Shengyi Qian, Andrew Owens, David F. Fouhey
The paper studies planar surface reconstruction of indoor scenes from two views with unknown camera poses.
no code implementations • ECCV 2020 • Chris Rockwell, David F. Fouhey
There has been great progress in human 3D mesh recovery and great interest in learning about the world from consumer video data.
1 code implementation • ECCV 2020 • Shengyi Qian, Linyi Jin, David F. Fouhey
This information is then jointly reasoned over to produce the most likely explanation of the scene.
1 code implementation • CVPR 2020 • Dandan Shan, Jiaqi Geng, Michelle Shu, David F. Fouhey
Hands are the central means by which humans manipulate their world, and being able to reliably extract hand state information from Internet videos of people using their hands has the potential to pave the way to systems that can learn from petabytes of video data.
1 code implementation • CVPR 2020 • Mohamed El Banani, Jason J. Corso, David F. Fouhey
Our key insight is that, although we do not have an explicit 3D model or a predefined canonical pose, we can still learn to estimate the object's shape in the viewer's frame and then use an image to provide a reference model, or canonical pose.
1 code implementation • CVPR 2020 • Nilesh Kulkarni, Abhinav Gupta, David F. Fouhey, Shubham Tulsiani
We tackle the tasks of: 1) predicting a Canonical Surface Mapping (CSM) that indicates the mapping from 2D pixels to corresponding points on a canonical template shape, and 2) inferring the articulation and pose of the template corresponding to the input image.
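A Canonical Surface Mapping can be checked geometrically: map a pixel to its 3D point on the canonical template, reproject that point with the predicted articulation and pose, and measure how far it lands from where it started. The sketch below expresses that cycle-consistency check; `csm` and `project` are stand-ins for learned components, and the function name is illustrative.

```python
def csm_cycle_error(pixels, csm, project):
    """Mean geometric cycle-consistency error for a CSM (toy sketch).

    `csm(u, v)` maps a 2D pixel to a 3D point on the canonical template;
    `project(X)` reprojects a canonical 3D point into the image using the
    predicted articulation and pose. A consistent prediction sends every
    pixel back to itself.
    """
    total = 0.0
    for (u, v) in pixels:
        X = csm(u, v)            # pixel -> canonical 3D point
        u2, v2 = project(X)      # canonical point -> image, via pose
        total += ((u - u2) ** 2 + (v - v2) ** 2) ** 0.5
    return total / len(pixels)
```

A loss of this shape lets the mapping and the pose supervise each other without 3D keypoint annotations.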
no code implementations • 11 Mar 2019 • Richard Galvez, David F. Fouhey, Meng Jin, Alexandre Szenicer, Andrés Muñoz-Jaramillo, Mark C. M. Cheung, Paul J. Wright, Monica G. Bobra, Yang Liu, James Mason, Rajat Thomas
In this paper we present a curated dataset from the NASA Solar Dynamics Observatory (SDO) mission in a format suitable for machine learning research.
no code implementations • CVPR 2018 • David F. Fouhey, Wei-cheng Kuo, Alexei A. Efros, Jitendra Malik
A major stumbling block to progress in understanding basic human interactions, such as getting out of bed or opening a refrigerator, is lack of good training data.
no code implementations • 20 Dec 2016 • David F. Fouhey, Abhinav Gupta, Andrew Zisserman
Our first objective is to infer these 3D shape attributes from a single image.
no code implementations • CVPR 2016 • David F. Fouhey, Abhinav Gupta, Andrew Zisserman
In this paper we investigate 3D attributes as a means to understand the shape of an object in a single image.
2 code implementations • 29 Mar 2016 • Rohit Girdhar, David F. Fouhey, Mikel Rodriguez, Abhinav Gupta
The network consists of two components: (a) an autoencoder that ensures the representation is generative; and (b) a convolutional network that ensures the representation is predictable.
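The two components pull the shared representation in complementary directions: the autoencoder forces it to carry enough information to reconstruct the 3D shape (generative), while the convolutional network forces it to be regressable from an image (predictable). A minimal sketch of a joint objective with that structure follows; the function names, arguments, and loss weights are illustrative assumptions, and the learned networks are passed in as callables.

```python
def joint_loss(encode_3d, decode_3d, predict_from_image, voxels, image,
               recon_weight=1.0, embed_weight=1.0):
    """Joint objective for a generative-and-predictable embedding (sketch).

    (a) Autoencoder branch: the embedding of the 3D shape must
        reconstruct the shape, keeping the representation generative.
    (b) Predictor branch: a network applied to the image must regress
        the same embedding, keeping the representation predictable.
    """
    z = encode_3d(voxels)
    recon = decode_3d(z)
    recon_err = sum((r - v) ** 2 for r, v in zip(recon, voxels)) / len(voxels)
    z_img = predict_from_image(image)
    embed_err = sum((a - b) ** 2 for a, b in zip(z, z_img)) / len(z)
    return recon_weight * recon_err + embed_weight * embed_err
```

Training both branches against one embedding is what lets an image be mapped into a space from which 3D shape can be decoded.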
no code implementations • ICCV 2015 • David F. Fouhey, Wajahat Hussain, Abhinav Gupta, Martial Hebert
Do we really need 3D labels in order to learn how to predict 3D?
no code implementations • 5 May 2015 • David F. Fouhey, Xiaolong Wang, Abhinav Gupta
The field of functional recognition or affordance estimation from images has seen a revival in recent years.
no code implementations • CVPR 2015 • Xiaolong Wang, David F. Fouhey, Abhinav Gupta
We show that incorporating several constraints (man-made, Manhattan world) and meaningful intermediate representations (room layout, edge labels) in the architecture leads to state-of-the-art performance on surface normal estimation.
no code implementations • CVPR 2014 • David F. Fouhey, C. L. Zitnick
Given a static scene, a human can trivially enumerate the myriad of things that can happen next and characterize the relative likelihood of each.