Search Results for author: Zhoutong Zhang

Found 18 papers, 7 papers with code

FeatUp: A Model-Agnostic Framework for Features at Any Resolution

1 code implementation • 15 Mar 2024 • Stephanie Fu, Mark Hamilton, Laura Brandt, Axel Feldman, Zhoutong Zhang, William T. Freeman

Deep features are a cornerstone of computer vision research, capturing image semantics and enabling the community to solve downstream tasks even in the zero- or few-shot regime.

Ranked #1 on Feature Upsampling on ImageNet

Depth Estimation, Depth Prediction +5

1,006

DriveTrack: A Benchmark for Long-Range Point Tracking in Real-World Videos

no code implementations • 15 Dec 2023 • Arjun Balasingam, Joseph Chandler, Chenning Li, Zhoutong Zhang, Hari Balakrishnan

Second, we analyze the sensitivity of trackers to visual artifacts in real scenes and motivate the idea of running assistive keypoint selectors alongside trackers.

Autonomous Driving, Point Tracking

Fast View Synthesis of Casual Videos

no code implementations • 4 Dec 2023 • Yao-Chih Lee, Zhoutong Zhang, Kevin Blackburn-Matzen, Simon Niklaus, Jianming Zhang, Jia-Bin Huang, Feng Liu

Specifically, we build a global static scene model using an extended plane-based scene representation to synthesize temporally coherent novel video.

Novel View Synthesis

Unsupervised Semantic Segmentation by Distilling Feature Correspondences

3 code implementations • ICLR 2022 • Mark Hamilton, Zhoutong Zhang, Bharath Hariharan, Noah Snavely, William T. Freeman

Unsupervised semantic segmentation aims to discover and localize semantically meaningful categories within image corpora without any form of annotation.

Ranked #4 on Unsupervised Semantic Segmentation on Potsdam-3

Unsupervised Semantic Segmentation

686

Differentiable Surface Rendering via Non-Differentiable Sampling

no code implementations • ICCV 2021 • Forrester Cole, Kyle Genova, Avneesh Sud, Daniel Vlasic, Zhoutong Zhang

We present a method for differentiable rendering of 3D surfaces that supports both explicit and implicit representations, provides derivatives at occlusion boundaries, and is fast and simple to implement.

Inverse Rendering

Consistent Depth of Moving Objects in Video

no code implementations • 2 Aug 2021 • Zhoutong Zhang, Forrester Cole, Richard Tucker, William T. Freeman, Tali Dekel

We present a method to estimate depth of a dynamic scene, containing arbitrary moving objects, from an ordinary video captured with a moving camera.

Depth Estimation, Depth Prediction +2

Editing Conditional Radiance Fields

1 code implementation • ICCV 2021 • Steven Liu, Xiuming Zhang, Zhoutong Zhang, Richard Zhang, Jun-Yan Zhu, Bryan Russell

In this paper, we explore enabling user editing of a category-level NeRF - also known as a conditional radiance field - trained on a shape category.

Ranked #1 on Novel View Synthesis on PhotoShape

Novel View Synthesis

253

End-to-End Optimization of Scene Layout

1 code implementation • CVPR 2020 • Andrew Luo, Zhoutong Zhang, Jiajun Wu, Joshua B. Tenenbaum

Experiments suggest that our model achieves higher accuracy and diversity in conditional scene synthesis and allows exemplar-based scene generation from various input forms.

Indoor Scene Reconstruction, Indoor Scene Synthesis +2

Deep Audio Priors Emerge From Harmonic Convolutional Networks

no code implementations • ICLR 2020 • Zhoutong Zhang, Yunyun Wang, Chuang Gan, Jiajun Wu, Joshua B. Tenenbaum, Antonio Torralba, William T. Freeman

We show that networks using Harmonic Convolution can reliably model audio priors and achieve high performance in unsupervised audio restoration tasks.

Learning to Reconstruct Shapes from Unseen Classes

no code implementations • NeurIPS 2018 • Xiuming Zhang, Zhoutong Zhang, Chengkai Zhang, Joshua B. Tenenbaum, William T. Freeman, Jiajun Wu

From a single image, humans are able to perceive the full 3D shape of an object by exploiting learned shape priors from everyday life.

3D Reconstruction

Visual Object Networks: Image Generation with Disentangled 3D Representation

1 code implementation • NeurIPS 2018 • Jun-Yan Zhu, Zhoutong Zhang, Chengkai Zhang, Jiajun Wu, Antonio Torralba, Joshua B. Tenenbaum, William T. Freeman

Our model first learns to synthesize 3D shapes that are indistinguishable from real shapes.

Image Generation, Object

532

Visual Object Networks: Image Generation with Disentangled 3D Representations

1 code implementation • NeurIPS 2018 • Jun-Yan Zhu, Zhoutong Zhang, Chengkai Zhang, Jiajun Wu, Antonio Torralba, Josh Tenenbaum, Bill Freeman

The VON not only generates images that are more realistic than the state-of-the-art 2D image synthesis methods but also enables many 3D operations such as changing the viewpoint of a generated image, shape and texture editing, linear interpolation in texture and shape space, and transferring appearance across different objects and viewpoints.

Image Generation, Object

532

Learning Shape Priors for Single-View 3D Completion and Reconstruction

no code implementations • ECCV 2018 • Jiajun Wu, Chengkai Zhang, Xiuming Zhang, Zhoutong Zhang, William T. Freeman, Joshua B. Tenenbaum

The problem of single-view 3D shape completion or reconstruction is challenging, because among the many possible shapes that explain an observation, most are implausible and do not correspond to natural objects.

Seeing Tree Structure from Vibration

no code implementations • ECCV 2018 • Tianfan Xue, Jiajun Wu, Zhoutong Zhang, Chengkai Zhang, Joshua B. Tenenbaum, William T. Freeman

Humans recognize object structure from both their appearance and motion; often, motion helps to resolve ambiguities in object structure that arise when we observe object appearance only.

Bayesian Inference, Object

Pix3D: Dataset and Methods for Single-Image 3D Shape Modeling

1 code implementation • CVPR 2018 • Xingyuan Sun, Jiajun Wu, Xiuming Zhang, Zhoutong Zhang, Chengkai Zhang, Tianfan Xue, Joshua B. Tenenbaum, William T. Freeman

We study 3D shape modeling from a single image and make contributions to it in three aspects.

Ranked #1 on 3D Shape Classification on Pix3D

3D Reconstruction, 3D Shape Modeling +5

485

Shape and Material from Sound

no code implementations • NeurIPS 2017 • Zhoutong Zhang, Qiujia Li, Zhengjia Huang, Jiajun Wu, Josh Tenenbaum, Bill Freeman

Hearing an object falling onto the ground, humans can recover rich information including its rough shape, material, and falling height.

Object

Generative Modeling of Audible Shapes for Object Perception

no code implementations • ICCV 2017 • Zhoutong Zhang, Jiajun Wu, Qiujia Li, Zhengjia Huang, James Traer, Josh H. McDermott, Joshua B. Tenenbaum, William T. Freeman

Humans infer rich knowledge of objects from both auditory and visual cues.

Object

Light Field From Micro-Baseline Image Pair

no code implementations • CVPR 2015 • Zhoutong Zhang, Yebin Liu, Qionghai Dai

We first introduce a disparity assisted phase based synthesis (DAPS) strategy that can integrate disparity information into the phase term of a reference image to warp it to its close neighbor views.
