1 code implementation • 15 Mar 2024 • Stephanie Fu, Mark Hamilton, Laura Brandt, Axel Feldman, Zhoutong Zhang, William T. Freeman
Deep features are a cornerstone of computer vision research, capturing image semantics and enabling the community to solve downstream tasks even in the zero- or few-shot regime.
Ranked #1 on Feature Upsampling on ImageNet
no code implementations • 15 Dec 2023 • Arjun Balasingam, Joseph Chandler, Chenning Li, Zhoutong Zhang, Hari Balakrishnan
Second, we analyze the sensitivity of trackers to visual artifacts in real scenes and motivate the idea of running assistive keypoint selectors alongside trackers.
no code implementations • 4 Dec 2023 • Yao-Chih Lee, Zhoutong Zhang, Kevin Blackburn-Matzen, Simon Niklaus, Jianming Zhang, Jia-Bin Huang, Feng Liu
Specifically, we build a global static scene model using an extended plane-based scene representation to synthesize temporally coherent novel video.
3 code implementations • ICLR 2022 • Mark Hamilton, Zhoutong Zhang, Bharath Hariharan, Noah Snavely, William T. Freeman
Unsupervised semantic segmentation aims to discover and localize semantically meaningful categories within image corpora without any form of annotation.
Ranked #4 on Unsupervised Semantic Segmentation on Potsdam-3
no code implementations • ICCV 2021 • Forrester Cole, Kyle Genova, Avneesh Sud, Daniel Vlasic, Zhoutong Zhang
We present a method for differentiable rendering of 3D surfaces that supports both explicit and implicit representations, provides derivatives at occlusion boundaries, and is fast and simple to implement.
no code implementations • 2 Aug 2021 • Zhoutong Zhang, Forrester Cole, Richard Tucker, William T. Freeman, Tali Dekel
We present a method to estimate depth of a dynamic scene, containing arbitrary moving objects, from an ordinary video captured with a moving camera.
1 code implementation • ICCV 2021 • Steven Liu, Xiuming Zhang, Zhoutong Zhang, Richard Zhang, Jun-Yan Zhu, Bryan Russell
In this paper, we explore enabling user editing of a category-level NeRF (also known as a conditional radiance field) trained on a shape category.
Ranked #1 on Novel View Synthesis on PhotoShape
1 code implementation • CVPR 2020 • Andrew Luo, Zhoutong Zhang, Jiajun Wu, Joshua B. Tenenbaum
Experiments suggest that our model achieves higher accuracy and diversity in conditional scene synthesis and allows exemplar-based scene generation from various input forms.
no code implementations • ICLR 2020 • Zhoutong Zhang, Yunyun Wang, Chuang Gan, Jiajun Wu, Joshua B. Tenenbaum, Antonio Torralba, William T. Freeman
We show that networks using Harmonic Convolution can reliably model audio priors and achieve high performance in unsupervised audio restoration tasks.
no code implementations • NeurIPS 2018 • Xiuming Zhang, Zhoutong Zhang, Chengkai Zhang, Joshua B. Tenenbaum, William T. Freeman, Jiajun Wu
From a single image, humans are able to perceive the full 3D shape of an object by exploiting learned shape priors from everyday life.
1 code implementation • NeurIPS 2018 • Jun-Yan Zhu, Zhoutong Zhang, Chengkai Zhang, Jiajun Wu, Antonio Torralba, Joshua B. Tenenbaum, William T. Freeman
Our model first learns to synthesize 3D shapes that are indistinguishable from real shapes.
1 code implementation • NeurIPS 2018 • Jun-Yan Zhu, Zhoutong Zhang, Chengkai Zhang, Jiajun Wu, Antonio Torralba, Josh Tenenbaum, Bill Freeman
The VON not only generates images that are more realistic than the state-of-the-art 2D image synthesis methods but also enables many 3D operations such as changing the viewpoint of a generated image, shape and texture editing, linear interpolation in texture and shape space, and transferring appearance across different objects and viewpoints.
no code implementations • ECCV 2018 • Jiajun Wu, Chengkai Zhang, Xiuming Zhang, Zhoutong Zhang, William T. Freeman, Joshua B. Tenenbaum
The problem of single-view 3D shape completion or reconstruction is challenging, because among the many possible shapes that explain an observation, most are implausible and do not correspond to natural objects.
no code implementations • ECCV 2018 • Tianfan Xue, Jiajun Wu, Zhoutong Zhang, Chengkai Zhang, Joshua B. Tenenbaum, William T. Freeman
Humans recognize object structure from both their appearance and motion; often, motion helps to resolve ambiguities in object structure that arise when we observe object appearance only.
1 code implementation • CVPR 2018 • Xingyuan Sun, Jiajun Wu, Xiuming Zhang, Zhoutong Zhang, Chengkai Zhang, Tianfan Xue, Joshua B. Tenenbaum, William T. Freeman
We study 3D shape modeling from a single image and make contributions to it in three aspects.
Ranked #1 on 3D Shape Classification on Pix3D
no code implementations • NeurIPS 2017 • Zhoutong Zhang, Qiujia Li, Zhengjia Huang, Jiajun Wu, Josh Tenenbaum, Bill Freeman
Hearing an object falling onto the ground, humans can recover rich information including its rough shape, material, and falling height.
no code implementations • ICCV 2017 • Zhoutong Zhang, Jiajun Wu, Qiujia Li, Zhengjia Huang, James Traer, Josh H. McDermott, Joshua B. Tenenbaum, William T. Freeman
Humans infer rich knowledge of objects from both auditory and visual cues.
no code implementations • CVPR 2015 • Zhoutong Zhang, Yebin Liu, Qionghai Dai
We first introduce a disparity assisted phase based synthesis (DAPS) strategy that can integrate disparity information into the phase term of a reference image to warp it to its close neighbor views.