Search Results for author: William T. Freeman

Found 106 papers, 35 papers with code

A Compositional Model for Low-Dimensional Image Set Representation

no code implementations CVPR 2014 Hossein Mobahi, Ce Liu, William T. Freeman

Learning a low-dimensional representation of images is useful for various applications in graphics and computer vision.

Seeing the Arrow of Time

no code implementations CVPR 2014 Lyndsey C. Pickup, Zheng Pan, Donglai Wei, YiChang Shih, Chang-Shui Zhang, Andrew Zisserman, Bernhard Scholkopf, William T. Freeman

We explore whether we can observe Time's Arrow in a temporal sequence--is it possible to tell whether a video is running forwards or backwards?

General Classification Video Compression

Reflection Removal Using Ghosting Cues

no code implementations CVPR 2015 YiChang Shih, Dilip Krishnan, Fredo Durand, William T. Freeman

For single-pane windows, ghosting cues arise from shifted reflections on the two surfaces of the glass pane.

Reflection Removal

Visual Vibrometry: Estimating Material Properties From Small Motion in Video

no code implementations CVPR 2015 Abe Davis, Katherine L. Bouman, Justin G. Chen, Michael Rubinstein, Fredo Durand, William T. Freeman

The estimation of material properties is important for scene understanding, with many applications in vision, robotics, and structural engineering.

Scene Understanding

Learning Ordinal Relationships for Mid-Level Vision

no code implementations ICCV 2015 Daniel Zoran, Phillip Isola, Dilip Krishnan, William T. Freeman

We demonstrate that this framework works well on two important mid-level vision tasks: intrinsic image decomposition and depth from an RGB image.

Depth Estimation Intrinsic Image Decomposition

Computational Imaging for VLBI Image Reconstruction

no code implementations CVPR 2016 Katherine L. Bouman, Michael D. Johnson, Daniel Zoran, Vincent L. Fish, Sheperd S. Doeleman, William T. Freeman

Very long baseline interferometry (VLBI) is a technique for imaging celestial radio emissions by simultaneously observing a source from telescopes distributed across Earth.

Image Reconstruction

Single Image 3D Interpreter Network

1 code implementation 29 Apr 2016 Jiajun Wu, Tianfan Xue, Joseph J. Lim, Yuandong Tian, Joshua B. Tenenbaum, Antonio Torralba, William T. Freeman

In this work, we propose 3D INterpreter Network (3D-INN), an end-to-end framework which sequentially estimates 2D keypoint heatmaps and 3D object structure, trained on both real 2D-annotated images and synthetic 3D data.

Image Retrieval Keypoint Estimation +2

Ambient Sound Provides Supervision for Visual Learning

1 code implementation 25 Aug 2016 Andrew Owens, Jiajun Wu, Josh H. McDermott, William T. Freeman, Antonio Torralba

We show that, through this process, the network learns a representation that conveys information about objects and scenes.

Object Recognition

Synthesizing Normalized Faces from Facial Identity Features

1 code implementation CVPR 2017 Forrester Cole, David Belanger, Dilip Krishnan, Aaron Sarna, Inbar Mosseri, William T. Freeman

We present a method for synthesizing a frontal, neutral-expression image of a person's face given an input face photograph.

On the Effectiveness of Visible Watermarks

no code implementations CVPR 2017 Tali Dekel, Michael Rubinstein, Ce Liu, William T. Freeman

Since such an attack relies on the consistency of watermarks across an image collection, we explore and evaluate how it is affected by various types of inconsistencies in the watermark embedding that could potentially be used to make watermarking more secure.

Image Matting

Turning Corners Into Cameras: Principles and Methods

no code implementations ICCV 2017 Katherine L. Bouman, Vickie Ye, Adam B. Yedidia, Fredo Durand, Gregory W. Wornell, Antonio Torralba, William T. Freeman

We show that walls and other obstructions with edges can be exploited as naturally-occurring "cameras" that reveal the hidden scenes beyond them.

Reconstructing Video from Interferometric Measurements of Time-Varying Sources

1 code implementation 3 Nov 2017 Katherine L. Bouman, Michael D. Johnson, Adrian V. Dalca, Andrew A. Chael, Freek Roelofs, Sheperd S. Doeleman, William T. Freeman

Most recently, the Event Horizon Telescope (EHT) has extended VLBI to short millimeter wavelengths with a goal of achieving angular resolution sufficient for imaging the event horizons of nearby supermassive black holes.

Image Imputation Radio Interferometry

MarrNet: 3D Shape Reconstruction via 2.5D Sketches

no code implementations NeurIPS 2017 Jiajun Wu, Yifan Wang, Tianfan Xue, Xingyuan Sun, William T. Freeman, Joshua B. Tenenbaum

First, compared to full 3D shape, 2.5D sketches are much easier to be recovered from a 2D image; models that recover 2.5D sketches are also more likely to transfer from synthetic to real data.

3D Object Reconstruction From A Single Image 3D Reconstruction +3

Learning Sight from Sound: Ambient Sound Provides Supervision for Visual Learning

no code implementations 20 Dec 2017 Andrew Owens, Jiajun Wu, Josh H. McDermott, William T. Freeman, Antonio Torralba

The sound of crashing waves, the roar of fast-moving cars -- sound conveys important information about the objects in our surroundings.

3D Interpreter Networks for Viewer-Centered Wireframe Modeling

no code implementations 3 Apr 2018 Jiajun Wu, Tianfan Xue, Joseph J. Lim, Yuandong Tian, Joshua B. Tenenbaum, Antonio Torralba, William T. Freeman

3D-INN is trained on real images to estimate 2D keypoint heatmaps from an input image; it then predicts 3D object structure from heatmaps using knowledge learned from synthetic 3D shapes.

Image Retrieval Keypoint Estimation +2

Learning-based Video Motion Magnification

2 code implementations ECCV 2018 Tae-Hyun Oh, Ronnachai Jaroensri, Changil Kim, Mohamed Elgharib, Frédo Durand, William T. Freeman, Wojciech Matusik

We show that the learned filters achieve high-quality results on real videos, with fewer ringing artifacts and better noise characteristics than previous methods.

Motion Magnification

Looking to Listen at the Cocktail Party: A Speaker-Independent Audio-Visual Model for Speech Separation

5 code implementations 10 Apr 2018 Ariel Ephrat, Inbar Mosseri, Oran Lang, Tali Dekel, Kevin Wilson, Avinatan Hassidim, William T. Freeman, Michael Rubinstein

Solving this task using only audio as input is extremely challenging and does not provide an association of the separated speech signals with speakers in the video.

Speech Separation

Inferring Light Fields From Shadows

1 code implementation CVPR 2018 Manel Baradad, Vickie Ye, Adam B. Yedidia, Frédo Durand, William T. Freeman, Gregory W. Wornell, Antonio Torralba

We present a method for inferring a 4D light field of a hidden scene from 2D shadows cast by a known occluder on a diffuse wall.

Unsupervised Training for 3D Morphable Model Regression

2 code implementations CVPR 2018 Kyle Genova, Forrester Cole, Aaron Maschinot, Aaron Sarna, Daniel Vlasic, William T. Freeman

We train a regression network using these objectives, a set of unlabeled photographs, and the morphable model itself, and demonstrate state-of-the-art results.

Ranked #2 on 3D Face Reconstruction on Florence (Average 3D Error metric)

3D Face Reconstruction regression

Visual Dynamics: Stochastic Future Generation via Layered Cross Convolutional Networks

no code implementations 24 Jul 2018 Tianfan Xue, Jiajun Wu, Katherine L. Bouman, William T. Freeman

We study the problem of synthesizing a number of likely future frames from a single input image.

Medical Image Imputation from Image Collections

2 code implementations 17 Aug 2018 Adrian V. Dalca, Katherine L. Bouman, William T. Freeman, Natalia S. Rost, Mert R. Sabuncu, Polina Golland

We present an algorithm for creating high resolution anatomically plausible images consistent with acquired clinical brain MRI scans with large inter-slice spacing.

Anatomy Image Imputation +2

3D-Aware Scene Manipulation via Inverse Graphics

1 code implementation NeurIPS 2018 Shunyu Yao, Tzu Ming Harry Hsu, Jun-Yan Zhu, Jiajun Wu, Antonio Torralba, William T. Freeman, Joshua B. Tenenbaum

In this work, we propose 3D scene de-rendering networks (3D-SDN) to address the above issues by integrating disentangled representations for semantics, geometry, and appearance into a deep generative model.

Disentanglement Object

Seeing Tree Structure from Vibration

no code implementations ECCV 2018 Tianfan Xue, Jiajun Wu, Zhoutong Zhang, Chengkai Zhang, Joshua B. Tenenbaum, William T. Freeman

Humans recognize object structure from both their appearance and motion; often, motion helps to resolve ambiguities in object structure that arise when we observe object appearance only.

Bayesian Inference Object

Physical Primitive Decomposition

no code implementations ECCV 2018 Zhijian Liu, William T. Freeman, Joshua B. Tenenbaum, Jiajun Wu

As annotated data for object parts and physics are rare, we propose a novel formulation that learns physical primitives by explaining both an object's appearance and its behaviors in physical events.

Object

Learning Shape Priors for Single-View 3D Completion and Reconstruction

no code implementations ECCV 2018 Jiajun Wu, Chengkai Zhang, Xiuming Zhang, Zhoutong Zhang, William T. Freeman, Joshua B. Tenenbaum

The problem of single-view 3D shape completion or reconstruction is challenging, because among the many possible shapes that explain an observation, most are implausible and do not correspond to natural objects.

MoSculp: Interactive Visualization of Shape and Time

no code implementations 14 Sep 2018 Xiuming Zhang, Tali Dekel, Tianfan Xue, Andrew Owens, Qiurui He, Jiajun Wu, Stefanie Mueller, William T. Freeman

We present a system that allows users to visualize complex human motion via 3D motion sculptures, a representation that conveys the 3D structure swept by a human body as it moves through space.

ChainQueen: A Real-Time Differentiable Physical Simulator for Soft Robotics

no code implementations 2 Oct 2018 Yuanming Hu, Jian-Cheng Liu, Andrew Spielberg, Joshua B. Tenenbaum, William T. Freeman, Jiajun Wu, Daniela Rus, Wojciech Matusik

The underlying physical laws of deformable objects are more complex, and the resulting systems have orders of magnitude more degrees of freedom, so they are significantly more computationally expensive to simulate.

Motion Planning

Co-regularized Alignment for Unsupervised Domain Adaptation

no code implementations NeurIPS 2018 Abhishek Kumar, Prasanna Sattigeri, Kahini Wadhawan, Leonid Karlinsky, Rogerio Feris, William T. Freeman, Gregory Wornell

Deep neural networks, trained with large amounts of labeled data, can fail to generalize well when tested with examples from a target domain whose distribution differs from the training data distribution, referred to as the source domain.

Unsupervised Domain Adaptation

Learning to Reconstruct Shapes from Unseen Classes

no code implementations NeurIPS 2018 Xiuming Zhang, Zhoutong Zhang, Chengkai Zhang, Joshua B. Tenenbaum, William T. Freeman, Jiajun Wu

From a single image, humans are able to perceive the full 3D shape of an object by exploiting learned shape priors from everyday life.

3D Reconstruction

Learning to Infer and Execute 3D Shape Programs

no code implementations ICLR 2019 Yonglong Tian, Andrew Luo, Xingyuan Sun, Kevin Ellis, William T. Freeman, Joshua B. Tenenbaum, Jiajun Wu

Human perception of 3D shapes goes beyond reconstructing them as a set of points or a composition of geometric primitives: we also effortlessly understand higher-level shape structure such as the repetition and reflective symmetry of object parts.

On the Units of GANs (Extended Abstract)

no code implementations 29 Jan 2019 David Bau, Jun-Yan Zhu, Hendrik Strobelt, Bolei Zhou, Joshua B. Tenenbaum, William T. Freeman, Antonio Torralba

We quantify the causal effect of interpretable units by measuring the ability of interventions to control objects in the output.

Unsupervised Discovery of Parts, Structure, and Dynamics

no code implementations 12 Mar 2019 Zhenjia Xu, Zhijian Liu, Chen Sun, Kevin Murphy, William T. Freeman, Joshua B. Tenenbaum, Jiajun Wu

Humans easily recognize object parts and their hierarchical structure by watching how they move; they can then predict how each part moves in the future.

Object

Learning Shape Templates with Structured Implicit Functions

1 code implementation ICCV 2019 Kyle Genova, Forrester Cole, Daniel Vlasic, Aaron Sarna, William T. Freeman, Thomas Funkhouser

To allow for widely varying geometry and topology, we choose an implicit surface representation based on composition of local shape elements.

Semantic Segmentation

Learning to Describe Scenes with Programs

no code implementations ICLR 2019 Yunchao Liu, Zheng Wu, Daniel Ritchie, William T. Freeman, Joshua B. Tenenbaum, Jiajun Wu

We are able to understand the higher-level, abstract regularities within the scene such as symmetry and repetition.

Modeling Parts, Structure, and System Dynamics via Predictive Learning

no code implementations ICLR 2019 Zhenjia Xu, Zhijian Liu, Chen Sun, Kevin Murphy, William T. Freeman, Joshua B. Tenenbaum, Jiajun Wu

Humans easily recognize object parts and their hierarchical structure by watching how they move; they can then predict how each part moves in the future.

Object

Program-Guided Image Manipulators

no code implementations ICCV 2019 Jiayuan Mao, Xiuming Zhang, Yikai Li, William T. Freeman, Joshua B. Tenenbaum, Jiajun Wu

Humans are capable of building holistic representations for images at various levels, from local objects, to pairwise relations, to global structures.

Image Inpainting

Computational Mirrors: Blind Inverse Light Transport by Deep Matrix Factorization

1 code implementation NeurIPS 2019 Miika Aittala, Prafull Sharma, Lukas Murmann, Adam B. Yedidia, Gregory W. Wornell, William T. Freeman, Fredo Durand

We recover a video of the motion taking place in a hidden scene by observing changes in indirect illumination in a nearby uncalibrated visible region.

Semantic Pyramid for Image Generation

2 code implementations CVPR 2020 Assaf Shocher, Yossi Gandelsman, Inbar Mosseri, Michal Yarom, Michal Irani, William T. Freeman, Tali Dekel

We demonstrate that our model results in a versatile and flexible framework that can be used in various classic and novel image generation tasks.

General Classification Image Generation +2

SpeedNet: Learning the Speediness in Videos

1 code implementation CVPR 2020 Sagie Benaim, Ariel Ephrat, Oran Lang, Inbar Mosseri, William T. Freeman, Michael Rubinstein, Michal Irani, Tali Dekel

We demonstrate how those learned features can boost the performance of self-supervised action recognition, and can be used for video retrieval.

Binary Classification Retrieval +2

Deep Audio Priors Emerge From Harmonic Convolutional Networks

no code implementations ICLR 2020 Zhoutong Zhang, Yunyun Wang, Chuang Gan, Jiajun Wu, Joshua B. Tenenbaum, Antonio Torralba, William T. Freeman

We show that networks using Harmonic Convolution can reliably model audio priors and achieve high performance in unsupervised audio restoration tasks.

Two-Dimensional Non-Line-of-Sight Scene Estimation from a Single Edge Occluder

no code implementations 16 Jun 2020 Sheila W. Seidel, John Murray-Bruce, Yanting Ma, Christopher Yu, William T. Freeman, Vivek K Goyal

Previous work has leveraged the vertical nature of the edge to demonstrate 1D (in angle measured around the corner) reconstructions of moving and stationary hidden scenery from as little as a single photograph of the penumbra.

It Is Likely That Your Loss Should be a Likelihood

no code implementations 12 Jul 2020 Mark Hamilton, Evan Shelhamer, William T. Freeman

Joint optimization of these "likelihood parameters" with model parameters can adaptively tune the scales and shapes of losses in addition to the strength of regularization.

Outlier Detection

Neural Light Transport for Relighting and View Synthesis

1 code implementation 9 Aug 2020 Xiuming Zhang, Sean Fanello, Yun-Ta Tsai, Tiancheng Sun, Tianfan Xue, Rohit Pandey, Sergio Orts-Escolano, Philip Davidson, Christoph Rhemann, Paul Debevec, Jonathan T. Barron, Ravi Ramamoorthi, William T. Freeman

In particular, we show how to fuse previously seen observations of illuminants and views to synthesize a new image of the same scene under a desired lighting condition from a chosen viewpoint.

Layered Neural Rendering for Retiming People in Video

1 code implementation 16 Sep 2020 Erika Lu, Forrester Cole, Tali Dekel, Weidi Xie, Andrew Zisserman, David Salesin, William T. Freeman, Michael Rubinstein

We present a method for retiming people in an ordinary, natural video -- manipulating and editing the time in which different motions of individuals in the video occur.

Neural Rendering

Large-Scale Intelligent Microservices

1 code implementation 17 Sep 2020 Mark Hamilton, Nick Gonsalves, Christina Lee, Anand Raman, Brendan Walsh, Siddhartha Prasad, Dalitso Banda, Lucy Zhang, Mei Gao, Lei Zhang, William T. Freeman

Deploying Machine Learning (ML) algorithms within databases is a challenge due to the varied computational footprints of modern ML algorithms and the myriad of database technologies each with its own restrictive syntax.

Anomaly Detection

Multi-Plane Program Induction with 3D Box Priors

no code implementations NeurIPS 2020 Yikai Li, Jiayuan Mao, Xiuming Zhang, William T. Freeman, Joshua B. Tenenbaum, Noah Snavely, Jiajun Wu

We consider two important aspects in understanding and editing images: modeling regular, program-like texture or patterns in 2D planes, and 3D posing of these planes in the scene.

Program induction Program Synthesis

AutoFlow: Learning a Better Training Set for Optical Flow

1 code implementation CVPR 2021 Deqing Sun, Daniel Vlasic, Charles Herrmann, Varun Jampani, Michael Krainin, Huiwen Chang, Ramin Zabih, William T. Freeman, Ce Liu

Synthetic datasets play a critical role in pre-training CNN models for optical flow, but they are painstaking to generate and hard to adapt to new applications.

Optical Flow Estimation

Omnimatte: Associating Objects and Their Effects in Video

no code implementations CVPR 2021 Erika Lu, Forrester Cole, Tali Dekel, Andrew Zisserman, William T. Freeman, Michael Rubinstein

We show results on real-world videos containing interactions between different types of subjects (cars, animals, people) and complex effects, ranging from semi-transparent elements such as smoke and reflections, to fully opaque effects such as objects attached to the subject.

Light Field Networks: Neural Scene Representations with Single-Evaluation Rendering

1 code implementation NeurIPS 2021 Vincent Sitzmann, Semon Rezchikov, William T. Freeman, Joshua B. Tenenbaum, Fredo Durand

In this work, we propose a novel neural scene representation, Light Field Networks or LFNs, which represent both geometry and appearance of the underlying 3D scene in a 360-degree, four-dimensional light field parameterized via a neural implicit representation.

Meta-Learning Scene Understanding

Toward Automatic Interpretation of 3D Plots

no code implementations 14 Jun 2021 Laura E. Brandt, William T. Freeman

This paper explores the challenge of teaching a machine how to reverse-engineer the grid-marked surfaces used to represent data in 3D surface plots of two-variable functions.

Document Image Classification Shape from Texture

Consistent Depth of Moving Objects in Video

no code implementations 2 Aug 2021 Zhoutong Zhang, Forrester Cole, Richard Tucker, William T. Freeman, Tali Dekel

We present a method to estimate depth of a dynamic scene, containing arbitrary moving objects, from an ordinary video captured with a moving camera.

Depth Estimation Depth Prediction +2

What You Can Learn by Staring at a Blank Wall

no code implementations ICCV 2021 Prafull Sharma, Miika Aittala, Yoav Y. Schechner, Antonio Torralba, Gregory W. Wornell, William T. Freeman, Fredo Durand

We present a passive non-line-of-sight method that infers the number of people or activity of a person from the observation of a blank wall in an unknown room.

SLIDE: Single Image 3D Photography with Soft Layering and Depth-aware Inpainting

no code implementations ICCV 2021 Varun Jampani, Huiwen Chang, Kyle Sargent, Abhishek Kar, Richard Tucker, Michael Krainin, Dominik Kaeser, William T. Freeman, David Salesin, Brian Curless, Ce Liu

We present SLIDE, a modular and unified system for single image 3D photography that uses a simple yet effective soft layering strategy to better preserve appearance details in novel views.

Image Matting

HSPACE: Synthetic Parametric Humans Animated in Complex Environments

no code implementations 23 Dec 2021 Eduard Gabriel Bazavan, Andrei Zanfir, Mihai Zanfir, William T. Freeman, Rahul Sukthankar, Cristian Sminchisescu

We combine a hundred diverse individuals of varying ages, gender, proportions, and ethnicity, with hundreds of motions and scenes, as well as parametric variations in body shape (for a total of 1,600 different humans), in order to generate an initial dataset of over 1 million frames.

3D Human Pose Estimation Scene Understanding

MaskGIT: Masked Generative Image Transformer

6 code implementations CVPR 2022 Huiwen Chang, Han Zhang, Lu Jiang, Ce Liu, William T. Freeman

At inference time, the model begins with generating all tokens of an image simultaneously, and then refines the image iteratively conditioned on the previous generation.

Image Manipulation Image Outpainting +1

Disentangling Architecture and Training for Optical Flow

no code implementations 21 Mar 2022 Deqing Sun, Charles Herrmann, Fitsum Reda, Michael Rubinstein, David Fleet, William T. Freeman

Our newly trained RAFT achieves an Fl-all score of 4.31% on KITTI 2015, more accurate than all published optical flow methods at the time of writing.

Optical Flow Estimation

Neural Groundplans: Persistent Neural Scene Representations from a Single Image

no code implementations 22 Jul 2022 Prafull Sharma, Ayush Tewari, Yilun Du, Sergey Zakharov, Rares Ambrus, Adrien Gaidon, William T. Freeman, Fredo Durand, Joshua B. Tenenbaum, Vincent Sitzmann

We present a method to map 2D image observations of a scene to a persistent 3D scene representation, enabling novel view synthesis and disentangled representation of the movable and immovable components of the scene.

Disentanglement Instance Segmentation +4

Can Shadows Reveal Biometric Information?

no code implementations 21 Sep 2022 Safa C. Medin, Amir Weiss, Frédo Durand, William T. Freeman, Gregory W. Wornell

We transfer what we learn from the synthetic data to the real data using domain adaptation in a completely unsupervised way.

Domain Adaptation

Muse: Text-To-Image Generation via Masked Generative Transformers

4 code implementations 2 Jan 2023 Huiwen Chang, Han Zhang, Jarred Barber, AJ Maschinot, Jose Lezama, Lu Jiang, Ming-Hsuan Yang, Kevin Murphy, William T. Freeman, Michael Rubinstein, Yuanzhen Li, Dilip Krishnan

Compared to pixel-space diffusion models, such as Imagen and DALL-E 2, Muse is significantly more efficient due to the use of discrete tokens and requiring fewer sampling iterations; compared to autoregressive models, such as Parti, Muse is more efficient due to the use of parallel decoding.

Language Modelling Large Language Model +1

FastComposer: Tuning-Free Multi-Subject Image Generation with Localized Attention

1 code implementation 17 May 2023 Guangxuan Xiao, Tianwei Yin, William T. Freeman, Frédo Durand, Song Han

FastComposer proposes delayed subject conditioning in the denoising step to maintain both identity and editability in subject-driven image generation.

Denoising Diffusion Personalization Tuning Free +1

Materialistic: Selecting Similar Materials in Images

no code implementations 22 May 2023 Prafull Sharma, Julien Philip, Michaël Gharbi, William T. Freeman, Fredo Durand, Valentin Deschaintre

We present a method capable of selecting the regions of a photograph exhibiting the same material as an artist-chosen area.

Retrieval Semantic Segmentation

Background Prompting for Improved Object Depth

no code implementations 8 Jun 2023 Manel Baradad, Yuanzhen Li, Forrester Cole, Michael Rubinstein, Antonio Torralba, William T. Freeman, Varun Jampani

To infer object depth on a real image, we place the segmented object into the learned background prompt and run off-the-shelf depth networks.

Object

Large-Scale Automatic Audiobook Creation

no code implementations 7 Sep 2023 Brendan Walsh, Mark Hamilton, Greg Newby, Xi Wang, Serena Ruan, Sheng Zhao, Lei He, Shaofei Zhang, Eric Dettinger, William T. Freeman, Markus Weimer

In this work, we present a system that can automatically generate high-quality audiobooks from online e-books.

One-step Diffusion with Distribution Matching Distillation

no code implementations 30 Nov 2023 Tianwei Yin, Michaël Gharbi, Richard Zhang, Eli Shechtman, Fredo Durand, William T. Freeman, Taesung Park

We introduce Distribution Matching Distillation (DMD), a procedure to transform a diffusion model into a one-step image generator with minimal impact on image quality.

Alchemist: Parametric Control of Material Properties with Diffusion Models

no code implementations 5 Dec 2023 Prafull Sharma, Varun Jampani, Yuanzhen Li, Xuhui Jia, Dmitry Lagun, Fredo Durand, William T. Freeman, Mark Matthews

We propose a method to control material attributes of objects like roughness, metallic, albedo, and transparency in real images.

FeatUp: A Model-Agnostic Framework for Features at Any Resolution

1 code implementation 15 Mar 2024 Stephanie Fu, Mark Hamilton, Laura Brandt, Axel Feldman, Zhoutong Zhang, William T. Freeman

Deep features are a cornerstone of computer vision research, capturing image semantics and enabling the community to solve downstream tasks even in the zero- or few-shot regime.

Depth Estimation Depth Prediction +5
