Search Results for author: Michael Rubinstein

Found 31 papers, 12 papers with code

Probing the 3D Awareness of Visual Foundation Models

1 code implementation • 12 Apr 2024 • Mohamed El Banani, Amit Raj, Kevis-Kokitsi Maninis, Abhishek Kar, Yuanzhen Li, Michael Rubinstein, Deqing Sun, Leonidas Guibas, Justin Johnson, Varun Jampani

Given that such models can classify, delineate, and localize objects in 2D, we ask whether they also represent the objects' 3D structure.

Idempotent Generative Network

1 code implementation • 2 Nov 2023 • Assaf Shocher, Amil Dravid, Yossi Gandelsman, Inbar Mosseri, Michael Rubinstein, Alexei A. Efros

We define the target manifold as the set of all instances that $f$ maps to themselves.
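Spelled out, a minimal statement of this fixed-point definition and the idempotence property it induces (using the paper's $f$; the phrasing below is a paraphrase, not the paper's exact formulation):

```latex
% Target manifold: the fixed points of f
\mathcal{M} = \{\, x \;:\; f(x) = x \,\}
% Idempotence: a second application of f changes nothing,
% i.e. for any input z (data or noise),
f(f(z)) = f(z)
```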

RealFill: Reference-Driven Generation for Authentic Image Completion

no code implementations • 28 Sep 2023 • Luming Tang, Nataniel Ruiz, Qinghao Chu, Yuanzhen Li, Aleksander Holynski, David E. Jacobs, Bharath Hariharan, Yael Pritch, Neal Wadhwa, Kfir Aberman, Michael Rubinstein

Once personalized, RealFill is able to complete a target image with visually compelling contents that are faithful to the original scene.

HyperDreamBooth: HyperNetworks for Fast Personalization of Text-to-Image Models

2 code implementations • 13 Jul 2023 • Nataniel Ruiz, Yuanzhen Li, Varun Jampani, Wei Wei, Tingbo Hou, Yael Pritch, Neal Wadhwa, Michael Rubinstein, Kfir Aberman

By composing these weights into the diffusion model, coupled with fast finetuning, HyperDreamBooth can generate a person's face in various contexts and styles, with high subject details while also preserving the model's crucial knowledge of diverse styles and semantic modifications.
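The "composing these weights into the diffusion model" step can be sketched as folding a predicted low-rank update into a base weight matrix. This is a hypothetical LoRA-style sketch under that assumption — the function name, shapes, and rank-decomposed form are illustrative, not the paper's code:

```python
import numpy as np

def compose_personalized_weights(base_w, delta_a, delta_b, scale=1.0):
    """Fold a hypernetwork-predicted low-rank weight delta into a base
    diffusion-model layer (LoRA-style sketch; names are assumptions).

    base_w:  (out_dim, in_dim) base layer weights
    delta_a: (out_dim, r) and delta_b: (r, in_dim) rank-r update factors
    """
    return base_w + scale * (delta_a @ delta_b)
```

Fast finetuning would then refine the predicted factors for the specific subject rather than training the full model.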

Diffusion Personalization, Tuning Free

Background Prompting for Improved Object Depth

no code implementations • 8 Jun 2023 • Manel Baradad, Yuanzhen Li, Forrester Cole, Michael Rubinstein, Antonio Torralba, William T. Freeman, Varun Jampani

To infer object depth on a real image, we place the segmented object into the learned background prompt and run off-the-shelf depth networks.
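The compositing step described here reduces to masked blending: real pixels inside the object mask, learned background-prompt pixels everywhere else. A minimal sketch under that reading (function name and array conventions are assumptions; the depth network itself is not shown):

```python
import numpy as np

def place_on_background_prompt(object_rgb, object_mask, background_prompt):
    """Paste a segmented object onto a learned background prompt image.

    object_rgb, background_prompt: (H, W, 3) arrays
    object_mask: (H, W) array in {0, 1}
    The returned composite would then be fed to an off-the-shelf depth network.
    """
    m = object_mask.astype(object_rgb.dtype)[..., None]  # (H, W) -> (H, W, 1)
    return m * object_rgb + (1.0 - m) * background_prompt
```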

Object

Muse: Text-To-Image Generation via Masked Generative Transformers

4 code implementations • 2 Jan 2023 • Huiwen Chang, Han Zhang, Jarred Barber, AJ Maschinot, Jose Lezama, Lu Jiang, Ming-Hsuan Yang, Kevin Murphy, William T. Freeman, Michael Rubinstein, Yuanzhen Li, Dilip Krishnan

Compared to pixel-space diffusion models, such as Imagen and DALL-E 2, Muse is significantly more efficient due to the use of discrete tokens and requiring fewer sampling iterations; compared to autoregressive models, such as Parti, Muse is more efficient due to the use of parallel decoding.
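The parallel-decoding idea can be sketched as iterative unmasking of discrete tokens (MaskGIT-style, which this family of models builds on): every masked position is proposed at once, only the most confident proposals are kept, and the rest are re-masked. The `predict_fn` interface, linear schedule, and sizes below are toy assumptions, not the paper's implementation:

```python
import numpy as np

MASK = -1  # sentinel for a masked token position

def parallel_decode(predict_fn, seq_len, num_steps=4):
    """Fill a fully-masked token sequence in a few parallel passes.

    `predict_fn(tokens)` is assumed to return per-position probabilities of
    shape (seq_len, vocab_size). Decoding takes `num_steps` model calls
    instead of `seq_len` autoregressive steps.
    """
    tokens = np.full(seq_len, MASK)
    for step in range(1, num_steps + 1):
        probs = predict_fn(tokens)
        proposals = probs.argmax(axis=-1)   # best token per position
        confidence = probs.max(axis=-1)
        masked = np.flatnonzero(tokens == MASK)
        if masked.size == 0:
            break
        # Linear unmasking schedule: after pass `step`, a step/num_steps
        # fraction of the sequence should be decoded.
        target_decoded = int(np.ceil(seq_len * step / num_steps))
        n_unmask = max(1, target_decoded - (seq_len - masked.size))
        keep = masked[np.argsort(-confidence[masked])[:n_unmask]]
        tokens[keep] = proposals[keep]
    return tokens
```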

Ranked #1 on Text-to-Image Generation on MS-COCO (FID metric)

Language Modelling, Large Language Model +1

Hi-LASSIE: High-Fidelity Articulated Shape and Skeleton Discovery from Sparse Image Ensemble

1 code implementation • CVPR 2023 • Chun-Han Yao, Wei-Chih Hung, Yuanzhen Li, Michael Rubinstein, Ming-Hsuan Yang, Varun Jampani

Automatically estimating 3D skeleton, shape, camera viewpoints, and part articulation from sparse in-the-wild image ensembles is a severely under-constrained and challenging problem.

SCOOP: Self-Supervised Correspondence and Optimization-Based Scene Flow

1 code implementation • CVPR 2023 • Itai Lang, Dror Aiger, Forrester Cole, Shai Avidan, Michael Rubinstein

Scene flow estimation is a long-standing problem in computer vision, where the goal is to find the 3D motion of a scene from its consecutive observations.

Regression, Scene Flow Estimation

DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation

10 code implementations • CVPR 2023 • Nataniel Ruiz, Yuanzhen Li, Varun Jampani, Yael Pritch, Michael Rubinstein, Kfir Aberman

Once the subject is embedded in the output domain of the model, the unique identifier can be used to synthesize novel photorealistic images of the subject contextualized in different scenes.
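In practice, the "unique identifier" is a rare token paired with the subject's class noun, and recontextualization is just prompt variation. An illustrative sketch (the token "sks" is a convention common in open-source implementations, used here as an assumption, and the prompts are invented examples):

```python
# Bind the fine-tuned subject via a rare-token identifier plus its class noun,
# then vary only the surrounding context in each prompt.
identifier, subject_class = "sks", "dog"
prompts = [
    f"a photo of {identifier} {subject_class} in the Acropolis",
    f"a {identifier} {subject_class} sleeping on a red blanket",
    f"an oil painting of {identifier} {subject_class}",
]
```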

Diffusion Personalization, Image Generation

LASSIE: Learning Articulated Shapes from Sparse Image Ensemble via 3D Part Discovery

no code implementations • 7 Jul 2022 • Chun-Han Yao, Wei-Chih Hung, Yuanzhen Li, Michael Rubinstein, Ming-Hsuan Yang, Varun Jampani

In this work, we propose a practical problem setting to estimate 3D pose and shape of animals given only a few (10-30) in-the-wild images of a particular animal species (say, horse).

Disentangling Architecture and Training for Optical Flow

no code implementations • 21 Mar 2022 • Deqing Sun, Charles Herrmann, Fitsum Reda, Michael Rubinstein, David Fleet, William T. Freeman

Our newly trained RAFT achieves an Fl-all score of 4.31% on KITTI 2015, more accurate than all published optical flow methods at the time of writing.

Optical Flow Estimation

Deep Saliency Prior for Reducing Visual Distraction

no code implementations • CVPR 2022 • Kfir Aberman, Junfeng He, Yossi Gandelsman, Inbar Mosseri, David E. Jacobs, Kai Kohlhoff, Yael Pritch, Michael Rubinstein

Using only a model that was trained to predict where people look at images, and no additional training data, we can produce a range of powerful editing effects for reducing distraction in images.

Omnimatte: Associating Objects and Their Effects in Video

no code implementations • CVPR 2021 • Erika Lu, Forrester Cole, Tali Dekel, Andrew Zisserman, William T. Freeman, Michael Rubinstein

We show results on real-world videos containing interactions between different types of subjects (cars, animals, people) and complex effects, ranging from semi-transparent elements such as smoke and reflections, to fully opaque effects such as objects attached to the subject.

Layered Neural Rendering for Retiming People in Video

1 code implementation • 16 Sep 2020 • Erika Lu, Forrester Cole, Tali Dekel, Weidi Xie, Andrew Zisserman, David Salesin, William T. Freeman, Michael Rubinstein

We present a method for retiming people in an ordinary, natural video -- manipulating and editing the time in which different motions of individuals in the video occur.

Neural Rendering

SpeedNet: Learning the Speediness in Videos

1 code implementation • CVPR 2020 • Sagie Benaim, Ariel Ephrat, Oran Lang, Inbar Mosseri, William T. Freeman, Michael Rubinstein, Michal Irani, Tali Dekel

We demonstrate how those learned features can boost the performance of self-supervised action recognition, and can be used for video retrieval.

Binary Classification, Retrieval +2

Looking to Listen at the Cocktail Party: A Speaker-Independent Audio-Visual Model for Speech Separation

5 code implementations • 10 Apr 2018 • Ariel Ephrat, Inbar Mosseri, Oran Lang, Tali Dekel, Kevin Wilson, Avinatan Hassidim, William T. Freeman, Michael Rubinstein

Solving this task using only audio as input is extremely challenging and does not provide an association of the separated speech signals with speakers in the video.

Speech Separation

On the Effectiveness of Visible Watermarks

no code implementations • CVPR 2017 • Tali Dekel, Michael Rubinstein, Ce Liu, William T. Freeman

Since such an attack relies on the consistency of watermarks across the image collection, we explore and evaluate how it is affected by various types of inconsistencies in the watermark embedding that could potentially be used to make watermarking more secure.

Image Matting

Visual Vibrometry: Estimating Material Properties From Small Motion in Video

no code implementations • CVPR 2015 • Abe Davis, Katherine L. Bouman, Justin G. Chen, Michael Rubinstein, Fredo Durand, William T. Freeman

The estimation of material properties is important for scene understanding, with many applications in vision, robotics, and structural engineering.

Scene Understanding

Unsupervised Joint Object Discovery and Segmentation in Internet Images

no code implementations • CVPR 2013 • Michael Rubinstein, Armand Joulin, Johannes Kopf, Ce Liu

In contrast to previous co-segmentation methods, our algorithm performs well even in the presence of significant amounts of noise images (images not containing a common object), as typical for datasets collected from Internet search.

Object, Object Discovery +1
