Search Results for author: Maneesh Agrawala

Found 20 papers, 6 papers with code

ScriptViz: A Visualization Tool to Aid Scriptwriting based on a Large Movie Database

no code implementations 4 Oct 2024 Anyi Rao, Jean-Peïc Chou, Maneesh Agrawala

Scriptwriters usually rely on their mental visualization to create a vivid story by using their imagination to see, feel, and experience the scenes they are writing.

A Unified Differentiable Boolean Operator with Fuzzy Logic

no code implementations 15 Jul 2024 Hsueh-Ti Derek Liu, Maneesh Agrawala, Cem Yuksel, Tim Omernick, Vinith Misra, Stefano Corazza, Morgan McGuire, Victor Zordan

Drawing inspiration from fuzzy logic, we present a unified boolean operator that outputs a continuous function and is differentiable with respect to operator types.

Block and Detail: Scaffolding Sketch-to-Image Generation

no code implementations 28 Feb 2024 Vishnu Sarukkai, Lu Yuan, Mia Tang, Maneesh Agrawala, Kayvon Fatahalian

Our tool lets users sketch blocking strokes to coarsely represent the placement and form of objects and detail strokes to refine their shape and silhouettes.

Blocking Dataset Generation

Transparent Image Layer Diffusion using Latent Transparency

3 code implementations 27 Feb 2024 Lvmin Zhang, Maneesh Agrawala

We show that latent transparency can be applied to different open source image generators, or be adapted to various conditional control systems to achieve applications like foreground/background-conditioned layer generation, joint layer generation, structural control of layer contents, etc.

FlashTex: Fast Relightable Mesh Texturing with LightControlNet

no code implementations 20 Feb 2024 Kangle Deng, Timothy Omernick, Alexander Weiss, Deva Ramanan, Jun-Yan Zhu, Tinghui Zhou, Maneesh Agrawala

We introduce LightControlNet, a new text-to-image model based on the ControlNet architecture, which allows the specification of the desired lighting as a conditioning image to the model.

SparseCtrl: Adding Sparse Controls to Text-to-Video Diffusion Models

1 code implementation 28 Nov 2023 Yuwei Guo, Ceyuan Yang, Anyi Rao, Maneesh Agrawala, Dahua Lin, Bo Dai

The development of text-to-video (T2V), i.e., generating videos with a given text prompt, has been significantly advanced in recent years.

Video Generation

Tree-Structured Shading Decomposition

no code implementations ICCV 2023 Chen Geng, Hong-Xing Yu, Sharon Zhang, Maneesh Agrawala, Jiajun Wu

The shade tree representation enables novice users who are unfamiliar with the physical shading process to edit object shading in an efficient and intuitive manner.


Automated Conversion of Music Videos into Lyric Videos

no code implementations 28 Aug 2023 Jiaju Ma, Anyi Rao, Li-Yi Wei, Rubaiat Habib Kazi, Hijung Valentina Shin, Maneesh Agrawala

Musicians and fans often produce lyric videos, a form of music videos that showcase the song's lyrics, for their favorite songs.

AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning

8 code implementations 10 Jul 2023 Yuwei Guo, Ceyuan Yang, Anyi Rao, Zhengyang Liang, Yaohui Wang, Yu Qiao, Maneesh Agrawala, Dahua Lin, Bo Dai

Once trained, the motion module can be inserted into a personalized T2I model to form a personalized animation generator.

Image Animation

Adding Conditional Control to Text-to-Image Diffusion Models

12 code implementations ICCV 2023 Lvmin Zhang, Anyi Rao, Maneesh Agrawala

ControlNet locks the production-ready large diffusion models, and reuses their deep and robust encoding layers pretrained with billions of images as a strong backbone to learn a diverse set of conditional controls.

Layout-to-Image Generation

AGQA 2.0: An Updated Benchmark for Compositional Spatio-Temporal Reasoning

no code implementations 12 Apr 2022 Madeleine Grunde-McLaughlin, Ranjay Krishna, Maneesh Agrawala

Prior benchmarks have analyzed models' answers to questions about videos in order to measure visual compositional reasoning.

Question Answering

Disentangled3D: Learning a 3D Generative Model with Disentangled Geometry and Appearance from Monocular Images

no code implementations CVPR 2022 Ayush Tewari, Mallikarjun B R, Xingang Pan, Ohad Fried, Maneesh Agrawala, Christian Theobalt

Our model can disentangle the geometry and appearance variations in the scene, i.e., we can independently sample from the geometry and appearance spaces of the generative model.


Iterative Text-based Editing of Talking-heads Using Neural Retargeting

no code implementations 21 Nov 2020 Xinwei Yao, Ohad Fried, Kayvon Fatahalian, Maneesh Agrawala

We present a text-based tool for editing talking-head video that enables an iterative editing workflow.

State of the Art on Neural Rendering

no code implementations 8 Apr 2020 Ayush Tewari, Ohad Fried, Justus Thies, Vincent Sitzmann, Stephen Lombardi, Kalyan Sunkavalli, Ricardo Martin-Brualla, Tomas Simon, Jason Saragih, Matthias Nießner, Rohit Pandey, Sean Fanello, Gordon Wetzstein, Jun-Yan Zhu, Christian Theobalt, Maneesh Agrawala, Eli Shechtman, Dan B. Goldman, Michael Zollhöfer

Neural rendering is a new and rapidly emerging field that combines generative machine learning techniques with physical knowledge from computer graphics, e.g., by the integration of differentiable rendering into network training.

BIG-bench Machine Learning Image Generation

Rekall: Specifying Video Events using Compositions of Spatiotemporal Labels

1 code implementation 7 Oct 2019 Daniel Y. Fu, Will Crichton, James Hong, Xinwei Yao, Haotian Zhang, Anh Truong, Avanika Narayan, Maneesh Agrawala, Christopher Ré, Kayvon Fatahalian

Many real-world video analysis applications require the ability to identify domain-specific events in video, such as interviews and commercials in TV news broadcasts, or action sequences in film.

Text-based Editing of Talking-head Video

1 code implementation 4 Jun 2019 Ohad Fried, Ayush Tewari, Michael Zollhöfer, Adam Finkelstein, Eli Shechtman, Dan B. Goldman, Kyle Genova, Zeyu Jin, Christian Theobalt, Maneesh Agrawala

To edit a video, the user only has to edit the transcript, and an optimization strategy then chooses segments of the input corpus as base material.

Face Model Sentence

SceneSuggest: Context-driven 3D Scene Design

no code implementations 28 Feb 2017 Manolis Savva, Angel X. Chang, Maneesh Agrawala

We present SceneSuggest: an interactive 3D scene design system providing context-driven suggestions for 3D model retrieval and placement.

Graphics Human-Computer Interaction

How do people explore virtual environments?

no code implementations 13 Dec 2016 Vincent Sitzmann, Ana Serrano, Amy Pavel, Maneesh Agrawala, Diego Gutierrez, Belen Masia, Gordon Wetzstein

Understanding how people explore immersive virtual environments is crucial for many applications, such as designing virtual reality (VR) content, developing new compression algorithms, or learning computational models of saliency or visual attention.

Video Synopsis
