Search Results for author: Maneesh Agrawala

Found 18 papers, 6 papers with code

Block and Detail: Scaffolding Sketch-to-Image Generation

no code implementations • 28 Feb 2024 • Vishnu Sarukkai, Lu Yuan, Mia Tang, Maneesh Agrawala, Kayvon Fatahalian

Our tool lets users sketch blocking strokes to coarsely represent the placement and form of objects and detail strokes to refine their shape and silhouettes.

Blocking, Image Generation

Transparent Image Layer Diffusion using Latent Transparency

3 code implementations • 27 Feb 2024 • Lvmin Zhang, Maneesh Agrawala

We show that latent transparency can be applied to different open source image generators, or be adapted to various conditional control systems to achieve applications like foreground/background-conditioned layer generation, joint layer generation, structural control of layer contents, etc.

1,759

FlashTex: Fast Relightable Mesh Texturing with LightControlNet

no code implementations • 20 Feb 2024 • Kangle Deng, Timothy Omernick, Alexander Weiss, Deva Ramanan, Jun-Yan Zhu, Tinghui Zhou, Maneesh Agrawala

We introduce LightControlNet, a new text-to-image model based on the ControlNet architecture, which allows the specification of the desired lighting as a conditioning image to the model.

SparseCtrl: Adding Sparse Controls to Text-to-Video Diffusion Models

1 code implementation • 28 Nov 2023 • Yuwei Guo, Ceyuan Yang, Anyi Rao, Maneesh Agrawala, Dahua Lin, Bo Dai

The development of text-to-video (T2V), i.e., generating videos with a given text prompt, has been significantly advanced in recent years.

Video Generation

8,733

Tree-Structured Shading Decomposition

no code implementations • ICCV 2023 • Chen Geng, Hong-Xing Yu, Sharon Zhang, Maneesh Agrawala, Jiajun Wu

The shade tree representation enables novice users who are unfamiliar with the physical shading process to edit object shading in an efficient and intuitive manner.

Object

Automated Conversion of Music Videos into Lyric Videos

no code implementations • 28 Aug 2023 • Jiaju Ma, Anyi Rao, Li-Yi Wei, Rubaiat Habib Kazi, Hijung Valentina Shin, Maneesh Agrawala

Musicians and fans often produce lyric videos, a form of music video that showcases the song's lyrics, for their favorite songs.

AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning

4 code implementations • 10 Jul 2023 • Yuwei Guo, Ceyuan Yang, Anyi Rao, Zhengyang Liang, Yaohui Wang, Yu Qiao, Maneesh Agrawala, Dahua Lin, Bo Dai

Once trained, the motion module can be inserted into a personalized T2I model to form a personalized animation generator.

Image Animation

8,733

Adding Conditional Control to Text-to-Image Diffusion Models

4 code implementations • ICCV 2023 • Lvmin Zhang, Anyi Rao, Maneesh Agrawala

ControlNet locks the production-ready large diffusion models, and reuses their deep and robust encoding layers pretrained with billions of images as a strong backbone to learn a diverse set of conditional controls.

Image Generation

34,524

Measuring Compositional Consistency for Video Question Answering

no code implementations • CVPR 2022 • Mona Gandhi, Mustafa Omer Gul, Eva Prakash, Madeleine Grunde-McLaughlin, Ranjay Krishna, Maneesh Agrawala

Recent video question answering benchmarks indicate that state-of-the-art models struggle to answer compositional questions.

Question Answering, Video Question Answering

AGQA 2.0: An Updated Benchmark for Compositional Spatio-Temporal Reasoning

no code implementations • 12 Apr 2022 • Madeleine Grunde-McLaughlin, Ranjay Krishna, Maneesh Agrawala

Prior benchmarks have analyzed models' answers to questions about videos in order to measure visual compositional reasoning.

Question Answering

Disentangled3D: Learning a 3D Generative Model with Disentangled Geometry and Appearance from Monocular Images

no code implementations • CVPR 2022 • Ayush Tewari, Mallikarjun B R, Xingang Pan, Ohad Fried, Maneesh Agrawala, Christian Theobalt

Our model can disentangle the geometry and appearance variations in the scene, i.e., we can independently sample from the geometry and appearance spaces of the generative model.

Disentanglement

AGQA: A Benchmark for Compositional Spatio-Temporal Reasoning

no code implementations • CVPR 2021 • Madeleine Grunde-McLaughlin, Ranjay Krishna, Maneesh Agrawala

AGQA contains $192M$ unbalanced question-answer pairs for $9.6K$ videos.

Question Answering, Video Question Answering +1

Iterative Text-based Editing of Talking-heads Using Neural Retargeting

no code implementations • 21 Nov 2020 • Xinwei Yao, Ohad Fried, Kayvon Fatahalian, Maneesh Agrawala

We present a text-based tool for editing talking-head video that enables an iterative editing workflow.

State of the Art on Neural Rendering

no code implementations • 8 Apr 2020 • Ayush Tewari, Ohad Fried, Justus Thies, Vincent Sitzmann, Stephen Lombardi, Kalyan Sunkavalli, Ricardo Martin-Brualla, Tomas Simon, Jason Saragih, Matthias Nießner, Rohit Pandey, Sean Fanello, Gordon Wetzstein, Jun-Yan Zhu, Christian Theobalt, Maneesh Agrawala, Eli Shechtman, Dan B. Goldman, Michael Zollhöfer

Neural rendering is a new and rapidly emerging field that combines generative machine learning techniques with physical knowledge from computer graphics, e.g., by the integration of differentiable rendering into network training.

BIG-bench Machine Learning, Image Generation +2

Rekall: Specifying Video Events using Compositions of Spatiotemporal Labels

1 code implementation • 7 Oct 2019 • Daniel Y. Fu, Will Crichton, James Hong, Xinwei Yao, Haotian Zhang, Anh Truong, Avanika Narayan, Maneesh Agrawala, Christopher Ré, Kayvon Fatahalian

Many real-world video analysis applications require the ability to identify domain-specific events in video, such as interviews and commercials in TV news broadcasts, or action sequences in film.

Text-based Editing of Talking-head Video

1 code implementation • 4 Jun 2019 • Ohad Fried, Ayush Tewari, Michael Zollhöfer, Adam Finkelstein, Eli Shechtman, Dan B. Goldman, Kyle Genova, Zeyu Jin, Christian Theobalt, Maneesh Agrawala

To edit a video, the user only has to edit the transcript, and an optimization strategy then chooses segments of the input corpus as base material.

Face Model Sentence +3

409

SceneSuggest: Context-driven 3D Scene Design

no code implementations • 28 Feb 2017 • Manolis Savva, Angel X. Chang, Maneesh Agrawala

We present SceneSuggest: an interactive 3D scene design system providing context-driven suggestions for 3D model retrieval and placement.

Graphics, Human-Computer Interaction

How do people explore virtual environments?

no code implementations • 13 Dec 2016 • Vincent Sitzmann, Ana Serrano, Amy Pavel, Maneesh Agrawala, Diego Gutierrez, Belen Masia, Gordon Wetzstein

Understanding how people explore immersive virtual environments is crucial for many applications, such as designing virtual reality (VR) content, developing new compression algorithms, or learning computational models of saliency or visual attention.
