Search Results for author: Krishna Kumar Singh

Found 35 papers, 13 papers with code

Generative Portrait Shadow Removal

no code implementations 7 Oct 2024 Jae Shin Yoon, Zhixin Shu, Mengwei Ren, Xuaner Zhang, Yannick Hold-Geoffroy, Krishna Kumar Singh, He Zhang

For robust and natural shadow removal, we propose to train the diffusion model with a compositional repurposing framework: a pre-trained text-guided image generation model is first fine-tuned to harmonize the lighting and color of the foreground with a background scene by using a background harmonization dataset; and then the model is further fine-tuned to generate a shadow-free portrait image via a shadow-paired dataset.

Image Generation, Shadow Removal

Removing Distributional Discrepancies in Captions Improves Image-Text Alignment

no code implementations 1 Oct 2024 Yuheng Li, Haotian Liu, Mu Cai, Yijun Li, Eli Shechtman, Zhe Lin, Yong Jae Lee, Krishna Kumar Singh

In this paper, we introduce a model designed to improve the prediction of image-text alignment, targeting the challenge of compositional understanding in current visual-language models.

Language Modelling

ActAnywhere: Subject-Aware Video Background Generation

no code implementations 19 Jan 2024 Boxiao Pan, Zhan Xu, Chun-Hao Paul Huang, Krishna Kumar Singh, Yang Zhou, Leonidas J. Guibas, Jimei Yang

Generating video background that tailors to foreground subject motion is an important problem for the movie industry and visual effects community.

UniHuman: A Unified Model for Editing Human Images in the Wild

1 code implementation CVPR 2024 Nannan Li, Qing Liu, Krishna Kumar Singh, Yilin Wang, Jianming Zhang, Bryan A. Plummer, Zhe Lin

In this paper, we propose UniHuman, a unified model that addresses multiple facets of human image editing in real-world settings.

Separate-and-Enhance: Compositional Finetuning for Text2Image Diffusion Models

no code implementations 10 Dec 2023 Zhipeng Bao, Yijun Li, Krishna Kumar Singh, Yu-Xiong Wang, Martial Hebert

Despite recent significant strides achieved by diffusion-based Text-to-Image (T2I) models, current systems are still less capable of ensuring decent compositional generation aligned with text prompts, particularly for the multi-object generation.

Test-time Adaptation

Consistent Multimodal Generation via A Unified GAN Framework

no code implementations 4 Jul 2023 Zhen Zhu, Yijun Li, Weijie Lyu, Krishna Kumar Singh, Zhixin Shu, Soeren Pirk, Derek Hoiem

We investigate how to generate multimodal image outputs, such as RGB, depth, and surface normals, with a single generative model.

multimodal generation

Putting People in Their Place: Affordance-Aware Human Insertion into Scenes

1 code implementation CVPR 2023 Sumith Kulal, Tim Brooks, Alex Aiken, Jiajun Wu, Jimei Yang, Jingwan Lu, Alexei A. Efros, Krishna Kumar Singh

Given a scene image with a marked region and an image of a person, we insert the person into the scene while respecting the scene affordances.

Modulating Pretrained Diffusion Models for Multimodal Image Synthesis

no code implementations 24 Feb 2023 Cusuh Ham, James Hays, Jingwan Lu, Krishna Kumar Singh, Zhifei Zhang, Tobias Hinz

We show that MCM enables user control over the spatial layout of the image and leads to increased control over the image generation process.

Image Generation, Semantic Segmentation

Complete 3D Human Reconstruction From a Single Incomplete Image

no code implementations CVPR 2023 Junying Wang, Jae Shin Yoon, Tuanfeng Y. Wang, Krishna Kumar Singh, Ulrich Neumann

This paper presents a method to reconstruct a complete human geometry and texture from an image of a person with only partial body observed, e.g., a torso.

3D Human Reconstruction

Spatially-Adaptive Multilayer Selection for GAN Inversion and Editing

1 code implementation CVPR 2022 Gaurav Parmar, Yijun Li, Jingwan Lu, Richard Zhang, Jun-Yan Zhu, Krishna Kumar Singh

We propose a new method to invert and edit such complex images in the latent space of GANs, such as StyleGAN2.

GIRAFFE HD: A High-Resolution 3D-aware Generative Model

1 code implementation CVPR 2022 Yang Xue, Yuheng Li, Krishna Kumar Singh, Yong Jae Lee

3D-aware generative models have shown that the introduction of 3D information can lead to more controllable image generation.

Disentanglement, Image Generation +2

InsetGAN for Full-Body Image Generation

2 code implementations CVPR 2022 Anna Frühstück, Krishna Kumar Singh, Eli Shechtman, Niloy J. Mitra, Peter Wonka, Jingwan Lu

Instead of modeling this complex domain with a single GAN, we propose a novel method to combine multiple pretrained GANs, where one GAN generates a global canvas (e.g., human body) and a set of specialized GANs, or insets, focus on different parts (e.g., faces, shoes) that can be seamlessly inserted onto the global canvas.

Diversity, Image Generation

IMAGINE: Image Synthesis by Image-Guided Model Inversion

no code implementations CVPR 2021 Pei Wang, Yijun Li, Krishna Kumar Singh, Jingwan Lu, Nuno Vasconcelos

We introduce an inversion based method, denoted as IMAge-Guided model INvErsion (IMAGINE), to generate high-quality and diverse images from only a single training sample.

Image Generation, Specificity

Generating Furry Cars: Disentangling Object Shape & Appearance across Multiple Domains

no code implementations 5 Apr 2021 Utkarsh Ojha, Krishna Kumar Singh, Yong Jae Lee

We consider the novel task of learning disentangled representations of object shape and appearance across multiple domains (e.g., dogs and cars).

Disentanglement, Object

Generating Furry Cars: Disentangling Object Shape and Appearance across Multiple Domains

no code implementations ICLR 2021 Utkarsh Ojha, Krishna Kumar Singh, Yong Jae Lee

We consider the novel task of learning disentangled representations of object shape and appearance across multiple domains (e.g., dogs and cars).

Disentanglement, Object

MixNMatch: Multifactor Disentanglement and Encoding for Conditional Image Generation

3 code implementations CVPR 2020 Yuheng Li, Krishna Kumar Singh, Utkarsh Ojha, Yong Jae Lee

We present MixNMatch, a conditional generative model that learns to disentangle and encode background, object pose, shape, and texture from real images with minimal supervision, for mix-and-match image generation.

Conditional Image Generation, Disentanglement

Elastic-InfoGAN: Unsupervised Disentangled Representation Learning in Class-Imbalanced Data

1 code implementation NeurIPS 2020 Utkarsh Ojha, Krishna Kumar Singh, Cho-Jui Hsieh, Yong Jae Lee

We propose a novel unsupervised generative model that learns to disentangle object identity from other low-level aspects in class-imbalanced data.

Object, Representation Learning

You Reap What You Sow: Using Videos to Generate High Precision Object Proposals for Weakly-Supervised Object Detection

no code implementations CVPR 2019 Krishna Kumar Singh, Yong Jae Lee

We use the W-RPN to generate high precision object proposals, which are in turn used to re-rank high recall proposals like edge boxes or selective search according to their spatial overlap.

Object, object-detection +2

FineGAN: Unsupervised Hierarchical Disentanglement for Fine-Grained Object Generation and Discovery

1 code implementation CVPR 2019 Krishna Kumar Singh, Utkarsh Ojha, Yong Jae Lee

We propose FineGAN, a novel unsupervised GAN framework, which disentangles the background, object shape, and object appearance to hierarchically generate images of fine-grained object categories.

Conditional Image Generation, Disentanglement +3

DOCK: Detecting Objects by transferring Common-sense Knowledge

no code implementations ECCV 2018 Krishna Kumar Singh, Santosh Divvala, Ali Farhadi, Yong Jae Lee

We present a scalable approach for Detecting Objects by transferring Common-sense Knowledge (DOCK) from source to target categories.

Attribute, Common Sense Reasoning +3

Who Will Share My Image? Predicting the Content Diffusion Path in Online Social Networks

no code implementations 25 May 2017 Wenjian Hu, Krishna Kumar Singh, Fanyi Xiao, Jinyoung Han, Chen-Nee Chuah, Yong Jae Lee

Content popularity prediction has been extensively studied due to its importance and interest for both users and hosts of social media sites like Facebook, Instagram, Twitter, and Pinterest.

Identifying First-person Camera Wearers in Third-person Videos

no code implementations CVPR 2017 Chenyou Fan, Jang-Won Lee, Mingze Xu, Krishna Kumar Singh, Yong Jae Lee, David J. Crandall, Michael S. Ryoo

We consider scenarios in which we wish to perform joint scene understanding, object tracking, activity recognition, and other tasks in environments in which multiple people are wearing body-worn cameras while a third-person static camera also captures the scene.

Activity Recognition, Object Tracking +2

End-to-End Localization and Ranking for Relative Attributes

no code implementations 9 Aug 2016 Krishna Kumar Singh, Yong Jae Lee

We propose an end-to-end deep convolutional network to simultaneously localize and rank relative visual attributes, given only weakly-supervised pairwise image comparisons.

Attribute
