Search Results for author: Jianming Zhang

Found 79 papers, 35 papers with code

CLIFFNet for Monocular Depth Estimation with Hierarchical Embedding Loss

no code implementations ECCV 2020 Lijun Wang, Jianming Zhang, Yifan Wang, Huchuan Lu, Xiang Ruan

This paper proposes a hierarchical loss for monocular depth estimation, which measures the differences between the prediction and ground truth in hierarchical embedding spaces of depth maps.

Monocular Depth Estimation

Object-level Scene Deocclusion

no code implementations 11 Jun 2024 Zhengzhe Liu, Qing Liu, Chirui Chang, Jianming Zhang, Daniil Pakhomov, Haitian Zheng, Zhe Lin, Daniel Cohen-Or, Chi-Wing Fu

Deoccluding the hidden portions of objects in a scene is a formidable task, particularly when addressing real-world scenes.

3D Scene Reconstruction Object +1

SwapAnything: Enabling Arbitrary Object Swapping in Personalized Visual Editing

no code implementations 8 Apr 2024 Jing Gu, Yilin Wang, Nanxuan Zhao, Wei Xiong, Qing Liu, Zhifei Zhang, He Zhang, Jianming Zhang, HyunJoon Jung, Xin Eric Wang

Compared with existing methods for personalized subject swapping, SwapAnything has three unique advantages: (1) precise control of arbitrary objects and parts rather than the main subject, (2) more faithful preservation of context pixels, (3) better adaptation of the personalized concept to the image.

Image Generation Object

UniHuman: A Unified Model for Editing Human Images in the Wild

1 code implementation CVPR 2024 Nannan Li, Qing Liu, Krishna Kumar Singh, Yilin Wang, Jianming Zhang, Bryan A. Plummer, Zhe Lin

In this paper, we propose UniHuman, a unified model that addresses multiple facets of human image editing in real-world settings.


Relightful Harmonization: Lighting-aware Portrait Background Replacement

no code implementations CVPR 2024 Mengwei Ren, Wei Xiong, Jae Shin Yoon, Zhixin Shu, Jianming Zhang, HyunJoon Jung, Guido Gerig, He Zhang

Portrait harmonization aims to composite a subject into a new background, adjusting its lighting and color to ensure harmony with the background scene.

Fast View Synthesis of Casual Videos with Soup-of-Planes

no code implementations 4 Dec 2023 Yao-Chih Lee, Zhoutong Zhang, Kevin Blackburn-Matzen, Simon Niklaus, Jianming Zhang, Jia-Bin Huang, Feng Liu

Specifically, we build a global static scene model using an extended plane-based scene representation to synthesize temporally coherent novel video.

Novel View Synthesis

Lasagna: Layered Score Distillation for Disentangled Object Relighting

1 code implementation 30 Nov 2023 Dina Bashkirova, Arijit Ray, Rupayan Mallick, Sarah Adel Bargal, Jianming Zhang, Ranjay Krishna, Kate Saenko

Although generative editing methods now enable some forms of image editing, relighting is still beyond today's capabilities; existing methods struggle to keep other aspects of the image -- colors, shapes, and textures -- consistent after the edit.

Colorization Object +1

Diffusion-Augmented Depth Prediction with Sparse Annotations

no code implementations 4 Aug 2023 Jiaqi Li, Yiran Wang, Zihao Huang, Jinghong Zheng, Ke Xian, Zhiguo Cao, Jianming Zhang

We leverage the structural characteristics of diffusion model to enforce depth structures of depth models in a plug-and-play manner.

Autonomous Driving Depth Estimation +3

LightPainter: Interactive Portrait Relighting with Freehand Scribble

no code implementations CVPR 2023 Yiqun Mei, He Zhang, Xuaner Zhang, Jianming Zhang, Zhixin Shu, Yilin Wang, Zijun Wei, Shi Yan, HyunJoon Jung, Vishal M. Patel

Recent portrait relighting methods have achieved realistic results of portrait lighting effects given a desired lighting representation such as an environment map.

PixHt-Lab: Pixel Height Based Light Effect Generation for Image Compositing

no code implementations CVPR 2023 Yichen Sheng, Jianming Zhang, Julien Philip, Yannick Hold-Geoffroy, Xin Sun, He Zhang, Lu Ling, Bedrich Benes

To compensate for the lack of geometry in 2D Image compositing, recent deep learning-based approaches introduced a pixel height representation to generate soft shadows and reflections.

Single View Scene Scale Estimation Using Scale Field

no code implementations CVPR 2023 Byeong-Uk Lee, Jianming Zhang, Yannick Hold-Geoffroy, In So Kweon

In this paper, we propose a single image scale estimation method based on a novel scale field representation.

Lens Parameter Estimation for Realistic Depth of Field Modeling

no code implementations ICCV 2023 Dominique Piché-Meunier, Yannick Hold-Geoffroy, Jianming Zhang, Jean-François Lalonde

Instead, we go further and propose to use a lens-based representation that models the depth of field using two parameters: the blur factor and focus disparity.

ObjectStitch: Object Compositing With Diffusion Model

no code implementations CVPR 2023 Yizhi Song, Zhifei Zhang, Zhe Lin, Scott Cohen, Brian Price, Jianming Zhang, Soo Ye Kim, Daniel Aliaga

Object compositing based on 2D images is a challenging problem since it typically involves multiple processing stages such as color harmonization, geometry correction and shadow generation to generate realistic results.

Data Augmentation Object

GAIT: Generating Aesthetic Indoor Tours with Deep Reinforcement Learning

no code implementations ICCV 2023 Desai Xie, Ping Hu, Xin Sun, Soren Pirk, Jianming Zhang, Radomir Mech, Arie E. Kaufman

Placing and orienting a camera to compose aesthetically meaningful shots of a scene is not only a key objective in real-world photography and cinematography but also for virtual content creation.

Mixed Reality reinforcement-learning

ObjectStitch: Generative Object Compositing

1 code implementation 2 Dec 2022 Yizhi Song, Zhifei Zhang, Zhe Lin, Scott Cohen, Brian Price, Jianming Zhang, Soo Ye Kim, Daniel Aliaga

Object compositing based on 2D images is a challenging problem since it typically involves multiple processing stages such as color harmonization, geometry correction and shadow generation to generate realistic results.

Data Augmentation Object

SceneComposer: Any-Level Semantic Image Synthesis

no code implementations CVPR 2023 Yu Zeng, Zhe Lin, Jianming Zhang, Qing Liu, John Collomosse, Jason Kuen, Vishal M. Patel

We propose a new framework for conditional image synthesis from semantic layouts of any precision levels, ranging from pure text to a 2D semantic canvas with precise shapes.

Image Generation

Towards Accurate Reconstruction of 3D Scene Shape from A Single Monocular Image

1 code implementation 28 Aug 2022 Wei Yin, Jianming Zhang, Oliver Wang, Simon Niklaus, Simon Chen, Yifan Liu, Chunhua Shen

To do so, we propose a two-stage framework that first predicts depth up to an unknown scale and shift from a single monocular image, and then exploits 3D point cloud data to predict the depth shift and the camera's focal length that allow us to recover 3D scene shapes.

Depth Estimation Depth Prediction

Text-to-Image Generation via Implicit Visual Guidance and Hypernetwork

no code implementations 17 Aug 2022 Xin Yuan, Zhe Lin, Jason Kuen, Jianming Zhang, John Collomosse

We develop an approach for text-to-image generation that embraces additional retrieval images, driven by a combination of implicit visual guidance loss and generative objectives.

Diversity Retrieval +1

Towards Domain-agnostic Depth Completion

1 code implementation 29 Jul 2022 Guangkai Xu, Wei Yin, Jianming Zhang, Oliver Wang, Simon Niklaus, Simon Chen, Jia-Wang Bian

Our method leverages a data-driven prior in the form of a single image depth prediction network trained on large-scale datasets, the output of which is used as an input to our model.

Depth Completion Depth Estimation +2

MPIB: An MPI-Based Bokeh Rendering Framework for Realistic Partial Occlusion Effects

1 code implementation 18 Jul 2022 Juewen Peng, Jianming Zhang, Xianrui Luo, Hao Lu, Ke Xian, Zhiguo Cao

Partial occlusion effects are a phenomenon that blurry objects near a camera are semi-transparent, resulting in partial appearance of occluded background.

Controllable Shadow Generation Using Pixel Height Maps

no code implementations 12 Jul 2022 Yichen Sheng, Yifan Liu, Jianming Zhang, Wei Yin, A. Cengiz Oztireli, He Zhang, Zhe Lin, Eli Shechtman, Bedrich Benes

It can be used to calculate hard shadows in a 2D image based on the projective geometry, providing precise control of the shadows' direction and shape.

Dynamic Gradient Reactivation for Backward Compatible Person Re-identification

no code implementations 12 Jul 2022 Xiao Pan, Hao Luo, Weihua Chen, Fan Wang, Hao Li, Wei Jiang, Jianming Zhang, Jianyang Gu, Peike Li

To address this issue, we propose the Ranking-based Backward Compatible Learning (RBCL), which directly optimizes the ranking metric between new features and old features.

Person Re-Identification Retrieval

BokehMe: When Neural Rendering Meets Classical Rendering

1 code implementation CVPR 2022 Juewen Peng, Zhiguo Cao, Xianrui Luo, Hao Lu, Ke Xian, Jianming Zhang

Based on this formulation, we implement the classical renderer by a scattering-based method and propose a two-stage neural renderer to fix the erroneous areas from the classical renderer.

Neural Rendering

CM-GAN: Image Inpainting with Cascaded Modulation GAN and Object-Aware Training

1 code implementation 22 Mar 2022 Haitian Zheng, Zhe Lin, Jingwan Lu, Scott Cohen, Eli Shechtman, Connelly Barnes, Jianming Zhang, Ning Xu, Sohrab Amirghodsi, Jiebo Luo

We propose cascaded modulation GAN (CM-GAN), a new network design consisting of an encoder with Fourier convolution blocks that extract multi-scale feature representations from the input image with holes and a dual-stream decoder with a novel cascaded global-spatial modulation block at each scale level.

Decoder Image Inpainting

Interactive Portrait Harmonization

no code implementations 15 Mar 2022 Jeya Maria Jose Valanarasu, He Zhang, Jianming Zhang, Yilin Wang, Zhe Lin, Jose Echevarria, Yinglan Ma, Zijun Wei, Kalyan Sunkavalli, Vishal M. Patel

To enable flexible interaction between user and harmonization, we introduce interactive harmonization, a new setting where the harmonization is performed with respect to a selected region in the reference image instead of the entire background.

Image Harmonization

Lite Vision Transformer with Enhanced Self-Attention

1 code implementation CVPR 2022 Chenglin Yang, Yilin Wang, Jianming Zhang, He Zhang, Zijun Wei, Zhe Lin, Alan Yuille

We propose Lite Vision Transformer (LVT), a novel light-weight transformer network with two enhanced self-attention mechanisms to improve the model performances for mobile deployment.

Panoptic Segmentation Segmentation

SSH: A Self-Supervised Framework for Image Harmonization

1 code implementation ICCV 2021 Yifan Jiang, He Zhang, Jianming Zhang, Yilin Wang, Zhe Lin, Kalyan Sunkavalli, Simon Chen, Sohrab Amirghodsi, Sarah Kong, Zhangyang Wang

Image harmonization aims to improve the quality of image compositing by matching the "appearance" (e.g., color tone, brightness and contrast) between foreground and background images.

Benchmarking Data Augmentation +1

Black-Box Diagnosis and Calibration on GAN Intra-Mode Collapse: A Pilot Study

1 code implementation 23 Jul 2021 Zhenyu Wu, Zhaowen Wang, Ye Yuan, Jianming Zhang, Zhangyang Wang, Hailin Jin

Existing diversity tests of samples from GANs are usually conducted qualitatively on a small scale, and/or depend on access to the original training data as well as the trained model parameters.

Image Generation

Single-image Full-body Human Relighting

no code implementations 15 Jul 2021 Manuel Lagunas, Xin Sun, Jimei Yang, Ruben Villegas, Jianming Zhang, Zhixin Shu, Belen Masia, Diego Gutierrez

We present a single-image data-driven method to automatically relight images with full-body humans in them.

Image Reconstruction

Multimodal Contrastive Training for Visual Representation Learning

no code implementations CVPR 2021 Xin Yuan, Zhe Lin, Jason Kuen, Jianming Zhang, Yilin Wang, Michael Maire, Ajinkya Kale, Baldo Faieta

We first train our model on COCO and evaluate the learned visual representations on various downstream tasks including image classification, object detection, and instance segmentation.

Cross-Modal Retrieval Image Classification +6

Learning to Recover 3D Scene Shape from a Single Image

1 code implementation CVPR 2021 Wei Yin, Jianming Zhang, Oliver Wang, Simon Niklaus, Long Mai, Simon Chen, Chunhua Shen

Despite significant progress in monocular depth estimation in the wild, recent state-of-the-art methods cannot be used to recover accurate 3D scene shape due to an unknown depth shift induced by shift-invariant reconstruction losses used in mixed-data depth prediction training, and possible unknown camera focal length.

Ranked #1 on Indoor Monocular Depth Estimation on DIODE (using extra training data)

3D Scene Reconstruction Depth Prediction +3

Semantic Layout Manipulation with High-Resolution Sparse Attention

1 code implementation 14 Dec 2020 Haitian Zheng, Zhe Lin, Jingwan Lu, Scott Cohen, Jianming Zhang, Ning Xu, Jiebo Luo

A core problem of this task is how to transfer visual details from the input images to the new semantic layout while making the resulting image visually realistic.

Decoder Vocal Bursts Intensity Prediction

Meticulous Object Segmentation

1 code implementation 13 Dec 2020 Chenglin Yang, Yilin Wang, Jianming Zhang, He Zhang, Zhe Lin, Alan Yuille

To evaluate segmentation quality near object boundaries, we propose the Meticulosity Quality (MQ) score considering both the mask coverage and boundary precision.

2k 4k +5

Mask Guided Matting via Progressive Refinement Network

1 code implementation CVPR 2021 Qihang Yu, Jianming Zhang, He Zhang, Yilin Wang, Zhe Lin, Ning Xu, Yutong Bai, Alan Yuille

We propose Mask Guided (MG) Matting, a robust matting framework that takes a general coarse mask as guidance.

Image Matting

Deep Image Compositing

no code implementations 4 Nov 2020 He Zhang, Jianming Zhang, Federico Perazzi, Zhe Lin, Vishal M. Patel

In this paper, we propose a new method which can automatically generate high-quality image compositing without any user input.

Image Matting

Attribute-conditioned Layout GAN for Automatic Graphic Design

no code implementations 11 Sep 2020 Jianan Li, Jimei Yang, Jianming Zhang, Chang Liu, Christina Wang, Tingfa Xu

In this paper, we introduce Attribute-conditioned Layout GAN to incorporate the attributes of design elements for graphic layout generation by forcing both the generator and the discriminator to meet attribute conditions.


Adversarial Knowledge Transfer from Unlabeled Data

1 code implementation 13 Aug 2020 Akash Gupta, Rameswar Panda, Sujoy Paul, Jianming Zhang, Amit K. Roy-Chowdhury

While machine learning approaches to visual recognition offer great promise, most of the existing methods rely heavily on the availability of large quantities of labeled training data.

Transfer Learning

Open-Edit: Open-Domain Image Manipulation with Open-Vocabulary Instructions

1 code implementation ECCV 2020 Xihui Liu, Zhe Lin, Jianming Zhang, Handong Zhao, Quan Tran, Xiaogang Wang, Hongsheng Li

We propose a novel algorithm, named Open-Edit, which is the first attempt on open-domain image manipulation with open-vocabulary instructions.

Decoder Image Manipulation

Shape Adaptor: A Learnable Resizing Module

1 code implementation ECCV 2020 Shikun Liu, Zhe Lin, Yilin Wang, Jianming Zhang, Federico Perazzi, Edward Johns

We present a novel resizing module for neural networks: shape adaptor, a drop-in enhancement built on top of traditional resizing layers, such as pooling, bilinear sampling, and strided convolution.

Image Classification Neural Architecture Search +1

SSN: Soft Shadow Network for Image Compositing

1 code implementation CVPR 2021 Yichen Sheng, Jianming Zhang, Bedrich Benes

We demonstrate that our model produces realistic soft shadows in real-time.

Scaling Object Detection by Transferring Classification Weights

1 code implementation ICCV 2019 Jason Kuen, Federico Perazzi, Zhe Lin, Jianming Zhang, Yap-Peng Tan

Large scale object detection datasets are constantly increasing their size in terms of the number of classes and annotations count.

Classification General Classification +3

Multi-Channel Deep Networks for Block-Based Image Compressive Sensing

1 code implementation 28 Aug 2019 Siwang Zhou, Yan He, Yonghe Liu, Chengqing Li, Jianming Zhang

Specifically, with our multichannel structure, the image blocks with a variety of sampling rates can be reconstructed in a single model.

Blocking Compressive Sensing +2

Towards High-Resolution Salient Object Detection

1 code implementation ICCV 2019 Yi Zeng, Pingping Zhang, Jianming Zhang, Zhe Lin, Huchuan Lu

This paper pushes forward high-resolution saliency detection, and contributes a new dataset, named High-Resolution Salient Object Detection (HRSOD).

Ranked #13 on RGB Salient Object Detection on DAVIS-S (using extra training data)

Object object-detection +4

M2KD: Multi-model and Multi-level Knowledge Distillation for Incremental Learning

no code implementations 3 Apr 2019 Peng Zhou, Long Mai, Jianming Zhang, Ning Xu, Zuxuan Wu, Larry S. Davis

Instead of sequentially distilling knowledge only from the last model, we directly leverage all previous model snapshots.

Incremental Learning Knowledge Distillation

Neural Rejuvenation: Improving Deep Network Training by Enhancing Computational Resource Utilization

1 code implementation CVPR 2019 Siyuan Qiao, Zhe Lin, Jianming Zhang, Alan Yuille

By simply replacing standard optimizers with Neural Rejuvenation, we are able to improve the performances of neural networks by a very large margin while using similar training efforts and maintaining their original resource usages.

Network Pruning Neural Architecture Search

Sequence-to-Segment Networks for Segment Detection

no code implementations NeurIPS 2018 Zijun Wei, Boyu Wang, Minh Hoai Nguyen, Jianming Zhang, Zhe Lin, Xiaohui Shen, Radomir Mech, Dimitris Samaras

Detecting segments of interest from an input sequence is a challenging problem which often requires not only good knowledge of individual target segments, but also contextual understanding of the entire input sequence and the relationships between the target segments.

Decoder Temporal Action Proposal Generation +1

DeepLens: Shallow Depth Of Field From A Single Image

no code implementations 18 Oct 2018 Lijun Wang, Xiaohui Shen, Jianming Zhang, Oliver Wang, Zhe Lin, Chih-Yao Hsieh, Sarah Kong, Huchuan Lu

To achieve this, we propose a novel neural network model comprised of a depth prediction module, a lens blur module, and a guided upsampling module.

Depth Estimation Depth Prediction

GAPLE: Generalizable Approaching Policy LEarning for Robotic Object Searching in Indoor Environment

no code implementations 21 Sep 2018 Xin Ye, Zhe Lin, Joon-Young Lee, Jianming Zhang, Shibin Zheng, Yezhou Yang

We study the problem of learning a generalizable action policy for an intelligent agent to actively approach an object of interest in an indoor environment solely from its visual inputs.

Semantic Segmentation Visual Navigation

Learning to Blend Photos

1 code implementation ECCV 2018 Wei-Chih Hung, Jianming Zhang, Xiaohui Shen, Zhe Lin, Joon-Young Lee, Ming-Hsuan Yang

Specifically, given a foreground image and a background image, our proposed method automatically generates a set of blending photos with scores that indicate the aesthetics quality with the proposed quality network and policy network.

Concept Mask: Large-Scale Segmentation from Semantic Concepts

no code implementations ECCV 2018 Yufei Wang, Zhe Lin, Xiaohui Shen, Jianming Zhang, Scott Cohen

Then, we refine and extend the embedding network to predict an attention map, using a curated dataset with bounding box annotations on 750 concepts.

Image Segmentation Segmentation +1

Contemplating Visual Emotions: Understanding and Overcoming Dataset Bias

no code implementations ECCV 2018 Rameswar Panda, Jianming Zhang, Haoxiang Li, Joon-Young Lee, Xin Lu, Amit K. Roy-Chowdhury

While machine learning approaches to visual emotion recognition offer great promise, current methods consider training and testing models on small scale datasets covering limited visual emotion concepts.

Emotion Recognition

Excitation Dropout: Encouraging Plasticity in Deep Neural Networks

1 code implementation 23 May 2018 Andrea Zunino, Sarah Adel Bargal, Pietro Morerio, Jianming Zhang, Stan Sclaroff, Vittorio Murino

In this work, we utilize the evidence at each neuron to determine the probability of dropout, rather than dropping out neurons uniformly at random as in standard dropout.

Decision Making Video Recognition

Excitation Backprop for RNNs

1 code implementation CVPR 2018 Sarah Adel Bargal, Andrea Zunino, Donghyun Kim, Jianming Zhang, Vittorio Murino, Stan Sclaroff

Models are trained to caption or classify activity in videos, but little is known about the evidence used to make such decisions.

Action Recognition Temporal Action Localization +1

Predicting Foreground Object Ambiguity and Efficiently Crowdsourcing the Segmentation(s)

no code implementations 30 Apr 2017 Danna Gurari, Kun He, Bo Xiong, Jianming Zhang, Mehrnoosh Sameki, Suyog Dutt Jain, Stan Sclaroff, Margrit Betke, Kristen Grauman

We propose the ambiguity problem for the foreground object segmentation task and motivate the importance of estimating and accounting for this ambiguity when designing vision systems.

Diversity Object +2

Top-down Visual Saliency Guided by Captions

6 code implementations CVPR 2017 Vasili Ramanishka, Abir Das, Jianming Zhang, Kate Saenko

Neural image/video captioning models can generate accurate descriptions, but their internal process of mapping regions to words is a black box and therefore difficult to explain.

Decoder Sentence +1

Top-down Neural Attention by Excitation Backprop

3 code implementations 1 Aug 2016 Jianming Zhang, Zhe Lin, Jonathan Brandt, Xiaohui Shen, Stan Sclaroff

We aim to model the top-down attention of a Convolutional Neural Network (CNN) classifier for generating task-specific attention maps.

Salient Object Subitizing

no code implementations CVPR 2015 Jianming Zhang, Shugao Ma, Mehrnoosh Sameki, Stan Sclaroff, Margrit Betke, Zhe Lin, Xiaohui Shen, Brian Price, Radomir Mech

We study the problem of Salient Object Subitizing, i.e., predicting the existence and the number of salient objects in an image using holistic cues.

Image Retrieval Object +4

Minimum Barrier Salient Object Detection at 80 FPS

no code implementations ICCV 2015 Jianming Zhang, Stan Sclaroff, Zhe Lin, Xiaohui Shen, Brian Price, Radomir Mech

Powered by this fast MBD transform algorithm, the proposed salient object detection method runs at 80 FPS, and significantly outperforms previous methods with similar speed on four large benchmark datasets, and achieves comparable or better performance than state-of-the-art methods.

Ranked #6 on Video Salient Object Detection on VOS-T (using extra training data)

Object object-detection +2
