Search Results for author: Zhe Lin

Found 160 papers, 71 papers with code

Deep Bag-of-Words Model: An Efficient and Interpretable Relevance Architecture for Chinese E-Commerce

no code implementations 12 Jul 2024 Zhe Lin, Jiwei Tan, Dan Ou, Xi Chen, Shaowei Yao, Bo Zheng

Text relevance or text matching of query and product is an essential technique for the e-commerce search system to ensure that the displayed products can match the intent of the query.

Computational Efficiency Language Modelling +1

ControlVAR: Exploring Controllable Visual Autoregressive Modeling

no code implementations 14 Jun 2024 Xiang Li, Kai Qiu, Hao Chen, Jason Kuen, Zhe Lin, Rita Singh, Bhiksha Raj

Conditional visual generation has witnessed remarkable progress with the advent of diffusion models (DMs), especially in tasks like control-to-image generation.

Image Generation

Object-level Scene Deocclusion

no code implementations 11 Jun 2024 Zhengzhe Liu, Qing Liu, Chirui Chang, Jianming Zhang, Daniil Pakhomov, Haitian Zheng, Zhe Lin, Daniel Cohen-Or, Chi-Wing Fu

Deoccluding the hidden portions of objects in a scene is a formidable task, particularly when addressing real-world scenes.

3D Scene Reconstruction Object +1

Attention-Driven Training-Free Efficiency Enhancement of Diffusion Models

no code implementations CVPR 2024 Hongjie Wang, Difan Liu, Yan Kang, Yijun Li, Zhe Lin, Niraj K. Jha, Yuchen Liu

Specifically, for single-denoising-step pruning, we develop a novel ranking algorithm, Generalized Weighted Page Rank (G-WPR), to identify redundant tokens, and a similarity-based recovery method to restore tokens for the convolution operation.


Hierarchical Source-to-Post-Route QoR Prediction in High-Level Synthesis with GNNs

1 code implementation 14 Jan 2024 Mingzhe Gao, Jieru Zhao, Zhe Lin, Minyi Guo

High-level synthesis (HLS) notably speeds up the hardware design process by avoiding RTL programming.

graph construction

UniHuman: A Unified Model for Editing Human Images in the Wild

1 code implementation CVPR 2024 Nannan Li, Qing Liu, Krishna Kumar Singh, Yilin Wang, Jianming Zhang, Bryan A. Plummer, Zhe Lin

In this paper, we propose UniHuman, a unified model that addresses multiple facets of human image editing in real-world settings.


Latent Feature-Guided Diffusion Models for Shadow Removal

no code implementations 4 Dec 2023 Kangfu Mei, Luis Figueroa, Zhe Lin, Zhihong Ding, Scott Cohen, Vishal M. Patel

Recovering textures under shadows has remained a challenging problem due to the difficulty of inferring shadow-free scenes from shadow images.

Shadow Removal

SegGen: Supercharging Segmentation Models with Text2Mask and Mask2Img Synthesis

no code implementations 6 Nov 2023 Hanrong Ye, Jason Kuen, Qing Liu, Zhe Lin, Brian Price, Dan Xu

On the highly competitive ADE20K and COCO benchmarks, our data generation method markedly improves the performance of state-of-the-art segmentation models in semantic segmentation, panoptic segmentation, and instance segmentation.

Diversity Image Generation +4

SCoRD: Subject-Conditional Relation Detection with Text-Augmented Data

1 code implementation 24 Aug 2023 Ziyan Yang, Kushal Kafle, Zhe Lin, Scott Cohen, Zhihong Ding, Vicente Ordonez

To solve this problem, we propose an auto-regressive model that, given a subject, predicts its relations, objects, and object locations by casting this output as a sequence of tokens.

Object Relation

AIMS: All-Inclusive Multi-Level Segmentation

1 code implementation 28 May 2023 Lu Qi, Jason Kuen, Weidong Guo, Jiuxiang Gu, Zhe Lin, Bo Du, Yu Xu, Ming-Hsuan Yang

Despite progress in image segmentation for accurate visual entity segmentation, meeting the diverse requirements of image editing applications for region-of-interest selection at different levels remains unsolved.

Image Segmentation Segmentation +1

XFormer: Fast and Accurate Monocular 3D Body Capture

no code implementations 18 May 2023 Lihui Qian, Xintong Han, Faqiang Wang, Hongyu Liu, Haoye Dong, Zhiwen Li, Huawei Wei, Zhe Lin, Cheng-Bin Jin

We present XFormer, a novel human mesh and motion capture method that achieves real-time performance on consumer CPUs given only monocular images as input.

3D Human Pose Estimation

Harnessing the Spatial-Temporal Attention of Diffusion Models for High-Fidelity Text-to-Image Synthesis

1 code implementation ICCV 2023 Qiucheng Wu, Yujian Liu, Handong Zhao, Trung Bui, Zhe Lin, Yang Zhang, Shiyu Chang

We then impose spatial attention control by combining the attention over the entire text description and that over the local description of the particular object in the corresponding pixel region of that object.

Denoising Image Generation

TopNet: Transformer-based Object Placement Network for Image Compositing

1 code implementation CVPR 2023 Sijie Zhu, Zhe Lin, Scott Cohen, Jason Kuen, Zhifei Zhang, Chen Chen

Given a background image and a segmented object, the goal is to train a model to predict plausible placements (location and scale) of the object for compositing.


Video-P2P: Video Editing with Cross-attention Control

1 code implementation CVPR 2024 Shaoteng Liu, Yuechen Zhang, Wenbo Li, Zhe Lin, Jiaya Jia

This paper presents Video-P2P, a novel framework for real-world video editing with cross-attention control.

Image Generation Video Editing +1

Human MotionFormer: Transferring Human Motions with Vision Transformers

1 code implementation 22 Feb 2023 Hongyu Liu, Xintong Han, ChengBin Jin, Lihui Qian, Huawei Wei, Zhe Lin, Faqiang Wang, Haoye Dong, Yibing Song, Jia Xu, Qifeng Chen

In this paper, we propose Human MotionFormer, a hierarchical ViT framework that leverages global and local perceptions to capture large and subtle motion matching, respectively.

Decoder Motion Synthesis

ObjectStitch: Object Compositing With Diffusion Model

no code implementations CVPR 2023 Yizhi Song, Zhifei Zhang, Zhe Lin, Scott Cohen, Brian Price, Jianming Zhang, Soo Ye Kim, Daniel Aliaga

Object compositing based on 2D images is a challenging problem since it typically involves multiple processing stages such as color harmonization, geometry correction and shadow generation to generate realistic results.

Data Augmentation Object

High Quality Entity Segmentation

no code implementations ICCV 2023 Lu Qi, Jason Kuen, Tiancheng Shen, Jiuxiang Gu, Wenbo Li, Weidong Guo, Jiaya Jia, Zhe Lin, Ming-Hsuan Yang

Given the high-quality, high-resolution nature of the dataset, we propose CropFormer, which is designed to tackle the intractability of instance-level segmentation on high-resolution images.

Image Segmentation Segmentation +1

Uncovering the Disentanglement Capability in Text-to-Image Diffusion Models

1 code implementation CVPR 2023 Qiucheng Wu, Yujian Liu, Handong Zhao, Ajinkya Kale, Trung Bui, Tong Yu, Zhe Lin, Yang Zhang, Shiyu Chang

Based on this finding, we further propose a simple, light-weight image editing algorithm where the mixing weights of the two text embeddings are optimized for style matching and content preservation.

Denoising Disentanglement

SmartBrush: Text and Shape Guided Object Inpainting with Diffusion Model

no code implementations CVPR 2023 Shaoan Xie, Zhifei Zhang, Zhe Lin, Tobias Hinz, Kun Zhang

By contrast, multi-modal inpainting provides more flexible and useful controls on the inpainted content, e.g., a text prompt can be used to describe an object with richer attributes, and a mask can be used to constrain the shape of the inpainted object rather than being only considered as a missing area.

Image Inpainting Object +1

Image Inpainting via Iteratively Decoupled Probabilistic Modeling

2 code implementations 6 Dec 2022 Wenbo Li, Xin Yu, Kun Zhou, Yibing Song, Zhe Lin, Jiaya Jia

To achieve high-quality results with low computational cost, we present a novel pixel spread model (PSM) that iteratively employs decoupled probabilistic modeling, combining the optimization efficiency of GANs with the prediction tractability of probabilistic models.

Denoising Image Inpainting

ObjectStitch: Generative Object Compositing

1 code implementation 2 Dec 2022 Yizhi Song, Zhifei Zhang, Zhe Lin, Scott Cohen, Brian Price, Jianming Zhang, Soo Ye Kim, Daniel Aliaga

Object compositing based on 2D images is a challenging problem since it typically involves multiple processing stages such as color harmonization, geometry correction and shadow generation to generate realistic results.

Data Augmentation Object

SceneComposer: Any-Level Semantic Image Synthesis

no code implementations CVPR 2023 Yu Zeng, Zhe Lin, Jianming Zhang, Qing Liu, John Collomosse, Jason Kuen, Vishal M. Patel

We propose a new framework for conditional image synthesis from semantic layouts of any precision levels, ranging from pure text to a 2D semantic canvas with precise shapes.

Image Generation

High-Quality Entity Segmentation

1 code implementation 10 Nov 2022 Lu Qi, Jason Kuen, Weidong Guo, Tiancheng Shen, Jiuxiang Gu, Jiaya Jia, Zhe Lin, Ming-Hsuan Yang

It improves mask prediction by fusing high-res image crops that provide more fine-grained image details and the full image.

Image Segmentation Segmentation +2

3D-FM GAN: Towards 3D-Controllable Face Manipulation

no code implementations 24 Aug 2022 Yuchen Liu, Zhixin Shu, Yijun Li, Zhe Lin, Richard Zhang, S. Y. Kung

While concatenating GAN inversion and a 3D-aware, noise-to-image GAN is a straightforward solution, it is inefficient and may lead to a noticeable drop in editing quality.

Text-to-Image Generation via Implicit Visual Guidance and Hypernetwork

no code implementations 17 Aug 2022 Xin Yuan, Zhe Lin, Jason Kuen, Jianming Zhang, John Collomosse

We develop an approach for text-to-image generation that embraces additional retrieval images, driven by a combination of implicit visual guidance loss and generative objectives.

Diversity Retrieval +1

HyperNST: Hyper-Networks for Neural Style Transfer

no code implementations 9 Aug 2022 Dan Ruta, Andrew Gilbert, Saeid Motiian, Baldo Faieta, Zhe Lin, John Collomosse

We present HyperNST, a neural style transfer (NST) technique for the artistic stylization of images, based on Hyper-networks and the StyleGAN2 architecture.

Style Transfer

Inpainting at Modern Camera Resolution by Guided PatchMatch with Auto-Curation

no code implementations 6 Aug 2022 Lingzhi Zhang, Connelly Barnes, Kevin Wampler, Sohrab Amirghodsi, Eli Shechtman, Zhe Lin, Jianbo Shi

Recently, deep models have established SOTA performance for low-resolution image inpainting, but they lack fidelity at resolutions associated with modern cameras such as 4K or more, and for large holes.

4k Image Inpainting

Perceptual Artifacts Localization for Inpainting

1 code implementation 5 Aug 2022 Lingzhi Zhang, Yuqian Zhou, Connelly Barnes, Sohrab Amirghodsi, Zhe Lin, Eli Shechtman, Jianbo Shi

Inspired by this workflow, we propose a new learning task of automatic segmentation of inpainting perceptual artifacts, and apply the model for inpainting model evaluation and iterative refinement.

Image Inpainting

Controllable Shadow Generation Using Pixel Height Maps

no code implementations 12 Jul 2022 Yichen Sheng, Yifan Liu, Jianming Zhang, Wei Yin, A. Cengiz Oztireli, He Zhang, Zhe Lin, Eli Shechtman, Bedrich Benes

It can be used to calculate hard shadows in a 2D image based on the projective geometry, providing precise control of the shadows' direction and shape.

Shape-guided Object Inpainting

no code implementations 16 Apr 2022 Yu Zeng, Zhe Lin, Vishal M. Patel

Therefore, we propose a new data preparation method and a novel Contextual Object Generator (CogNet) for the object inpainting task.

Image Inpainting Object

GALA: Toward Geometry-and-Lighting-Aware Object Search for Compositing

no code implementations 31 Mar 2022 Sijie Zhu, Zhe Lin, Scott Cohen, Jason Kuen, Zhifei Zhang, Chen Chen

To move a step further, this paper proposes GALA (Geometry-and-Lighting-Aware), a generic foreground object search method with discriminative modeling on geometry and lighting compatibility for open-world image compositing.


CM-GAN: Image Inpainting with Cascaded Modulation GAN and Object-Aware Training

1 code implementation 22 Mar 2022 Haitian Zheng, Zhe Lin, Jingwan Lu, Scott Cohen, Eli Shechtman, Connelly Barnes, Jianming Zhang, Ning Xu, Sohrab Amirghodsi, Jiebo Luo

We propose cascaded modulation GAN (CM-GAN), a new network design consisting of an encoder with Fourier convolution blocks that extract multi-scale feature representations from the input image with holes and a dual-stream decoder with a novel cascaded global-spatial modulation block at each scale level.

Decoder Image Inpainting

CoGS: Controllable Generation and Search from Sketch and Style

1 code implementation 17 Mar 2022 Cusuh Ham, Gemma Canet Tarres, Tu Bui, James Hays, Zhe Lin, John Collomosse

CoGS enables exploration of diverse appearance possibilities for a given sketched object, enabling decoupled control over the structure and the appearance of the output.

Decoder Object

Interactive Portrait Harmonization

no code implementations 15 Mar 2022 Jeya Maria Jose Valanarasu, He Zhang, Jianming Zhang, Yilin Wang, Zhe Lin, Jose Echevarria, Yinglan Ma, Zijun Wei, Kalyan Sunkavalli, Vishal M. Patel

To enable flexible interaction between user and harmonization, we introduce interactive harmonization, a new setting where the harmonization is performed with respect to a selected region in the reference image instead of the entire background.

Image Harmonization

StyleBabel: Artistic Style Tagging and Captioning

no code implementations 10 Mar 2022 Dan Ruta, Andrew Gilbert, Pranav Aggarwal, Naveen Marri, Ajinkya Kale, Jo Briggs, Chris Speed, Hailin Jin, Baldo Faieta, Alex Filipkowski, Zhe Lin, John Collomosse

We present StyleBabel, a unique open access dataset of natural language captions and free-form tags describing the artistic style of over 135K digital artworks, collected via a novel participatory method from experts studying at specialist art and design schools.

Attribute Representation Learning +2

PowerGear: Early-Stage Power Estimation in FPGA HLS via Heterogeneous Edge-Centric GNNs

1 code implementation 25 Jan 2022 Zhe Lin, Zike Yuan, Jieru Zhao, Wei Zhang, Hui Wang, Yonghong Tian

Specifically, in the graph construction flow, we introduce buffer insertion, datapath merging, graph trimming and feature annotation techniques to transform HLS designs into graph-structured data, which encode both intra-operation micro-architectures and inter-operation interconnects annotated with switching activities.

graph construction Graph Learning +1

Visual Information Guided Zero-Shot Paraphrase Generation

1 code implementation COLING 2022 Zhe Lin, Xiaojun Wan

Zero-shot paraphrase generation has drawn much attention as the large-scale high-quality paraphrase corpus is limited.

Diversity Image Captioning +2

Lite Vision Transformer with Enhanced Self-Attention

1 code implementation CVPR 2022 Chenglin Yang, Yilin Wang, Jianming Zhang, He Zhang, Zijun Wei, Zhe Lin, Alan Yuille

We propose Lite Vision Transformer (LVT), a novel light-weight transformer network with two enhanced self-attention mechanisms to improve model performance for mobile deployment.

Panoptic Segmentation Segmentation

CA-SSL: Class-Agnostic Semi-Supervised Learning for Detection and Segmentation

1 code implementation 9 Dec 2021 Lu Qi, Jason Kuen, Zhe Lin, Jiuxiang Gu, Fengyun Rao, Dian Li, Weidong Guo, Zhen Wen, Ming-Hsuan Yang, Jiaya Jia

To improve instance-level detection/segmentation performance, existing self-supervised and semi-supervised methods extract either task-unrelated or task-specific training signals from unlabeled data.

object-detection Object Detection +2

SketchEdit: Mask-Free Local Image Manipulation with Partial Sketches

no code implementations CVPR 2022 Yu Zeng, Zhe Lin, Vishal M. Patel

Our model can be trained in a self-supervised fashion by learning the reconstruction of an image region from the style vector and sketch.

Image Manipulation

Open-Vocabulary Instance Segmentation via Robust Cross-Modal Pseudo-Labeling

1 code implementation CVPR 2022 Dat Huynh, Jason Kuen, Zhe Lin, Jiuxiang Gu, Ehsan Elhamifar

To address this, we propose a cross-modal pseudo-labeling framework, which generates training pseudo masks by aligning word semantics in captions with visual features of object masks in images.

Instance Segmentation Semantic Segmentation

Pushing Paraphrase Away from Original Sentence: A Multi-Round Paraphrase Generation Approach

1 code implementation Findings (ACL) 2021 Zhe Lin, Xiaojun Wan

Both automatic and human evaluation show BTmPG can improve the diversity of paraphrase while preserving the semantics of the original sentence.

Diversity Paraphrase Generation +2

SSH: A Self-Supervised Framework for Image Harmonization

1 code implementation ICCV 2021 Yifan Jiang, He Zhang, Jianming Zhang, Yilin Wang, Zhe Lin, Kalyan Sunkavalli, Simon Chen, Sohrab Amirghodsi, Sarah Kong, Zhangyang Wang

Image harmonization aims to improve the quality of image compositing by matching the "appearance" (e.g., color tone, brightness and contrast) between foreground and background images.

Benchmarking Data Augmentation +1

Open-World Entity Segmentation

2 code implementations 29 Jul 2021 Lu Qi, Jason Kuen, Yi Wang, Jiuxiang Gu, Hengshuang Zhao, Zhe Lin, Philip Torr, Jiaya Jia

By removing the need for class label prediction, models trained for this task can focus more on improving segmentation quality.

Image Manipulation Image Segmentation +2

Learning to Predict Visual Attributes in the Wild

no code implementations CVPR 2021 Khoi Pham, Kushal Kafle, Zhe Lin, Zhihong Ding, Scott Cohen, Quan Tran, Abhinav Shrivastava

In this paper, we introduce a large-scale in-the-wild visual attribute prediction dataset consisting of over 927K attribute annotations for over 260K object instances.

Attribute Contrastive Learning +2

Multimodal Contrastive Training for Visual Representation Learning

no code implementations CVPR 2021 Xin Yuan, Zhe Lin, Jason Kuen, Jianming Zhang, Yilin Wang, Michael Maire, Ajinkya Kale, Baldo Faieta

We first train our model on COCO and evaluate the learned visual representations on various downstream tasks including image classification, object detection, and instance segmentation.

Cross-Modal Retrieval Image Classification +6

Content-Aware GAN Compression

1 code implementation CVPR 2021 Yuchen Liu, Zhixin Shu, Yijun Li, Zhe Lin, Federico Perazzi, S. Y. Kung

We then propose a novel content-aware method to guide the processes of both pruning and distillation.

Image Generation Image Manipulation +1

Going Deeper Into Face Detection: A Survey

no code implementations 27 Mar 2021 Shervin Minaee, Ping Luo, Zhe Lin, Kevin Bowyer

In this work, we provide a detailed overview of some of the most representative deep learning based face detection methods by grouping them into a few major categories, and present their core architectural designs and accuracies on popular benchmarks.

Face Detection Image Classification

Language-Guided Global Image Editing via Cross-Modal Cyclic Mechanism

no code implementations ICCV 2021 Wentao Jiang, Ning Xu, Jiayun Wang, Chen Gao, Jing Shi, Zhe Lin, Si Liu

Given the cycle, we propose several free augmentation strategies to help our model understand various editing requests given the imbalanced dataset.

CR-Fill: Generative Image Inpainting With Auxiliary Contextual Reconstruction

1 code implementation ICCV 2021 Yu Zeng, Zhe Lin, Huchuan Lu, Vishal M. Patel

The auxiliary branch (i.e., CR loss) is required only during training, and only the inpainting generator is required during inference.

Image Inpainting

Face Image Retrieval With Attribute Manipulation

no code implementations ICCV 2021 Alireza Zaeemzadeh, Shabnam Ghadar, Baldo Faieta, Zhe Lin, Nazanin Rahnavard, Mubarak Shah, Ratheesh Kalarot

For example, a user can ask for retrieving images similar to a query image, but with a different hair color, and no preference for absence/presence of eyeglasses in the results.

Attribute Face Image Retrieval +1

Semantic Layout Manipulation with High-Resolution Sparse Attention

1 code implementation 14 Dec 2020 Haitian Zheng, Zhe Lin, Jingwan Lu, Scott Cohen, Jianming Zhang, Ning Xu, Jiebo Luo

A core problem of this task is how to transfer visual details from the input images to the new semantic layout while making the resulting image visually realistic.

Decoder Vocal Bursts Intensity Prediction

Meticulous Object Segmentation

1 code implementation 13 Dec 2020 Chenglin Yang, Yilin Wang, Jianming Zhang, He Zhang, Zhe Lin, Alan Yuille

To evaluate segmentation quality near object boundaries, we propose the Meticulosity Quality (MQ) score considering both the mask coverage and boundary precision.

2k 4k +5

Mask Guided Matting via Progressive Refinement Network

1 code implementation CVPR 2021 Qihang Yu, Jianming Zhang, He Zhang, Yilin Wang, Zhe Lin, Ning Xu, Yutong Bai, Alan Yuille

We propose Mask Guided (MG) Matting, a robust matting framework that takes a general coarse mask as guidance.

Image Matting

Hard-ODT: Hardware-Friendly Online Decision Tree Learning Algorithm and System

no code implementations 11 Dec 2020 Zhe Lin, Sharad Sinha, Wei Zhang

Following this, we present Hard-ODT, a high-performance, hardware-efficient and scalable online decision tree learning system on a field-programmable gate array (FPGA) with system-level optimization techniques.

On the Helpfulness of Document Context to Sentence Simplification

1 code implementation COLING 2020 Renliang Sun, Zhe Lin, Xiaojun Wan

Our model uses neural networks to learn the different effects of the preceding sentences and the following sentences on the current sentence and applies them to the improved transformer model.

Sentence Text Simplification

CR-Fill: Generative Image Inpainting with Auxiliary Contextual Reconstruction

1 code implementation 25 Nov 2020 Yu Zeng, Zhe Lin, Huchuan Lu, Vishal M. Patel

Due to the lack of supervision signals for the correspondence between missing regions and known regions, it may fail to find proper reference features, which often leads to artifacts in the results.

Image Inpainting

Deep Image Compositing

no code implementations 4 Nov 2020 He Zhang, Jianming Zhang, Federico Perazzi, Zhe Lin, Vishal M. Patel

In this paper, we propose a new method which can automatically generate high-quality image compositing without any user input.

Image Matting

An Ensemble Learning Approach for In-situ Monitoring of FPGA Dynamic Power

no code implementations 3 Sep 2020 Zhe Lin, Sharad Sinha, Wei Zhang

As field-programmable gate arrays become prevalent in critical application domains, their power consumption is of high concern.

Ensemble Learning Management

Towards Efficient and Scalable Acceleration of Online Decision Tree Learning on FPGA

no code implementations 3 Sep 2020 Zhe Lin, Sharad Sinha, Wei Zhang

We further present a high-performance, hardware-efficient and scalable online decision tree learning system on a field-programmable gate array (FPGA) with system-level optimization techniques.

Decision Tree Based Hardware Power Monitoring for Run Time Dynamic Power Management in FPGA

no code implementations 3 Sep 2020 Zhe Lin, Wei Zhang, Sharad Sinha

A flexible architecture of the hardware power monitoring is proposed, which can be instrumented in any RTL design for runtime power estimation, dispensing with the need for extra power measurement devices.


Open-Edit: Open-Domain Image Manipulation with Open-Vocabulary Instructions

1 code implementation ECCV 2020 Xihui Liu, Zhe Lin, Jianming Zhang, Handong Zhao, Quan Tran, Xiaogang Wang, Hongsheng Li

We propose a novel algorithm, named Open-Edit, which is the first attempt at open-domain image manipulation with open-vocabulary instructions.

Decoder Image Manipulation

PhraseCut: Language-based Image Segmentation in the Wild

1 code implementation CVPR 2020 Chenyun Wu, Zhe Lin, Scott Cohen, Trung Bui, Subhransu Maji

We consider the problem of segmenting image regions given a natural language phrase, and study it on a novel dataset of 77,262 images and 345,486 phrase-region pairs.

Attribute Diversity +3

Shape Adaptor: A Learnable Resizing Module

1 code implementation ECCV 2020 Shikun Liu, Zhe Lin, Yilin Wang, Jianming Zhang, Federico Perazzi, Edward Johns

We present a novel resizing module for neural networks: shape adaptor, a drop-in enhancement built on top of traditional resizing layers, such as pooling, bilinear sampling, and strided convolution.

Image Classification Neural Architecture Search +1

Incorporating Reinforced Adversarial Learning in Autoregressive Image Generation

no code implementations ECCV 2020 Kenan E. Ak, Ning Xu, Zhe Lin, Yilin Wang

To the best of our knowledge, the proposed method is the first to enable adversarial learning in autoregressive models for image generation.

Image Generation

Real-time Semantic Segmentation with Fast Attention

1 code implementation 7 Jul 2020 Ping Hu, Federico Perazzi, Fabian Caba Heilbron, Oliver Wang, Zhe Lin, Kate Saenko, Stan Sclaroff

The proposed architecture relies on our fast spatial attention, which is a simple yet efficient modification of the popular self-attention mechanism and captures the same rich spatial context at a small fraction of the computational cost, by changing the order of operations.

Real-Time Semantic Segmentation Segmentation

Context-Aware Group Captioning via Self-Attention and Contrastive Features

no code implementations CVPR 2020 Zhuowan Li, Quan Tran, Long Mai, Zhe Lin, Alan Yuille

In this paper, we introduce a new task, context-aware group captioning, which aims to describe a group of target images in the context of another group of related reference images.

Image Captioning

Scaling Object Detection by Transferring Classification Weights

1 code implementation ICCV 2019 Jason Kuen, Federico Perazzi, Zhe Lin, Jianming Zhang, Yap-Peng Tan

Large-scale object detection datasets are constantly growing in both the number of classes and the number of annotations.

Classification General Classification +3

Towards High-Resolution Salient Object Detection

1 code implementation ICCV 2019 Yi Zeng, Pingping Zhang, Jianming Zhang, Zhe Lin, Huchuan Lu

This paper pushes forward high-resolution saliency detection, and contributes a new dataset, named High-Resolution Salient Object Detection (HRSOD).

Ranked #13 on RGB Salient Object Detection on DAVIS-S (using extra training data)

Object object-detection +4

Expressing Visual Relationships via Language

1 code implementation ACL 2019 Hao Tan, Franck Dernoncourt, Zhe Lin, Trung Bui, Mohit Bansal

To push forward the research in this direction, we first introduce a new language-guided image editing dataset that contains a large number of real image pairs with corresponding editing instructions.

Decoder Image Captioning +1

Multitask Text-to-Visual Embedding with Titles and Clickthrough Data

no code implementations 30 May 2019 Pranav Aggarwal, Zhe Lin, Baldo Faieta, Saeid Motiian

In this paper, we propose a new method for learning text-visual embedding using both image titles and click-through data from an image search engine.

Image Retrieval Retrieval

Multimodal Style Transfer via Graph Cuts

2 code implementations ICCV 2019 Yulun Zhang, Chen Fang, Yilin Wang, Zhaowen Wang, Zhe Lin, Yun Fu, Jimei Yang

An assumption widely used in recent neural style transfer methods is that image styles can be described by global statistics of deep features like Gram or covariance matrices.

Style Transfer

Scene Graph Generation with External Knowledge and Image Reconstruction

no code implementations CVPR 2019 Jiuxiang Gu, Handong Zhao, Zhe Lin, Sheng Li, Jianfei Cai, Mingyang Ling

Scene graph generation has received growing attention with the advancements in image understanding tasks such as object detection, attributes and relationship prediction, etc.

Graph Generation Image Reconstruction +6

Image Super-Resolution by Neural Texture Transfer

2 code implementations CVPR 2019 Zhifei Zhang, Zhaowen Wang, Zhe Lin, Hairong Qi

Reference-based super-resolution (RefSR), on the other hand, has proven to be promising in recovering high-resolution (HR) details when a reference (Ref) image with content similar to that of the LR input is given.

Image Stylization Image Super-Resolution +1

Foreground-aware Image Inpainting

no code implementations CVPR 2019 Wei Xiong, Jiahui Yu, Zhe Lin, Jimei Yang, Xin Lu, Connelly Barnes, Jiebo Luo

We show that by such disentanglement, the contour completion model predicts reasonable contours of objects, and further substantially improves the performance of image inpainting.

Disentanglement Image Inpainting

Photo-Sketching: Inferring Contour Drawings from Images

3 code implementations 2 Jan 2019 Mengtian Li, Zhe Lin, Radomir Mech, Ersin Yumer, Deva Ramanan

Edges, boundaries and contours are important subjects of study in both computer graphics and computer vision.

Boundary Detection Diversity

Neural Rejuvenation: Improving Deep Network Training by Enhancing Computational Resource Utilization

1 code implementation CVPR 2019 Siyuan Qiao, Zhe Lin, Jianming Zhang, Alan Yuille

By simply replacing standard optimizers with Neural Rejuvenation, we are able to improve the performance of neural networks by a very large margin while using similar training effort and maintaining their original resource usage.

Network Pruning Neural Architecture Search

Sequence-to-Segment Networks for Segment Detection

no code implementations NeurIPS 2018 Zijun Wei, Boyu Wang, Minh Hoai Nguyen, Jianming Zhang, Zhe Lin, Xiaohui Shen, Radomir Mech, Dimitris Samaras

Detecting segments of interest from an input sequence is a challenging problem which often requires not only good knowledge of individual target segments, but also contextual understanding of the entire input sequence and the relationships between the target segments.

Decoder Temporal Action Proposal Generation +1

Spatial-temporal Multi-Task Learning for Within-field Cotton Yield Prediction

no code implementations 16 Nov 2018 Long Nguyen, Jia Zhen, Zhe Lin, Hanxiang Du, Zhou Yang, Wenxuan Guo, Fang Jin

Understanding and accurately predicting within-field spatial variability of crop yield play a key role in site-specific management of crop inputs such as irrigation water and fertilizer for optimized crop production.

Crop Yield Prediction Management +1

DeepLens: Shallow Depth Of Field From A Single Image

no code implementations 18 Oct 2018 Lijun Wang, Xiaohui Shen, Jianming Zhang, Oliver Wang, Zhe Lin, Chih-Yao Hsieh, Sarah Kong, Huchuan Lu

To achieve this, we propose a novel neural network model comprised of a depth prediction module, a lens blur module, and a guided upsampling module.

Depth Estimation Depth Prediction

GAPLE: Generalizable Approaching Policy LEarning for Robotic Object Searching in Indoor Environment

no code implementations 21 Sep 2018 Xin Ye, Zhe Lin, Joon-Young Lee, Jianming Zhang, Shibin Zheng, Yezhou Yang

We study the problem of learning a generalizable action policy for an intelligent agent to actively approach an object of interest in an indoor environment solely from its visual inputs.

Semantic Segmentation Visual Navigation

Compositing-aware Image Search

no code implementations ECCV 2018 Hengshuang Zhao, Xiaohui Shen, Zhe Lin, Kalyan Sunkavalli, Brian Price, Jiaya Jia

We present a new image search technique that, given a background image, returns compatible foreground objects for image compositing tasks.

Image Retrieval Object

Learning to Blend Photos

1 code implementation ECCV 2018 Wei-Chih Hung, Jianming Zhang, Xiaohui Shen, Zhe Lin, Joon-Young Lee, Ming-Hsuan Yang

Specifically, given a foreground image and a background image, our proposed method automatically generates a set of blending photos with scores that indicate the aesthetics quality with the proposed quality network and policy network.

Concept Mask: Large-Scale Segmentation from Semantic Concepts

no code implementations ECCV 2018 Yufei Wang, Zhe Lin, Xiaohui Shen, Jianming Zhang, Scott Cohen

Then, we refine and extend the embedding network to predict an attention map, using a curated dataset with bounding box annotations on 750 concepts.

Image Segmentation Segmentation +1

Active Object Perceiver: Recognition-guided Policy Learning for Object Searching on Mobile Robots

no code implementations 30 Jul 2018 Xin Ye, Zhe Lin, Haoxiang Li, Shibin Zheng, Yezhou Yang

We study the problem of learning a navigation policy for a robot to actively search for an object of interest in an indoor environment solely from its visual inputs.

Object Object Recognition +1

Learning to Understand Image Blur

no code implementations CVPR 2018 Shanghang Zhang, Xiaohui Shen, Zhe Lin, Radomír Měch, João P. Costeira, José M. F. Moura

In this paper, we propose a unified framework to estimate a spatially-varying blur map and understand its desirability in terms of image quality at the same time.

Reference-Conditioned Super-Resolution by Neural Texture Transfer

no code implementations 10 Apr 2018 Zhifei Zhang, Zhaowen Wang, Zhe Lin, Hairong Qi

We focus on transferring the high-resolution texture from reference images to the super-resolution process without the constraint of content similarity between reference and target images, which is a key difference from previous example-based methods.

Image Stylization Image Super-Resolution +1

The AdobeIndoorNav Dataset: Towards Deep Reinforcement Learning based Real-world Indoor Robot Visual Navigation

1 code implementation 24 Feb 2018 Kaichun Mo, Haoxiang Li, Zhe Lin, Joon-Young Lee

Synthetic data suffers from domain gap to the real-world scenes while visual inputs rendered from 3D reconstructed scenes have undesired holes and artifacts.


Rethinking the Smaller-Norm-Less-Informative Assumption in Channel Pruning of Convolution Layers

3 code implementations ICLR 2018 Jianbo Ye, Xin Lu, Zhe Lin, James Z. Wang

Model pruning has become a useful technique that improves the computational efficiency of deep learning, making it possible to deploy solutions in resource-limited scenarios.

Computational Efficiency

Generative Image Inpainting with Contextual Attention

28 code implementations CVPR 2018 Jiahui Yu, Zhe Lin, Jimei Yang, Xiaohui Shen, Xin Lu, Thomas S. Huang

Motivated by these observations, we propose a new deep generative model-based approach which can not only synthesize novel image structures but also explicitly utilize surrounding image features as references during network training to make better predictions.

Image Inpainting

Contextual-based Image Inpainting: Infer, Match, and Translate

no code implementations ECCV 2018 Yuhang Song, Chao Yang, Zhe Lin, Xiaofeng Liu, Qin Huang, Hao Li, C.-C. Jay Kuo

We study the task of image inpainting, which is to fill in the missing region of an incomplete image with plausible contents.

Image Inpainting Translation

Predicting Scene Parsing and Motion Dynamics in the Future

no code implementations NeurIPS 2017 Xiaojie Jin, Huaxin Xiao, Xiaohui Shen, Jimei Yang, Zhe Lin, Yunpeng Chen, Zequn Jie, Jiashi Feng, Shuicheng Yan

The ability to predict the future is important for intelligent systems, e.g., autonomous vehicles and robots, to plan early and make decisions accordingly.

Autonomous Vehicles motion prediction +2

Scene Parsing with Global Context Embedding

1 code implementation ICCV 2017 Wei-Chih Hung, Yi-Hsuan Tsai, Xiaohui Shen, Zhe Lin, Kalyan Sunkavalli, Xin Lu, Ming-Hsuan Yang

We present a scene parsing method that utilizes global context information based on both the parametric and non-parametric models.

Scene Parsing

Personalized Image Aesthetics

no code implementations ICCV 2017 Jian Ren, Xiaohui Shen, Zhe Lin, Radomir Mech, David J. Foran

To accommodate our study, we first collect two distinct datasets: a large image dataset collected from Flickr and annotated via Amazon Mechanical Turk, and a small dataset of real personal albums rated by their owners.

Active Learning

FoveaNet: Perspective-aware Urban Scene Parsing

no code implementations ICCV 2017 Xin Li, Zequn Jie, Wei Wang, Changsong Liu, Jimei Yang, Xiaohui Shen, Zhe Lin, Qiang Chen, Shuicheng Yan, Jiashi Feng

Thus, they suffer from heterogeneous object scales caused by perspective projection of cameras on actual scenes and inevitably encounter parsing failures on distant objects as well as other boundary and recognition errors.

Scene Parsing

Recognizing and Curating Photo Albums via Event-Specific Image Importance

1 code implementation 19 Jul 2017 Yufei Wang, Zhe Lin, Xiaohui Shen, Radomir Mech, Gavin Miller, Garrison W. Cottrell

Automatic organization of personal photos is a problem with many real-world applications, and can be divided into two main tasks: recognizing the event type of the photo collection, and selecting interesting images from the collection.

Vocal Bursts Type Prediction

Spatial-Semantic Image Search by Visual Feature Synthesis

no code implementations CVPR 2017 Long Mai, Hailin Jin, Zhe Lin, Chen Fang, Jonathan Brandt, Feng Liu

We train a convolutional neural network to synthesize appropriate visual features that capture the spatial-semantic constraints from the user canvas query.

Image Retrieval Retrieval

Skeleton Key: Image Captioning by Skeleton-Attribute Decomposition

no code implementations CVPR 2017 Yufei Wang, Zhe Lin, Xiaohui Shen, Scott Cohen, Garrison W. Cottrell

Furthermore, our algorithm can generate descriptions with varied length, benefiting from the separate control of the skeleton and attributes.

Attribute Image Captioning +2

Recurrent Multimodal Interaction for Referring Image Segmentation

1 code implementation ICCV 2017 Chenxi Liu, Zhe Lin, Xiaohui Shen, Jimei Yang, Xin Lu, Alan Yuille

In this paper we are interested in the problem of image segmentation given natural language descriptions, i.e., referring expressions.

Image Segmentation Segmentation +1

Learning to Detect Multiple Photographic Defects

1 code implementation 6 Dec 2016 Ning Yu, Xiaohui Shen, Zhe Lin, Radomir Mech, Connelly Barnes

Our new dataset enables us to formulate the problem as a multi-task learning problem and train a multi-column deep convolutional neural network (CNN) to simultaneously predict the severity of all the defects.

Defect Detection Multi-Task Learning

Video Scene Parsing with Predictive Feature Learning

no code implementations ICCV 2017 Xiaojie Jin, Xin Li, Huaxin Xiao, Xiaohui Shen, Zhe Lin, Jimei Yang, Yunpeng Chen, Jian Dong, Luoqi Liu, Zequn Jie, Jiashi Feng, Shuicheng Yan

In this way, the network can effectively learn to capture video dynamics and temporal context, which are critical clues for video scene parsing, without requiring extra manual annotations.

Representation Learning Scene Parsing

High-Resolution Image Inpainting using Multi-Scale Neural Patch Synthesis

1 code implementation CVPR 2017 Chao Yang, Xin Lu, Zhe Lin, Eli Shechtman, Oliver Wang, Hao Li

Recent advances in deep learning have shown exciting promise in filling large holes in natural images with semantically plausible and context aware details, impacting fundamental image manipulation tasks such as object removal.

Image Inpainting Image Manipulation +1

Proposing Plausible Answers for Open-ended Visual Question Answering

no code implementations 20 Oct 2016 Omid Bakhshandeh, Trung Bui, Zhe Lin, Walter Chang

One of the most interesting recent open-ended question answering challenges is Visual Question Answering (VQA), which attempts to evaluate a system's visual understanding through its answers to natural language questions about images.

Graph Matching Open-Ended Question Answering +1

Top-down Neural Attention by Excitation Backprop

3 code implementations 1 Aug 2016 Jianming Zhang, Zhe Lin, Jonathan Brandt, Xiaohui Shen, Stan Sclaroff

We aim to model the top-down attention of a Convolutional Neural Network (CNN) classifier for generating task-specific attention maps.

Salient Object Subitizing

no code implementations CVPR 2015 Jianming Zhang, Shugao Ma, Mehrnoosh Sameki, Stan Sclaroff, Margrit Betke, Zhe Lin, Xiaohui Shen, Brian Price, Radomir Mech

We study the problem of Salient Object Subitizing, i.e., predicting the existence and the number of salient objects in an image using holistic cues.

Image Retrieval Object +4

Progressive Attention Networks for Visual Attribute Prediction

1 code implementation 8 Jun 2016 Paul Hongsuck Seo, Zhe Lin, Scott Cohen, Xiaohui Shen, Bohyung Han

We propose a novel attention model that can accurately attend to target objects of various scales and shapes in images.

Attribute Hard Attention

Photo Aesthetics Ranking Network with Attributes and Content Adaptation

2 code implementations 6 Jun 2016 Shu Kong, Xiaohui Shen, Zhe Lin, Radomir Mech, Charless Fowlkes

In this work, we propose to learn a deep convolutional neural network to rank photo aesthetics in which the relative ranking of photo aesthetics is directly modeled in the loss function.

Aesthetics Quality Assessment

Shortlist Selection With Residual-Aware Distance Estimator for K-Nearest Neighbor Search

no code implementations CVPR 2016 Jae-Pil Heo, Zhe Lin, Xiaohui Shen, Jonathan Brandt, Sung-Eui Yoon

We have tested the proposed method with the inverted index and multi-index on a diverse set of benchmarks including up to one billion data points with varying dimensions, and found that our method robustly improves the accuracy of shortlists (up to 127% relatively higher) over the state-of-the-art techniques with a comparable or even faster computational cost.


Event-Specific Image Importance

no code implementations CVPR 2016 Yufei Wang, Zhe Lin, Xiaohui Shen, Radomir Mech, Gavin Miller, Garrison W. Cottrell

In this paper, we show that the selection of important images is consistent among different viewers, and that this selection process is related to the event type of the album.

A Multi-Level Contextual Model For Person Recognition in Photo Albums

no code implementations CVPR 2016 Haoxiang Li, Jonathan Brandt, Zhe Lin, Xiaohui Shen, Gang Hua

Our new framework enables efficient use of these complementary multi-level contextual cues to improve overall recognition rates on the photo album person recognition task, as demonstrated through state-of-the-art results on a challenging public dataset.

Person Recognition

Multi-Instance Visual-Semantic Embedding

no code implementations 22 Dec 2015 Zhou Ren, Hailin Jin, Zhe Lin, Chen Fang, Alan Yuille

Visual-semantic embedding models have been recently proposed and shown to be effective for image classification and zero-shot learning, by mapping images into a continuous semantic label space.

General Classification Image Classification +1

Minimum Barrier Salient Object Detection at 80 FPS

no code implementations ICCV 2015 Jianming Zhang, Stan Sclaroff, Zhe Lin, Xiaohui Shen, Brian Price, Radomir Mech

Powered by this fast MBD transform algorithm, the proposed salient object detection method runs at 80 FPS, and significantly outperforms previous methods with similar speed on four large benchmark datasets, and achieves comparable or better performance than state-of-the-art methods.

Ranked #6 on Video Salient Object Detection on VOS-T (using extra training data)

Object object-detection +2

Automatic Content-Aware Color and Tone Stylization

no code implementations CVPR 2016 Joon-Young Lee, Kalyan Sunkavalli, Zhe Lin, Xiaohui Shen, In So Kweon

We introduce a new technique that automatically generates diverse, visually compelling stylizations for a photograph in an unsupervised manner.

Style Transfer

LCNN: Low-level Feature Embedded CNN for Salient Object Detection

no code implementations 17 Aug 2015 Hongyang Li, Huchuan Lu, Zhe Lin, Xiaohui Shen, Brian Price

In this paper, we propose a novel deep neural network framework embedded with low-level features (LCNN) for salient object detection in complex images.

object-detection RGB Salient Object Detection +1

A Convolutional Neural Network Cascade for Face Detection

no code implementations CVPR 2015 Haoxiang Li, Zhe Lin, Xiaohui Shen, Jonathan Brandt, Gang Hua

To improve localization effectiveness, and reduce the number of candidates at later stages, we introduce a CNN-based calibration stage after each of the detection stages in the cascade.

Face Detection

Towards Unified Depth and Semantic Prediction From a Single Image

no code implementations CVPR 2015 Peng Wang, Xiaohui Shen, Zhe Lin, Scott Cohen, Brian Price, Alan L. Yuille

By allowing for interactions between the depth and semantic information, the joint network provides more accurate depth prediction than a state-of-the-art CNN trained solely for depth prediction [5].

Depth Estimation Depth Prediction +1

PatchCut: Data-Driven Object Segmentation via Local Shape Transfer

no code implementations CVPR 2015 Jimei Yang, Brian Price, Scott Cohen, Zhe Lin, Ming-Hsuan Yang

The transferred local shape masks constitute a patch-level segmentation solution space and we thus develop a novel cascade algorithm, PatchCut, for coarse-to-fine object segmentation.

Object Object Discovery +2

Inner and Inter Label Propagation: Salient Object Detection in the Wild

2 code implementations 27 May 2015 Hongyang Li, Huchuan Lu, Zhe Lin, Xiaohui Shen, Brian Price

For most natural images, some boundary superpixels serve as the background labels and the saliency of other superpixels is determined by ranking their similarities to the boundary labels based on an inner propagation scheme.

Computational Efficiency object-detection +4

Joint Object and Part Segmentation using Deep Learned Potentials

no code implementations ICCV 2015 Peng Wang, Xiaohui Shen, Zhe Lin, Scott Cohen, Brian Price, Alan Yuille

Segmenting semantic objects from images and parsing them into their respective semantic parts are fundamental steps towards detailed object understanding in computer vision.

Object Segmentation +1

Collaborative Feature Learning from Social Media

no code implementations CVPR 2015 Chen Fang, Hailin Jin, Jianchao Yang, Zhe Lin

We validate our feature learning paradigm on this dataset and find that the learned feature significantly outperforms the state-of-the-art image features in learning better image similarities.

Distance Encoded Product Quantization

no code implementations CVPR 2014 Jae-Pil Heo, Zhe Lin, Sung-Eui Yoon

This result is achieved mainly because our method accurately estimates distances between two data points with the new binary codes and distance metric.


Nonparametric Context Modeling of Local Appearance for Pose- and Expression-Robust Facial Landmark Localization

no code implementations CVPR 2014 Brandon M. Smith, Jonathan Brandt, Zhe Lin, Li Zhang

We propose a data-driven approach to facial landmark localization that models the correlations between each landmark and its surrounding appearance features.

Face Alignment

Efficient Boosted Exemplar-based Face Detection

no code implementations CVPR 2014 Haoxiang Li, Zhe Lin, Jonathan Brandt, Xiaohui Shen, Gang Hua

Despite the fact that face detection has been studied intensively over the past several decades, the problem is still not completely solved.

Face Detection

Scalable Similarity Learning using Large Margin Neighborhood Embedding

no code implementations 24 Apr 2014 Zhaowen Wang, Jianchao Yang, Zhe Lin, Jonathan Brandt, Shiyu Chang, Thomas Huang

In this paper, we present an image similarity learning method that can scale well in both the number of images and the dimensionality of image descriptors.

Metric Learning

GPU Asynchronous Stochastic Gradient Descent to Speed Up Neural Network Training

no code implementations 21 Dec 2013 Thomas Paine, Hailin Jin, Jianchao Yang, Zhe Lin, Thomas Huang

The ability to train large-scale neural networks has resulted in state-of-the-art performance in many areas of computer vision.

Probabilistic Elastic Matching for Pose Variant Face Verification

no code implementations CVPR 2013 Haoxiang Li, Gang Hua, Zhe Lin, Jonathan Brandt, Jianchao Yang

By augmenting each feature with its location, a Gaussian mixture model (GMM) is trained to capture the spatial-appearance distribution of all face images in the training corpus.

Face Recognition Face Verification

Detecting and Aligning Faces by Image Retrieval

no code implementations CVPR 2013 Xiaohui Shen, Zhe Lin, Jonathan Brandt, Ying Wu

In order to overcome these challenges, we present a novel and robust exemplar-based face detector that integrates image retrieval and discriminative learning.

Attribute Face Alignment +5

Exemplar-Based Face Parsing

no code implementations CVPR 2013 Brandon M. Smith, Li Zhang, Jonathan Brandt, Zhe Lin, Jianchao Yang

Given a test image, our algorithm first selects a subset of exemplar images from the database. Our algorithm then computes a nonrigid warp for each exemplar image to align it with the test image.

Face Alignment Face Parsing +3