Search Results for author: Humphrey Shi

Found 72 papers, 51 papers with code

HiFi Tuner: High-Fidelity Subject-Driven Fine-Tuning for Diffusion Models

no code implementations30 Nov 2023 Zhonghao Wang, Wei Wei, Yang Zhao, Zhisheng Xiao, Mark Hasegawa-Johnson, Humphrey Shi, Tingbo Hou

We further extend our method to a novel image editing task: substituting the subject in an image through textual manipulations.

Video Instance Matting

1 code implementation7 Nov 2023 Jiachen Li, Roberto Henschel, Vidit Goel, Marianna Ohanyan, Shant Navasardyan, Humphrey Shi

To remedy this deficiency, we propose Video Instance Matting~(VIM), that is, estimating alpha mattes of each instance at each frame of a video sequence.

Binarization Image Matting +4

Multi-Concept T2I-Zero: Tweaking Only The Text Embeddings and Nothing Else

no code implementations11 Oct 2023 Hazarapet Tunanyan, Dejia Xu, Shant Navasardyan, Zhangyang Wang, Humphrey Shi

To achieve this goal, we identify the limitations in the text embeddings used for the pre-trained text-to-image diffusion models.

Image Manipulation

Learning Mask-aware CLIP Representations for Zero-Shot Segmentation

1 code implementation NeurIPS 2023 Siyu Jiao, Yunchao Wei, YaoWei Wang, Yao Zhao, Humphrey Shi

However, in the paper, we reveal that CLIP is insensitive to different mask proposals and tends to produce similar predictions for various mask proposals of the same image.

Zero Shot Segmentation

Interactive Neural Painting

no code implementations31 Jul 2023 Elia Peruzzo, Willi Menapace, Vidit Goel, Federica Arrigoni, Hao Tang, Xingqian Xu, Arman Chopikyan, Nikita Orlov, Yuxiao Hu, Humphrey Shi, Nicu Sebe, Elisa Ricci

This paper advances the state of the art in this emerging research domain by proposing the first approach for Interactive NP.

Reference-based Painterly Inpainting via Diffusion: Crossing the Wild Reference Domain Gap

no code implementations20 Jul 2023 Dejia Xu, Xingqian Xu, Wenyan Cong, Humphrey Shi, Zhangyang Wang

We propose Reference-based Painterly Inpainting, a novel task that crosses the wild reference domain gap and implants novel objects into artworks.

Image Inpainting

Matting Anything

1 code implementation8 Jun 2023 Jiachen Li, Jitesh Jain, Humphrey Shi

In this paper, we propose the Matting Anything Model (MAM), an efficient and versatile framework for estimating the alpha matte of any instance in an image with flexible and interactive visual or linguistic user prompt guidance.

Image Matting Referring Image Matting

Prompt-Free Diffusion: Taking "Text" out of Text-to-Image Diffusion Models

1 code implementation25 May 2023 Xingqian Xu, Jiayi Guo, Zhangyang Wang, Gao Huang, Irfan Essa, Humphrey Shi

Text-to-image (T2I) research has grown explosively in the past year, owing to the large-scale pre-trained diffusion models and many emerging personalization and editing approaches.

Conditional Text-to-Image Synthesis Image Generation +3

Zero-shot Generative Model Adaptation via Image-specific Prompt Learning

1 code implementation CVPR 2023 Jiayi Guo, Chaofei Wang, You Wu, Eric Zhang, Kai Wang, Xingqian Xu, Shiji Song, Humphrey Shi, Gao Huang

Recently, CLIP-guided image synthesis has shown appealing performance on adapting a pre-trained source-domain generator to an unseen target domain.

Image Generation

Forget-Me-Not: Learning to Forget in Text-to-Image Diffusion Models

1 code implementation30 Mar 2023 Eric Zhang, Kai Wang, Xingqian Xu, Zhangyang Wang, Humphrey Shi

The unlearning problem of deep learning models, once primarily an academic concern, has become a prevalent issue in the industry.

Disentanglement Memorization

PAIR-Diffusion: A Comprehensive Multimodal Object-Level Image Editor

1 code implementation30 Mar 2023 Vidit Goel, Elia Peruzzo, Yifan Jiang, Dejia Xu, Xingqian Xu, Nicu Sebe, Trevor Darrell, Zhangyang Wang, Humphrey Shi

We propose \textbf{PAIR} Diffusion, a generic framework that can enable a diffusion model to control the structure and appearance properties of each object in the image.

Graph Transformer GANs for Graph-Constrained House Generation

no code implementations CVPR 2023 Hao Tang, Zhenyu Zhang, Humphrey Shi, Bo Li, Ling Shao, Nicu Sebe, Radu Timofte, Luc van Gool

We present a novel graph Transformer generative adversarial network (GTGAN) to learn effective graph node relations in an end-to-end fashion for the challenging graph-constrained house generation task.

House Generation Node Classification

MI-GAN: A Simple Baseline for Image Inpainting on Mobile Devices

1 code implementation ICCV 2023 Andranik Sargsyan, Shant Navasardyan, Xingqian Xu, Humphrey Shi

In this paper we present a simple image inpainting baseline, Mobile Inpainting GAN (MI-GAN), which is approximately one order of magnitude computationally cheaper and smaller than existing state-of-the-art inpainting models, and can be efficiently deployed on mobile devices.

Efficient Neural Network Image Inpainting +1

Specialist Diffusion: Plug-and-Play Sample-Efficient Fine-Tuning of Text-to-Image Diffusion Models To Learn Any Unseen Style

no code implementations CVPR 2023 Haoming Lu, Hazarapet Tunanyan, Kai Wang, Shant Navasardyan, Zhangyang Wang, Humphrey Shi

Diffusion models have demonstrated impressive capability of text-conditioned image synthesis, and broader application horizons are emerging by personalizing those pretrained diffusion models toward generating some specialized target object or style.

Disentanglement Image Generation

Mask Matching Transformer for Few-Shot Segmentation

1 code implementation5 Dec 2022 Siyu Jiao, Gengwei Zhang, Shant Navasardyan, Ling Chen, Yao Zhao, Yunchao Wei, Humphrey Shi

Typical methods follow the paradigm to firstly learn prototypical features from support images and then match query features in pixel-level to obtain segmentation results.

Few-Shot Semantic Segmentation Segmentation

Boosted Dynamic Neural Networks

1 code implementation30 Nov 2022 Haichao Yu, Haoxiang Li, Gang Hua, Gao Huang, Humphrey Shi

To optimize the model, these prediction heads together with the network backbone are trained on every batch of training data.

Versatile Diffusion: Text, Images and Variations All in One Diffusion Model

3 code implementations ICCV 2023 Xingqian Xu, Zhangyang Wang, Eric Zhang, Kai Wang, Humphrey Shi

In this work, we expand the existing single-flow diffusion pipeline into a multi-task multimodal network, dubbed Versatile Diffusion (VD), that handles multiple flows of text-to-image, image-to-text, and variations in one unified model.

Disentanglement Image Captioning +5

OneFormer: One Transformer to Rule Universal Image Segmentation

2 code implementations CVPR 2023 Jitesh Jain, Jiachen Li, MangTik Chiu, Ali Hassani, Nikita Orlov, Humphrey Shi

However, such panoptic architectures do not truly unify image segmentation because they need to be trained individually on the semantic, instance, or panoptic segmentation to achieve the best performance.

Instance Segmentation Panoptic Segmentation +3

StyleNAT: Giving Each Head a New Perspective

2 code implementations10 Nov 2022 Steven Walton, Ali Hassani, Xingqian Xu, Zhangyang Wang, Humphrey Shi

Image generation has been a long sought-after but challenging task, and performing the generation task in an efficient manner is similarly difficult.

Face Generation

Image Completion with Heterogeneously Filtered Spectral Hints

1 code implementation7 Nov 2022 Xingqian Xu, Shant Navasardyan, Vahram Tadevosyan, Andranik Sargsyan, Yadong Mu, Humphrey Shi

We also prove the effectiveness of our design via ablation studies, from which one may notice that the aforementioned challenges, i. e. pattern unawareness, blurry textures, and structure distortion, can be noticeably resolved.

Image Inpainting

Dilated Neighborhood Attention Transformer

4 code implementations29 Sep 2022 Ali Hassani, Humphrey Shi

These models typically employ localized attention mechanisms, such as the sliding-window Neighborhood Attention (NA) or Swin Transformer's Shifted Window Self Attention.

Image Classification Instance Segmentation +3

AdaFocusV3: On Unified Spatial-temporal Dynamic Video Recognition

no code implementations27 Sep 2022 Yulin Wang, Yang Yue, Xinhong Xu, Ali Hassani, Victor Kulikov, Nikita Orlov, Shiji Song, Humphrey Shi, Gao Huang

Recent research has revealed that reducing the temporal and spatial redundancy are both effective approaches towards efficient video recognition, e. g., allocating the majority of computation to a task-relevant subset of frames or the most valuable image regions of each frame.

Video Recognition

VMFormer: End-to-End Video Matting with Transformer

1 code implementation26 Aug 2022 Jiachen Li, Vidit Goel, Marianna Ohanyan, Shant Navasardyan, Yunchao Wei, Humphrey Shi

In this paper, we propose VMFormer: a transformer-based end-to-end method for video matting.

Video Matting

Keys to Better Image Inpainting: Structure and Texture Go Hand in Hand

1 code implementation5 Aug 2022 Jitesh Jain, Yuqian Zhou, Ning Yu, Humphrey Shi

We claim that the performance of inpainting algorithms can be better judged by the generated structures and textures.

Image Inpainting Texture Synthesis

DiSparse: Disentangled Sparsification for Multitask Model Compression

1 code implementation CVPR 2022 Xinglong Sun, Ali Hassani, Zhangyang Wang, Gao Huang, Humphrey Shi

We analyzed the pruning masks generated with DiSparse and observed strikingly similar sparse network architecture identified by each task even before the training starts.

Model Compression

Towards Layer-wise Image Vectorization

1 code implementation CVPR 2022 Xu Ma, Yuqian Zhou, Xingqian Xu, Bin Sun, Valerii Filev, Nikita Orlov, Yun Fu, Humphrey Shi

Image rasterization is a mature technique in computer graphics, while image vectorization, the reverse path of rasterization, remains a major challenge.

Grasping the Arrow of Time from the Singularity: Decoding Micromotion in Low-dimensional Latent Spaces from StyleGAN

1 code implementation27 Apr 2022 Qiucheng Wu, Yifan Jiang, Junru Wu, Kai Wang, Gong Zhang, Humphrey Shi, Zhangyang Wang, Shiyu Chang

To study the motion features in the latent space of StyleGAN, in this paper, we hypothesize and demonstrate that a series of meaningful, natural, and versatile small, local movements (referred to as "micromotion", such as expression, head movement, and aging effect) can be represented in low-rank spaces extracted from the latent space of a conventionally pre-trained StyleGAN-v2 model for face generation, with the guidance of proper "anchors" in the form of either short text or video clips.

Disentanglement Face Generation

Neighborhood Attention Transformer

5 code implementations CVPR 2023 Ali Hassani, Steven Walton, Jiachen Li, Shen Li, Humphrey Shi

We present Neighborhood Attention (NA), the first efficient and scalable sliding-window attention mechanism for vision.

Image Classification Object Detection +1

SinNeRF: Training Neural Radiance Fields on Complex Scenes from a Single Image

1 code implementation2 Apr 2022 Dejia Xu, Yifan Jiang, Peihao Wang, Zhiwen Fan, Humphrey Shi, Zhangyang Wang

Despite the rapid development of Neural Radiance Field (NeRF), the necessity of dense covers largely prohibits its wider applications.

Novel View Synthesis

Object Localization under Single Coarse Point Supervision

2 code implementations CVPR 2022 Xuehui Yu, Pengfei Chen, Di wu, Najmul Hassan, Guorong Li, Junchi Yan, Humphrey Shi, Qixiang Ye, Zhenjun Han

In this study, we propose a POL method using coarse point annotations, relaxing the supervision signals from accurate key points to freely spotted points.

Multiple Instance Learning Object Localization

AdaFocus V2: End-to-End Training of Spatial Dynamic Networks for Video Recognition

1 code implementation CVPR 2022 Yulin Wang, Yang Yue, Yuanze Lin, Haojun Jiang, Zihang Lai, Victor Kulikov, Nikita Orlov, Humphrey Shi, Gao Huang

Recent works have shown that the computational efficiency of video recognition can be significantly improved by reducing the spatial redundancy.

Video Recognition

SeMask: Semantically Masked Transformers for Semantic Segmentation

1 code implementation arXiv 2021 Jitesh Jain, Anukriti Singh, Nikita Orlov, Zilong Huang, Jiachen Li, Steven Walton, Humphrey Shi

To achieve this, we propose SeMask, a simple and effective framework that incorporates semantic information into the encoder with the help of a semantic attention operation.

Semantic Segmentation

More Control for Free! Image Synthesis with Semantic Diffusion Guidance

no code implementations10 Dec 2021 Xihui Liu, Dong Huk Park, Samaneh Azadi, Gong Zhang, Arman Chopikyan, Yuxiao Hu, Humphrey Shi, Anna Rohrbach, Trevor Darrell

We investigate fine-grained, continuous control of this model class, and introduce a novel unified framework for semantic diffusion guidance, which allows either language or image guidance, or both.

Continuous Control Denoising +1

Feudal Reinforcement Learning by Reading Manuals

no code implementations13 Oct 2021 Kai Wang, Zhonghao Wang, Mo Yu, Humphrey Shi

The manager agent is a multi-hop plan generator dealing with high-level abstract information and generating a series of sub-goals in a backward manner.

reinforcement-learning Reinforcement Learning (RL)

ConvMLP: Hierarchical Convolutional MLPs for Vision

4 code implementations9 Sep 2021 Jiachen Li, Ali Hassani, Steven Walton, Humphrey Shi

MLP-based architectures, which consist of a sequence of consecutive multi-layer perceptron blocks, have recently been found to reach comparable results to convolutional and transformer-based methods.

Ranked #8 on Image Classification on Flowers-102 (using extra training data)

Image Classification Instance Segmentation +3

Understanding and Accelerating Neural Architecture Search with Training-Free and Theory-Grounded Metrics

1 code implementation26 Aug 2021 Wuyang Chen, Xinyu Gong, Junru Wu, Yunchao Wei, Humphrey Shi, Zhicheng Yan, Yi Yang, Zhangyang Wang

This work targets designing a principled and unified training-free framework for Neural Architecture Search (NAS), with high performance, low cost, and in-depth interpretation.

Neural Architecture Search

MSN: Efficient Online Mask Selection Network for Video Instance Segmentation

1 code implementation19 Jun 2021 Vidit Goel, Jiachen Li, Shubhika Garg, Harsh Maheshwari, Humphrey Shi

Our method improves the masks from segmentation and propagation branches in an online manner using the Mask Selection Network (MSN) hence limiting the noise accumulation during mask tracking.

Instance Segmentation Segmentation +4

RSCA: Real-time Segmentation-based Context-Aware Scene Text Detection

no code implementations26 May 2021 Jiachen Li, Yuan Lin, Rongrong Liu, Chiu Man Ho, Humphrey Shi

Segmentation-based scene text detection methods have been widely adopted for arbitrary-shaped text detection recently, since they make accurate pixel-level predictions on curved text instances and can facilitate real-time inference without time-consuming processing on anchors.

Scene Text Detection Segmentation +1

Is In-Domain Data Really Needed? A Pilot Study on Cross-Domain Calibration for Network Quantization

no code implementations16 May 2021 Haichao Yu, Linjie Yang, Humphrey Shi

Post-training quantization methods use a set of calibration data to compute quantization ranges for network parameters and activations.


Pseudo-IoU: Improving Label Assignment in Anchor-Free Object Detection

1 code implementation29 Apr 2021 Jiachen Li, Bowen Cheng, Rogerio Feris, JinJun Xiong, Thomas S. Huang, Wen-mei Hwu, Humphrey Shi

Current anchor-free object detectors are quite simple and effective yet lack accurate label assignment methods, which limits their potential in competing with classic anchor-based models that are supported by well-designed assignment methods based on the Intersection-over-Union~(IoU) metric.

object-detection Object Detection

Escaping the Big Data Paradigm with Compact Transformers

7 code implementations12 Apr 2021 Ali Hassani, Steven Walton, Nikhil Shah, Abulikemu Abuduweili, Jiachen Li, Humphrey Shi

Our models are flexible in terms of model size, and can have as little as 0. 28M parameters while achieving competitive results.

 Ranked #1 on Image Classification on Flowers-102 (using extra training data)

Fine-Grained Image Classification Superpixel Image Classification

Learning to Track Instances without Video Annotations

no code implementations CVPR 2021 Yang Fu, Sifei Liu, Umar Iqbal, Shalini De Mello, Humphrey Shi, Jan Kautz

Tracking segmentation masks of multiple instances has been intensively studied, but still faces two fundamental challenges: 1) the requirement of large-scale, frame-wise annotation, and 2) the complexity of two-stage approaches.

Instance Segmentation Pose Estimation +1

UltraSR: Spatial Encoding is a Missing Key for Implicit Image Function-based Arbitrary-Scale Super-Resolution

1 code implementation23 Mar 2021 Xingqian Xu, Zhangyang Wang, Humphrey Shi

In this work, we propose UltraSR, a simple yet effective new network design based on implicit image functions in which we deeply integrated spatial coordinates and periodic encoding with the implicit neural representation.


Study Group Learning: Improving Retinal Vessel Segmentation Trained with Noisy Labels

1 code implementation5 Mar 2021 Yuqian Zhou, Hanchao Yu, Humphrey Shi

Retinal vessel segmentation from retinal images is an essential task for developing the computer-aided diagnosis system for retinal diseases.

Retinal Vessel Segmentation Segmentation

Adaptive Consistency Regularization for Semi-Supervised Transfer Learning

1 code implementation CVPR 2021 Abulikemu Abuduweili, Xingjian Li, Humphrey Shi, Cheng-Zhong Xu, Dejing Dou

To better exploit the value of both pre-trained weights and unlabeled target examples, we introduce adaptive consistency regularization that consists of two complementary components: Adaptive Knowledge Consistency (AKC) on the examples between the source and target model, and Adaptive Representation Consistency (ARC) on the target model between labeled and unlabeled examples.

Pseudo Label Transfer Learning

A Multi-Mode Modulator for Multi-Domain Few-Shot Classification

1 code implementation ICCV 2021 Yanbin Liu, Juho Lee, Linchao Zhu, Ling Chen, Humphrey Shi, Yi Yang

Most existing few-shot classification methods only consider generalization on one dataset (i. e., single-domain), failing to transfer across various seen and unseen domains.

Classification Domain Generalization

Rethinking Text Segmentation: A Novel Dataset and A Text-Specific Refinement Approach

1 code implementation CVPR 2021 Xingqian Xu, Zhifei Zhang, Zhaowen Wang, Brian Price, Zhonghao Wang, Humphrey Shi

We also introduce Text Refinement Network (TexRNet), a novel text segmentation approach that adapts to the unique properties of text, e. g. non-convex boundary, diverse texture, etc., which often impose burdens on traditional segmentation models.

Segmentation Style Transfer +2

Human-Object Interaction Detection:A Quick Survey and Examination of Methods

1 code implementation27 Sep 2020 Trevor Bergstrom, Humphrey Shi

In order to provide insight to future researchers, we perform an individualized study that examines the performance of each component of a multi-stream convolutional neural network architecture for human-object interaction detection.

Human-Object Interaction Detection

Deep Learning for 3D Point Cloud Understanding: A Survey

1 code implementation18 Sep 2020 Haoming Lu, Humphrey Shi

The development of practical applications, such as autonomous driving and robotics, has brought increasing attention to 3D point cloud understanding.

Autonomous Driving

The 1st Tiny Object Detection Challenge:Methods and Results

1 code implementation16 Sep 2020 Xuehui Yu, Zhenjun Han, Yuqi Gong, Nan Jiang, Jian Zhao, Qixiang Ye, Jie Chen, Yuan Feng, Bin Zhang, Xiaodi Wang, Ying Xin, Jingwei Liu, Mingyuan Mao, Sheng Xu, Baochang Zhang, Shumin Han, Cheng Gao, Wei Tang, Lizuo Jin, Mingbo Hong, Yuchao Yang, Shuiwang Li, Huan Luo, Qijun Zhao, Humphrey Shi

The 1st Tiny Object Detection (TOD) Challenge aims to encourage research in developing novel and accurate methods for tiny object detection in images which have wide views, with a current focus on tiny person detection.

Human Detection object-detection +1

Motion Pyramid Networks for Accurate and Efficient Cardiac Motion Estimation

no code implementations28 Jun 2020 Hanchao Yu, Xiao Chen, Humphrey Shi, Terrence Chen, Thomas S. Huang, Shanhui Sun

In this paper, we propose Motion Pyramid Networks, a novel deep learning-based approach for accurate and efficient cardiac motion estimation.

Knowledge Distillation Motion Estimation

Image Super-Resolution with Cross-Scale Non-Local Attention and Exhaustive Self-Exemplars Mining

3 code implementations CVPR 2020 Yiqun Mei, Yuchen Fan, Yuqian Zhou, Lichao Huang, Thomas S. Huang, Humphrey Shi

By combining the new CS-NL prior with local and in-scale non-local priors in a powerful recurrent fusion cell, we can find more cross-scale feature correlations within a single low-resolution (LR) image.

Image Super-Resolution

Deep Learning-Based Automated Image Segmentation for Concrete Petrographic Analysis

no code implementations21 May 2020 Yu Song, Zilong Huang, Chuanyue Shen, Humphrey Shi, David A Lange

The standard petrography test method for measuring air voids in concrete (ASTM C457) requires a meticulous and long examination of sample phase composition under a stereomicroscope.

Image Segmentation Segmentation +2

Pyramid Attention Networks for Image Restoration

2 code implementations28 Apr 2020 Yiqun Mei, Yuchen Fan, Yulun Zhang, Jiahui Yu, Yuqian Zhou, Ding Liu, Yun Fu, Thomas S. Huang, Humphrey Shi

Self-similarity refers to the image prior widely used in image restoration algorithms that small but similar patterns tend to occur at different locations and scales.

Demosaicking Image Denoising +1

Differential Treatment for Stuff and Things: A Simple Unsupervised Domain Adaptation Method for Semantic Segmentation

1 code implementation CVPR 2020 Zhonghao Wang, Mo Yu, Yunchao Wei, Rogerio Feris, JinJun Xiong, Wen-mei Hwu, Thomas S. Huang, Humphrey Shi

We consider the problem of unsupervised domain adaptation for semantic segmentation by easing the domain shift between the source domain (synthetic data) and the target domain (real data) in this work.

Semantic Segmentation Unsupervised Domain Adaptation

AlignSeg: Feature-Aligned Segmentation Networks

1 code implementation24 Feb 2020 Zilong Huang, Yunchao Wei, Xinggang Wang, Wenyu Liu, Thomas S. Huang, Humphrey Shi

Aggregating features in terms of different convolutional blocks or contextual embeddings has been proven to be an effective way to strengthen feature representations for semantic segmentation.

Segmentation Semantic Segmentation

CCNet: Criss-Cross Attention for Semantic Segmentation

4 code implementations ICCV 2019 Zilong Huang, Xinggang Wang, Yunchao Wei, Lichao Huang, Humphrey Shi, Wenyu Liu, Thomas S. Huang

Compared with the non-local block, the proposed recurrent criss-cross attention module requires 11x less GPU memory usage.

Ranked #7 on Semantic Segmentation on FoodSeg103 (using extra training data)

Human Parsing Instance Segmentation +8

A Simple Non-i.i.d. Sampling Approach for Efficient Training and Better Generalization

no code implementations23 Nov 2018 Bowen Cheng, Yunchao Wei, Jiahui Yu, Shiyu Chang, JinJun Xiong, Wen-mei Hwu, Thomas S. Huang, Humphrey Shi

While training on samples drawn from independent and identical distribution has been a de facto paradigm for optimizing image classification networks, humans learn new concepts in an easy-to-hard manner and on the selected examples progressively.

General Classification Image Classification +6

Cannot find the paper you are looking for? You can Submit a new open access paper.