1 code implementation • 29 Aug 2024 • Moreno D'Incà, Elia Peruzzo, Massimiliano Mancini, Xingqian Xu, Humphrey Shi, Nicu Sebe
OpenBias detects and quantifies biases, while GradBias determines the contribution of individual prompt words to biases.
1 code implementation • 28 Aug 2024 • Min Shi, Fuxiao Liu, Shihao Wang, Shijia Liao, Subhashree Radhakrishnan, De-An Huang, Hongxu Yin, Karan Sapra, Yaser Yacoob, Humphrey Shi, Bryan Catanzaro, Andrew Tao, Jan Kautz, Zhiding Yu, Guilin Liu
We discover that simply concatenating visual tokens from a set of complementary vision encoders is as effective as more complex mixing architectures or strategies.
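The concatenation described above can be illustrated with a minimal sketch. It assumes that every encoder emits the same number of tokens and that fusion happens along the feature (channel) axis; the excerpt does not state the axis, and the function name `concat_visual_tokens` is hypothetical, not taken from the paper's code.

```python
from typing import List, Sequence

# One encoder's output: N tokens, each a feature vector of that encoder's width.
Tokens = List[List[float]]


def concat_visual_tokens(encoder_outputs: Sequence[Tokens]) -> Tokens:
    """Fuse tokens from several encoders by feature-wise concatenation.

    Assumption (not from the source): all encoders produce the same number
    of tokens, so token i of every encoder describes the same image region.
    """
    if not encoder_outputs:
        raise ValueError("need at least one encoder output")
    num_tokens = len(encoder_outputs[0])
    if any(len(tokens) != num_tokens for tokens in encoder_outputs):
        raise ValueError("all encoders must emit the same number of tokens")

    fused: Tokens = []
    for i in range(num_tokens):
        vector: List[float] = []
        for tokens in encoder_outputs:
            vector.extend(tokens[i])  # append this encoder's features for token i
        fused.append(vector)
    return fused
```

With two encoders of widths 2 and 1, each fused token has width 3; the token count is unchanged.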
1 code implementation • CVPR 2024 • Marianna Ohanyan, Hayk Manukyan, Zhangyang Wang, Shant Navasardyan, Humphrey Shi
We present Zero-Painter, a novel training-free framework for layout-conditional text-to-image synthesis that facilitates the creation of detailed and controlled imagery from textual prompts.
1 code implementation • 6 Jun 2024 • Jiayi Guo, Junhao Zhao, Chunjiang Ge, Chaoqun Du, Zanlin Ni, Shiji Song, Humphrey Shi, Gao Huang
To adapt the source model to the synthetic domain of the unconditional diffusion model, we introduce a Synthetic-Domain Alignment (SDA) framework to fine-tune the source model with synthetic data.
1 code implementation • 9 May 2024 • Jiachen Li, Xinyao Wang, Sijie Zhu, Chia-Wen Kuo, Lu Xu, Fan Chen, Jitesh Jain, Humphrey Shi, Longyin Wen
Recent advancements in Multimodal Large Language Models (LLMs) have focused primarily on scaling by increasing text-image pair data and enhancing LLMs to improve performance on multimodal tasks.
Ranked #1 on visual instruction following on LLaVA-Bench
1 code implementation • 22 Apr 2024 • Weijie Wang, Jichao Zhang, Chang Liu, Xia Li, Xingqian Xu, Humphrey Shi, Nicu Sebe, Bruno Lepri
To solve the above problems, we introduce a novel method, UVMap-ID, which is a controllable and personalized UV Map generative model.
1 code implementation • CVPR 2024 • Moreno D'Incà, Elia Peruzzo, Massimiliano Mancini, Dejia Xu, Vidit Goel, Xingqian Xu, Zhangyang Wang, Humphrey Shi, Nicu Sebe
In this paper, we tackle the challenge of open-set bias detection in text-to-image generative models, presenting OpenBias, a new pipeline that identifies and quantifies the severity of biases agnostically, without access to any precompiled set.
1 code implementation • 30 Mar 2024 • Chenyi Zhang, Yihan Hu, Henghui Ding, Humphrey Shi, Yao Zhao, Yunchao Wei
Despite significant advancements in image matting, existing models heavily depend on manually-drawn trimaps for accurate results in natural image scenarios.
1 code implementation • 27 Mar 2024 • Shweta Singh, Aayan Yadav, Jitesh Jain, Humphrey Shi, Justin Johnson, Karan Desai
With these findings, we advocate using COCO-ReM for future object detection research.
1 code implementation • 21 Mar 2024 • Roberto Henschel, Levon Khachatryan, Daniil Hayrapetyan, Hayk Poghosyan, Vahram Tadevosyan, Zhangyang Wang, Shant Navasardyan, Humphrey Shi
To overcome these limitations, we introduce StreamingT2V, an autoregressive approach for long video generation of 80, 240, 600, 1200 or more frames with smooth transitions.
1 code implementation • 7 Mar 2024 • Ali Hassani, Wen-mei Hwu, Humphrey Shi
We observe that our fused kernels successfully circumvent some of the unavoidable inefficiencies in unfused implementations.
1 code implementation • 15 Feb 2024 • Arman Isajanyan, Artur Shatveryan, David Kocharyan, Zhangyang Wang, Humphrey Shi
These findings highlight the relevance and effectiveness of Social Reward in assessing community appreciation for AI-generated artworks, establishing a closer alignment with users' creative goals: creating popular visual art.
no code implementations • 4 Jan 2024 • Elia Peruzzo, Vidit Goel, Dejia Xu, Xingqian Xu, Yifan Jiang, Zhangyang Wang, Humphrey Shi, Nicu Sebe
Recently, several works tackled the video editing task fostered by the success of large-scale text-to-image generative models.
no code implementations • CVPR 2024 • Mang Tik Chiu, Yuqian Zhou, Lingzhi Zhang, Zhe Lin, Connelly Barnes, Sohrab Amirghodsi, Eli Shechtman, Humphrey Shi
Object inpainting is a task that involves adding objects to real images and seamlessly compositing them.
1 code implementation • CVPR 2024 • Vidit Goel, Elia Peruzzo, Yifan Jiang, Dejia Xu, Xingqian Xu, Nicu Sebe, Trevor Darrell, Zhangyang Wang, Humphrey Shi
We propose PAIR Diffusion, a generic framework that enables a diffusion model to control the structure and appearance properties of each object in the image.
1 code implementation • CVPR 2024 • Jitesh Jain, Jianwei Yang, Humphrey Shi
Secondly, we leverage the images from COCO and outputs from off-the-shelf vision perception models to create our COCO Segmentation Text (COST) dataset for training and evaluating MLLMs on the object perception task.
1 code implementation • 21 Dec 2023 • Hayk Manukyan, Andranik Sargsyan, Barsegh Atanyan, Zhangyang Wang, Shant Navasardyan, Humphrey Shi
Recent progress in text-guided image inpainting, based on the unprecedented success of text-to-image diffusion models, has led to exceptionally realistic and visually plausible results.
1 code implementation • 10 Dec 2023 • Yihan Hu, Yiheng Lin, Wei Wang, Yao Zhao, Yunchao Wei, Humphrey Shi
However, the presence of high computational overhead and the inconsistency of noise sampling between the training and inference processes pose significant obstacles to achieving this goal.
Ranked #1 on Image Matting on Distinctions-646
1 code implementation • CVPR 2024 • Jiayi Guo, Xingqian Xu, Yifan Pu, Zanlin Ni, Chaofei Wang, Manushree Vasu, Shiji Song, Gao Huang, Humphrey Shi
Specifically, we introduce Step-wise Variation Regularization to enforce that the ratio between the variation of an arbitrary input latent and that of the output image is constant at any diffusion training step.
no code implementations • 30 Nov 2023 • Zhonghao Wang, Wei Wei, Yang Zhao, Zhisheng Xiao, Mark Hasegawa-Johnson, Humphrey Shi, Tingbo Hou
We further extend our method to a novel image editing task: substituting the subject in an image through textual manipulations.
1 code implementation • 7 Nov 2023 • Jiachen Li, Roberto Henschel, Vidit Goel, Marianna Ohanyan, Shant Navasardyan, Humphrey Shi
To remedy this deficiency, we propose Video Instance Matting (VIM), that is, estimating alpha mattes of each instance at each frame of a video sequence.
no code implementations • 16 Oct 2023 • Chao Liang, Linchao Zhu, Humphrey Shi, Yi Yang
Sample selection is an effective way to deal with label noise.
no code implementations • 11 Oct 2023 • Hazarapet Tunanyan, Dejia Xu, Shant Navasardyan, Zhangyang Wang, Humphrey Shi
To achieve this goal, we identify the limitations in the text embeddings used for the pre-trained text-to-image diffusion models.
2 code implementations • NeurIPS 2023 • Siyu Jiao, Yunchao Wei, YaoWei Wang, Yao Zhao, Humphrey Shi
However, in the paper, we reveal that CLIP is insensitive to different mask proposals and tends to produce similar predictions for various mask proposals of the same image.
Open Vocabulary Semantic Segmentation, Zero Shot Segmentation
no code implementations • 31 Jul 2023 • Elia Peruzzo, Willi Menapace, Vidit Goel, Federica Arrigoni, Hao Tang, Xingqian Xu, Arman Chopikyan, Nikita Orlov, Yuxiao Hu, Humphrey Shi, Nicu Sebe, Elisa Ricci
This paper advances the state of the art in this emerging research domain by proposing the first approach for Interactive NP.
no code implementations • 20 Jul 2023 • Dejia Xu, Xingqian Xu, Wenyan Cong, Humphrey Shi, Zhangyang Wang
We propose Reference-based Painterly Inpainting, a novel task that crosses the wild reference domain gap and implants novel objects into artworks.
no code implementations • 29 Jun 2023 • Feng Liu, Ryan Ashbaugh, Nicholas Chimitt, Najmul Hassan, Ali Hassani, Ajay Jaiswal, Minchul Kim, Zhiyuan Mao, Christopher Perry, Zhiyuan Ren, Yiyang Su, Pegah Varghaei, Kai Wang, Xingguang Zhang, Stanley Chan, Arun Ross, Humphrey Shi, Zhangyang Wang, Anil Jain, Xiaoming Liu
Whole-body biometric recognition is an important area of research due to its vast applications in law enforcement, border security, and surveillance.
1 code implementation • 8 Jun 2023 • Jiachen Li, Jitesh Jain, Humphrey Shi
In this paper, we propose the Matting Anything Model (MAM), an efficient and versatile framework for estimating the alpha matte of any instance in an image with flexible and interactive visual or linguistic user prompt guidance.
1 code implementation • CVPR 2024 • Xingqian Xu, Jiayi Guo, Zhangyang Wang, Gao Huang, Irfan Essa, Humphrey Shi
Text-to-image (T2I) research has grown explosively in the past year, owing to the large-scale pre-trained diffusion models and many emerging personalization and editing approaches.
1 code implementation • CVPR 2023 • Jiayi Guo, Chaofei Wang, You Wu, Eric Zhang, Kai Wang, Xingqian Xu, Shiji Song, Humphrey Shi, Gao Huang
Recently, CLIP-guided image synthesis has shown appealing performance on adapting a pre-trained source-domain generator to an unseen target domain.
1 code implementation • CVPR 2023 • Mang Tik Chiu, Xuaner Zhang, Zijun Wei, Yuqian Zhou, Eli Shechtman, Connelly Barnes, Zhe Lin, Florian Kainz, Sohrab Amirghodsi, Humphrey Shi
In this paper, we present an automatic wire clean-up system that eases the process of wire segmentation and removal/inpainting to within a few seconds.
1 code implementation • 30 Mar 2023 • Eric Zhang, Kai Wang, Xingqian Xu, Zhangyang Wang, Humphrey Shi
The unlearning problem of deep learning models, once primarily an academic concern, has become a prevalent issue in the industry.
1 code implementation • 30 Mar 2023 • Vidit Goel, Elia Peruzzo, Yifan Jiang, Dejia Xu, Xingqian Xu, Nicu Sebe, Trevor Darrell, Zhangyang Wang, Humphrey Shi
We propose PAIR Diffusion, a generic framework that can enable a diffusion model to control the structure and appearance properties of each object in the image.
1 code implementation • ICCV 2023 • Levon Khachatryan, Andranik Movsisyan, Vahram Tadevosyan, Roberto Henschel, Zhangyang Wang, Shant Navasardyan, Humphrey Shi
Recent text-to-video generation approaches rely on computationally heavy training and require large-scale video datasets.
no code implementations • CVPR 2023 • Hao Tang, Zhenyu Zhang, Humphrey Shi, Bo Li, Ling Shao, Nicu Sebe, Radu Timofte, Luc van Gool
We present a novel graph Transformer generative adversarial network (GTGAN) to learn effective graph node relations in an end-to-end fashion for the challenging graph-constrained house generation task.
1 code implementation • ICCV 2023 • Andranik Sargsyan, Shant Navasardyan, Xingqian Xu, Humphrey Shi
In this paper we present a simple image inpainting baseline, Mobile Inpainting GAN (MI-GAN), which is approximately one order of magnitude computationally cheaper and smaller than existing state-of-the-art inpainting models, and can be efficiently deployed on mobile devices.
no code implementations • CVPR 2023 • Haoming Lu, Hazarapet Tunanyan, Kai Wang, Shant Navasardyan, Zhangyang Wang, Humphrey Shi
Diffusion models have demonstrated impressive capability of text-conditioned image synthesis, and broader application horizons are emerging by personalizing those pretrained diffusion models toward generating some specialized target object or style.
1 code implementation • 5 Dec 2022 • Siyu Jiao, Gengwei Zhang, Shant Navasardyan, Ling Chen, Yao Zhao, Yunchao Wei, Humphrey Shi
Typical methods follow the paradigm of first learning prototypical features from support images and then matching query features at the pixel level to obtain segmentation results.
1 code implementation • 30 Nov 2022 • Haichao Yu, Haoxiang Li, Gang Hua, Gao Huang, Humphrey Shi
To optimize the model, these prediction heads together with the network backbone are trained on every batch of training data.
3 code implementations • ICCV 2023 • Xingqian Xu, Zhangyang Wang, Eric Zhang, Kai Wang, Humphrey Shi
In this work, we expand the existing single-flow diffusion pipeline into a multi-task multimodal network, dubbed Versatile Diffusion (VD), that handles multiple flows of text-to-image, image-to-text, and variations in one unified model.
3 code implementations • CVPR 2023 • Jitesh Jain, Jiachen Li, MangTik Chiu, Ali Hassani, Nikita Orlov, Humphrey Shi
However, such panoptic architectures do not truly unify image segmentation because they need to be trained individually on the semantic, instance, or panoptic segmentation to achieve the best performance.
Ranked #1 on Panoptic Segmentation on COCO minival
2 code implementations • 10 Nov 2022 • Steven Walton, Ali Hassani, Xingqian Xu, Zhangyang Wang, Humphrey Shi
Image generation has been a long sought-after but challenging task, and performing the generation task in an efficient manner is similarly difficult.
Ranked #2 on Image Generation on FFHQ 256 x 256
1 code implementation • 7 Nov 2022 • Xingqian Xu, Shant Navasardyan, Vahram Tadevosyan, Andranik Sargsyan, Yadong Mu, Humphrey Shi
We also prove the effectiveness of our design via ablation studies, from which one may notice that the aforementioned challenges, i.e., pattern unawareness, blurry textures, and structure distortion, can be noticeably resolved.
Ranked #1 on Image Inpainting on FFHQ 512 x 512
5 code implementations • 29 Sep 2022 • Ali Hassani, Humphrey Shi
These models typically employ localized attention mechanisms, such as the sliding-window Neighborhood Attention (NA) or Swin Transformer's Shifted Window Self Attention.
Ranked #4 on Panoptic Segmentation on COCO minival
no code implementations • 27 Sep 2022 • Yulin Wang, Yang Yue, Xinhong Xu, Ali Hassani, Victor Kulikov, Nikita Orlov, Shiji Song, Humphrey Shi, Gao Huang
Recent research has revealed that reducing temporal redundancy and reducing spatial redundancy are both effective approaches towards efficient video recognition, e.g., allocating the majority of computation to a task-relevant subset of frames or the most valuable image regions of each frame.
1 code implementation • 26 Aug 2022 • Jiachen Li, Vidit Goel, Marianna Ohanyan, Shant Navasardyan, Yunchao Wei, Humphrey Shi
In this paper, we propose VMFormer: a transformer-based end-to-end method for video matting.
1 code implementation • 5 Aug 2022 • Jitesh Jain, Yuqian Zhou, Ning Yu, Humphrey Shi
We claim that the performance of inpainting algorithms can be better judged by the generated structures and textures.
3 code implementations • 14 Jul 2022 • Pengfei Chen, Xuehui Yu, Xumeng Han, Najmul Hassan, Kai Wang, Jiachen Li, Jian Zhao, Humphrey Shi, Zhenjun Han, Qixiang Ye
However, the performance gap between point supervised object detection (PSOD) and bounding box supervised detection remains large.
1 code implementation • CVPR 2022 • Zeyuan Chen, Yinbo Chen, Jingwen Liu, Xingqian Xu, Vidit Goel, Zhangyang Wang, Humphrey Shi, Xiaolong Wang
The learned implicit neural representation can be decoded to videos of arbitrary spatial resolution and frame rate.
Space-time Video Super-resolution, Video Frame Interpolation
1 code implementation • CVPR 2022 • Xinglong Sun, Ali Hassani, Zhangyang Wang, Gao Huang, Humphrey Shi
We analyzed the pruning masks generated with DiSparse and observed strikingly similar sparse network architectures identified by each task even before training starts.
1 code implementation • CVPR 2022 • Xu Ma, Yuqian Zhou, Xingqian Xu, Bin Sun, Valerii Filev, Nikita Orlov, Yun Fu, Humphrey Shi
Image rasterization is a mature technique in computer graphics, while image vectorization, the reverse path of rasterization, remains a major challenge.
1 code implementation • 27 Apr 2022 • Qiucheng Wu, Yifan Jiang, Junru Wu, Kai Wang, Gong Zhang, Humphrey Shi, Zhangyang Wang, Shiyu Chang
To study the motion features in the latent space of StyleGAN, in this paper, we hypothesize and demonstrate that a series of meaningful, natural, and versatile small, local movements (referred to as "micromotion", such as expression, head movement, and aging effect) can be represented in low-rank spaces extracted from the latent space of a conventionally pre-trained StyleGAN-v2 model for face generation, with the guidance of proper "anchors" in the form of either short text or video clips.
5 code implementations • CVPR 2023 • Ali Hassani, Steven Walton, Jiachen Li, Shen Li, Humphrey Shi
We present Neighborhood Attention (NA), the first efficient and scalable sliding-window attention mechanism for vision.
Ranked #120 on Semantic Segmentation on ADE20K
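As a rough illustration of sliding-window attention, the sketch below restricts each query to a fixed-size window of nearby keys on a 1D sequence. It is a plain-Python toy, not the paper's kernels: the function name `neighborhood_attention_1d` is hypothetical, and the border handling (shifting the window inward so every query still sees exactly `window` keys) is an assumption made for this example.

```python
import math
from typing import List

Matrix = List[List[float]]


def neighborhood_attention_1d(q: Matrix, k: Matrix, v: Matrix, window: int) -> Matrix:
    """Each query i attends only to `window` keys centred on position i.

    Assumption: near the borders the window is shifted inward rather than
    padded, so the neighbourhood size stays constant.
    """
    n = len(q)
    if window % 2 == 0 or window > n:
        raise ValueError("window must be odd and no larger than the sequence")
    dim = len(q[0])
    out: Matrix = []
    for i in range(n):
        # Clamp the window start so the window stays inside [0, n).
        start = min(max(i - window // 2, 0), n - window)
        neighbours = range(start, start + window)
        # Scaled dot-product scores against the neighbouring keys only.
        scores = [
            sum(a * b for a, b in zip(q[i], k[j])) / math.sqrt(dim)
            for j in neighbours
        ]
        # Numerically stable softmax over the window.
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append([
            sum(w * v[j][c] for w, j in zip(weights, neighbours)) / total
            for c in range(len(v[0]))
        ])
    return out
```

Setting `window` equal to the sequence length recovers ordinary full self-attention, which is a convenient sanity check for the toy.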
1 code implementation • 2 Apr 2022 • Dejia Xu, Yifan Jiang, Peihao Wang, Zhiwen Fan, Humphrey Shi, Zhangyang Wang
Despite the rapid development of Neural Radiance Field (NeRF), the necessity of dense covers largely prohibits its wider applications.
2 code implementations • CVPR 2022 • Xuehui Yu, Pengfei Chen, Di Wu, Najmul Hassan, Guorong Li, Junchi Yan, Humphrey Shi, Qixiang Ye, Zhenjun Han
In this study, we propose a POL method using coarse point annotations, relaxing the supervision signals from accurate key points to freely spotted points.
1 code implementation • CVPR 2022 • Yulin Wang, Yang Yue, Yuanze Lin, Haojun Jiang, Zihang Lai, Victor Kulikov, Nikita Orlov, Humphrey Shi, Gao Huang
Recent works have shown that the computational efficiency of video recognition can be significantly improved by reducing the spatial redundancy.
1 code implementation • arXiv 2021 • Jitesh Jain, Anukriti Singh, Nikita Orlov, Zilong Huang, Jiachen Li, Steven Walton, Humphrey Shi
To achieve this, we propose SeMask, a simple and effective framework that incorporates semantic information into the encoder with the help of a semantic attention operation.
Ranked #10 on Semantic Segmentation on Cityscapes val
no code implementations • 10 Dec 2021 • Xihui Liu, Dong Huk Park, Samaneh Azadi, Gong Zhang, Arman Chopikyan, Yuxiao Hu, Humphrey Shi, Anna Rohrbach, Trevor Darrell
We investigate fine-grained, continuous control of this model class, and introduce a novel unified framework for semantic diffusion guidance, which allows either language or image guidance, or both.
no code implementations • 9 Dec 2021 • Yifan Jiang, Xinyu Gong, Junru Wu, Humphrey Shi, Zhicheng Yan, Zhangyang Wang
Efficient video architecture is the key to deploying video recognition systems on devices with limited computing resources.
1 code implementation • 19 Nov 2021 • Guanglei Yang, Hao Tang, Humphrey Shi, Mingli Ding, Nicu Sebe, Radu Timofte, Luc van Gool, Elisa Ricci
The global alignment network aims to transfer the input image from the source domain to the target domain.
no code implementations • 13 Oct 2021 • Kai Wang, Zhonghao Wang, Mo Yu, Humphrey Shi
The manager agent is a multi-hop plan generator dealing with high-level abstract information and generating a series of sub-goals in a backward manner.
4 code implementations • 9 Sep 2021 • Jiachen Li, Ali Hassani, Steven Walton, Humphrey Shi
MLP-based architectures, which consist of a sequence of consecutive multi-layer perceptron blocks, have recently been found to reach comparable results to convolutional and transformer-based methods.
Ranked #7 on Image Classification on Flowers-102 (using extra training data)
1 code implementation • 26 Aug 2021 • Wuyang Chen, Xinyu Gong, Junru Wu, Yunchao Wei, Humphrey Shi, Zhicheng Yan, Yi Yang, Zhangyang Wang
This work targets designing a principled and unified training-free framework for Neural Architecture Search (NAS), with high performance, low cost, and in-depth interpretation.
1 code implementation • 19 Jun 2021 • Vidit Goel, Jiachen Li, Shubhika Garg, Harsh Maheshwari, Humphrey Shi
Our method improves the masks from segmentation and propagation branches in an online manner using the Mask Selection Network (MSN) hence limiting the noise accumulation during mask tracking.
Ranked #28 on Video Instance Segmentation on YouTube-VIS validation
no code implementations • 26 May 2021 • Jiachen Li, Yuan Lin, Rongrong Liu, Chiu Man Ho, Humphrey Shi
Segmentation-based scene text detection methods have been widely adopted for arbitrary-shaped text detection recently, since they make accurate pixel-level predictions on curved text instances and can facilitate real-time inference without time-consuming processing on anchors.
no code implementations • 16 May 2021 • Haichao Yu, Linjie Yang, Humphrey Shi
Post-training quantization methods use a set of calibration data to compute quantization ranges for network parameters and activations.
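A minimal sketch of the calibration step described above, assuming the simplest scheme: track the minimum and maximum value seen over the calibration batches and derive an unsigned affine quantizer from that range. The function names are hypothetical, and min/max calibration is one common choice rather than the method of this paper.

```python
from typing import Iterable, List, Tuple


def calibrate_range(batches: Iterable[List[float]]) -> Tuple[float, float]:
    """Track the smallest and largest value seen across calibration batches."""
    low, high = float("inf"), float("-inf")
    for batch in batches:
        low = min(low, min(batch))
        high = max(high, max(batch))
    return low, high


def quant_params(low: float, high: float, num_bits: int = 8) -> Tuple[float, int]:
    """Derive scale and zero point for unsigned affine quantization."""
    # The range is widened to include 0.0 so that zero is exactly representable.
    low, high = min(low, 0.0), max(high, 0.0)
    q_max = 2 ** num_bits - 1
    scale = (high - low) / q_max if high > low else 1.0
    zero_point = round(-low / scale)
    return scale, zero_point


def quantize(x: float, scale: float, zero_point: int, num_bits: int = 8) -> int:
    """Map a real value to an integer code, clamping to the valid range."""
    q_max = 2 ** num_bits - 1
    return max(0, min(q_max, round(x / scale) + zero_point))


def dequantize(q: int, scale: float, zero_point: int) -> float:
    """Map an integer code back to an approximate real value."""
    return (q - zero_point) * scale
```

Values outside the calibrated range are clamped, which is why the choice of calibration data matters for the resulting accuracy.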
1 code implementation • 29 Apr 2021 • Jiachen Li, Bowen Cheng, Rogerio Feris, JinJun Xiong, Thomas S. Huang, Wen-mei Hwu, Humphrey Shi
Current anchor-free object detectors are quite simple and effective yet lack accurate label assignment methods, which limits their potential in competing with classic anchor-based models that are supported by well-designed assignment methods based on the Intersection-over-Union (IoU) metric.
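For reference, the IoU metric mentioned above can be computed for two axis-aligned boxes as follows. This is a generic sketch of the metric only, not the label assignment method of the paper; the `(x_min, y_min, x_max, y_max)` box layout is an assumption.

```python
from typing import Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max)
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-Union of two axis-aligned boxes."""
    # Width and height of the overlap; zero if the boxes do not intersect.
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    intersection = inter_w * inter_h

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - intersection

    # Degenerate (zero-area) boxes yield a union of 0; report no overlap.
    return intersection / union if union > 0 else 0.0
```

Anchor-based detectors typically threshold this value to decide which anchors count as positives for a ground-truth box.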
9 code implementations • 12 Apr 2021 • Ali Hassani, Steven Walton, Nikhil Shah, Abulikemu Abuduweili, Jiachen Li, Humphrey Shi
Our models are flexible in terms of model size, and can have as few as 0.28M parameters while achieving competitive results.
Ranked #1 on Image Classification on Flowers-102 (using extra training data)
Fine-Grained Image Classification, Superpixel Image Classification
no code implementations • CVPR 2021 • Yang Fu, Sifei Liu, Umar Iqbal, Shalini De Mello, Humphrey Shi, Jan Kautz
Tracking segmentation masks of multiple instances has been intensively studied, but still faces two fundamental challenges: 1) the requirement of large-scale, frame-wise annotation, and 2) the complexity of two-stage approaches.
1 code implementation • 23 Mar 2021 • Xingqian Xu, Zhangyang Wang, Humphrey Shi
In this work, we propose UltraSR, a simple yet effective new network design based on implicit image functions in which we deeply integrated spatial coordinates and periodic encoding with the implicit neural representation.
1 code implementation • 5 Mar 2021 • Yuqian Zhou, Hanchao Yu, Humphrey Shi
Retinal vessel segmentation from retinal images is an essential task for developing the computer-aided diagnosis system for retinal diseases.
Ranked #1 on Retinal Vessel Segmentation on CHASE_DB1
1 code implementation • CVPR 2021 • Abulikemu Abuduweili, Xingjian Li, Humphrey Shi, Cheng-Zhong Xu, Dejing Dou
To better exploit the value of both pre-trained weights and unlabeled target examples, we introduce adaptive consistency regularization that consists of two complementary components: Adaptive Knowledge Consistency (AKC) on the examples between the source and target model, and Adaptive Representation Consistency (ARC) on the target model between labeled and unlabeled examples.
1 code implementation • ICCV 2021 • Yanbin Liu, Juho Lee, Linchao Zhu, Ling Chen, Humphrey Shi, Yi Yang
Most existing few-shot classification methods only consider generalization on one dataset (i.e., single-domain), failing to transfer across various seen and unseen domains.
no code implementations • 17 Dec 2020 • Tiantu Xu, Kaiwen Shen, Yang Fu, Humphrey Shi, Felix Xiaozhu Lin
Object re-identification (ReID) is a key application of city-scale cameras.
1 code implementation • 7 Dec 2020 • Yang Fu, Linjie Yang, Ding Liu, Thomas S. Huang, Humphrey Shi
Video instance segmentation is a complex task in which we need to detect, segment, and track each object for any given video.
Ranked #43 on Video Instance Segmentation on YouTube-VIS validation
1 code implementation • CVPR 2021 • Xingqian Xu, Zhifei Zhang, Zhaowen Wang, Brian Price, Zhonghao Wang, Humphrey Shi
We also introduce Text Refinement Network (TexRNet), a novel text segmentation approach that adapts to the unique properties of text, e.g., non-convex boundary, diverse texture, etc., which often impose burdens on traditional segmentation models.
1 code implementation • ICCV 2021 • Zhonghao Wang, Kai Wang, Mo Yu, JinJun Xiong, Wen-mei Hwu, Mark Hasegawa-Johnson, Humphrey Shi
Finally, we achieve a higher level of interpretability by imposing OCCAM on the objects represented in the induced symbolic concept space.
Ranked #3 on Visual Question Answering (VQA) on CLEVR
1 code implementation • 27 Sep 2020 • Trevor Bergstrom, Humphrey Shi
In order to provide insight to future researchers, we perform an individualized study that examines the performance of each component of a multi-stream convolutional neural network architecture for human-object interaction detection.
1 code implementation • 18 Sep 2020 • Haoming Lu, Humphrey Shi
The development of practical applications, such as autonomous driving and robotics, has brought increasing attention to 3D point cloud understanding.
1 code implementation • 16 Sep 2020 • Xuehui Yu, Zhenjun Han, Yuqi Gong, Nan Jiang, Jian Zhao, Qixiang Ye, Jie Chen, Yuan Feng, Bin Zhang, Xiaodi Wang, Ying Xin, Jingwei Liu, Mingyuan Mao, Sheng Xu, Baochang Zhang, Shumin Han, Cheng Gao, Wei Tang, Lizuo Jin, Mingbo Hong, Yuchao Yang, Shuiwang Li, Huan Luo, Qijun Zhao, Humphrey Shi
The 1st Tiny Object Detection (TOD) Challenge aims to encourage research in developing novel and accurate methods for tiny object detection in images which have wide views, with a current focus on tiny person detection.
no code implementations • 14 Sep 2020 • Haichao Yu, Ning Xu, Zilong Huang, Yuqian Zhou, Humphrey Shi
Image matting is a key technique for image and video editing and composition.
no code implementations • 28 Jun 2020 • Hanchao Yu, Xiao Chen, Humphrey Shi, Terrence Chen, Thomas S. Huang, Shanhui Sun
In this paper, we propose Motion Pyramid Networks, a novel deep learning-based approach for accurate and efficient cardiac motion estimation.
3 code implementations • CVPR 2020 • Yiqun Mei, Yuchen Fan, Yuqian Zhou, Lichao Huang, Thomas S. Huang, Humphrey Shi
By combining the new CS-NL prior with local and in-scale non-local priors in a powerful recurrent fusion cell, we can find more cross-scale feature correlations within a single low-resolution (LR) image.
Ranked #10 on Image Super-Resolution on Manga109 - 3x upscaling
no code implementations • 21 May 2020 • Yu Song, Zilong Huang, Chuanyue Shen, Humphrey Shi, David A Lange
The standard petrography test method for measuring air voids in concrete (ASTM C457) requires a meticulous and long examination of sample phase composition under a stereomicroscope.
2 code implementations • 28 Apr 2020 • Yiqun Mei, Yuchen Fan, Yulun Zhang, Jiahui Yu, Yuqian Zhou, Ding Liu, Yun Fu, Thomas S. Huang, Humphrey Shi
Self-similarity refers to the image prior widely used in image restoration algorithms that small but similar patterns tend to occur at different locations and scales.
no code implementations • 2 Apr 2020 • Zhonghao Wang, Yunchao Wei, Rogerio Feris, JinJun Xiong, Wen-mei Hwu, Thomas S. Huang, Humphrey Shi
A key challenge of this task is how to alleviate the data distribution discrepancy between the source and target domains, i.e., reducing domain shift.
1 code implementation • CVPR 2020 • Zhonghao Wang, Mo Yu, Yunchao Wei, Rogerio Feris, JinJun Xiong, Wen-mei Hwu, Thomas S. Huang, Humphrey Shi
We consider the problem of unsupervised domain adaptation for semantic segmentation by easing the domain shift between the source domain (synthetic data) and the target domain (real data) in this work.
Ranked #8 on Semantic Segmentation on DensePASS
1 code implementation • 24 Feb 2020 • Zilong Huang, Yunchao Wei, Xinggang Wang, Wenyu Liu, Thomas S. Huang, Humphrey Shi
Aggregating features in terms of different convolutional blocks or contextual embeddings has been proven to be an effective way to strengthen feature representations for semantic segmentation.
4 code implementations • ICCV 2019 • Zilong Huang, Xinggang Wang, Yunchao Wei, Lichao Huang, Humphrey Shi, Wenyu Liu, Thomas S. Huang
Compared with the non-local block, the proposed recurrent criss-cross attention module requires 11x less GPU memory usage.
Ranked #7 on Semantic Segmentation on FoodSeg103 (using extra training data)
no code implementations • 23 Nov 2018 • Bowen Cheng, Yunchao Wei, Jiahui Yu, Shiyu Chang, JinJun Xiong, Wen-mei Hwu, Thomas S. Huang, Humphrey Shi
While training on samples drawn from an independent and identical distribution has been a de facto paradigm for optimizing image classification networks, humans learn new concepts in an easy-to-hard manner, progressing through selected examples.
3 code implementations • 5 Oct 2018 • Bowen Cheng, Yunchao Wei, Rogerio Feris, JinJun Xiong, Wen-mei Hwu, Thomas Huang, Humphrey Shi
In particular, DCR places a separate classification network in parallel with the localization network (base detector).