no code implementations • 24 Mar 2025 • Xusheng Cao, Haori Lu, Linlan Huang, Fei Yang, Xialei Liu, Ming-Ming Cheng
Continual learning in computer vision faces the critical challenge of catastrophic forgetting, where models struggle to retain prior knowledge while adapting to new tasks.
no code implementations • 17 Mar 2025 • Xuying Zhang, Yupeng Zhou, Kai Wang, Yikai Wang, Zhen Li, Xiuli Shao, Daquan Zhou, Qibin Hou, Ming-Ming Cheng
Novel view synthesis (NVS) is a cornerstone of image-to-3D creation.
no code implementations • 4 Feb 2025 • Senmao Li, Kai Wang, Joost Van de Weijer, Fahad Shahbaz Khan, Chun-Le Guo, Shiqi Yang, Yaxing Wang, Jian Yang, Ming-Ming Cheng
Diffusion priors have been used for blind face restoration (BFR) by fine-tuning diffusion models (DMs) on restoration datasets to recover low-quality images.
1 code implementation • 23 Jan 2025 • Tao Liu, Kai Wang, Senmao Li, Joost Van de Weijer, Fahad Shahbaz Khan, Shiqi Yang, Yaxing Wang, Jian Yang, Ming-Ming Cheng
Drawing inspiration from the inherent context consistency, we propose a novel training-free method for consistent text-to-image (T2I) generation, termed "One-Prompt-One-Story" (1Prompt1Story).
2 code implementations • 8 Jan 2025 • Xin Zhang, Xue Yang, YuXuan Li, Jian Yang, Ming-Ming Cheng, Xiang Li
Our approach can effectively improve the performance of existing state-of-the-art weakly supervised methods and even surpasses fully supervised models on existing optical benchmarks (i.e., the DOTA-v1.0 dataset).
3 code implementations • 7 Jan 2025 • Xinbin Yuan, Zhaohui Zheng, YuXuan Li, Xialei Liu, Li Liu, Xiang Li, Qibin Hou, Ming-Ming Cheng
Despite rapid development, remote sensing object detection remains challenging, particularly for objects with high aspect ratios.
Ranked #1 on Object Detection In Aerial Images on DOTA (using extra training data)
1 code implementation • 30 Dec 2024 • YuXuan Li, Xiang Li, Yunheng Li, YiCheng Zhang, Yimian Dai, Qibin Hou, Ming-Ming Cheng, Jian Yang
To address these, we establish a benchmark dataset and propose a unified model, SM3Det (Single Model for Multi-Modal datasets and Multi-Task object Detection).
no code implementations • 22 Dec 2024 • Xuying Zhang, Yutong Liu, Yangguang Li, Renrui Zhang, Yufei Liu, Kai Wang, Wanli Ouyang, Zhiwei Xiong, Peng Gao, Qibin Hou, Ming-Ming Cheng
We present TAR3D, a novel framework that consists of a 3D-aware Vector Quantized-Variational AutoEncoder (VQ-VAE) and a Generative Pre-trained Transformer (GPT) to generate high-quality 3D assets.
1 code implementation • 16 Dec 2024 • Quan-Sheng Zeng, Yunheng Li, Daquan Zhou, Guanbin Li, Qibin Hou, Ming-Ming Cheng
Open-vocabulary image segmentation has been advanced through the synergy between mask generators and vision-language models like Contrastive Language-Image Pre-training (CLIP).
Ranked #1 on Open Vocabulary Semantic Segmentation on ADE20K-150
1 code implementation • 12 Dec 2024 • Zheng Li, Yibing Song, Penghai Zhao, Ming-Ming Cheng, Xiang Li, Jian Yang
Textual-based prompt learning methods primarily employ multiple learnable soft prompts and hard class tokens in a cascading manner as text prompt inputs, aiming to align image and text (category) spaces for downstream tasks.
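As a rough illustration of this cascading construction, the sketch below (assuming a frozen CLIP-style text encoder downstream; the module name, context length, and embedding size are hypothetical) prepends shared learnable context vectors to each class's token embedding:

```python
import torch
import torch.nn as nn

class SoftPromptBuilder(nn.Module):
    """Sketch of cascading learnable soft prompts with hard class-token
    embeddings to form the text-encoder input (names and sizes are illustrative)."""
    def __init__(self, n_ctx=16, dim=512, n_classes=100):
        super().__init__()
        self.ctx = nn.Parameter(torch.randn(n_ctx, dim) * 0.02)   # shared soft prompts
        # stand-in for frozen class-name token embeddings, e.g. "cat", "dog", ...
        self.register_buffer("class_embed", torch.randn(n_classes, 1, dim))

    def forward(self):
        ctx = self.ctx.unsqueeze(0).expand(self.class_embed.size(0), -1, -1)
        # [soft prompt tokens ; class token] per class -> fed to the text encoder
        return torch.cat([ctx, self.class_embed], dim=1)          # (n_classes, n_ctx+1, dim)

prompts = SoftPromptBuilder()()   # then encode with a frozen CLIP text encoder
```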
no code implementations • 9 Dec 2024 • Yunheng Li, YuXuan Li, Quansheng Zeng, Wenhai Wang, Qibin Hou, Ming-Ming Cheng
Pre-trained vision-language models (VLMs), such as CLIP, have demonstrated impressive zero-shot recognition capability, but still underperform in dense prediction tasks.
no code implementations • 24 Nov 2024 • Zhong-Yu Li, Yu-Song Hu, Bo-Wen Yin, Ming-Ming Cheng
Ensemble learning has also succeeded in enhancing the performance and robustness of vision models.
1 code implementation • 24 Nov 2024 • Zhong-Yu Li, Xin Jin, Boyuan Sun, Chun-Le Guo, Ming-Ming Cheng
We find that sRGB pre-training constrains the potential of RAW object detection due to the domain gap between sRGB and RAW, prompting us to directly pre-train on the RAW domain.
Ranked #1 on Object Detection on AODRaw
no code implementations • 24 Nov 2024 • Zhong-Yu Li, Yunheng Li, Deng-Ping Fan, Ming-Ming Cheng
One cost-saving strategy is to have the decoder reconstruct only a subset of the masked tokens and discard the others; we refer to this method as partial reconstruction.
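For intuition, here is a minimal sketch of a partial-reconstruction loss, assuming decoder predictions and target patches of shape (B, N, D) and a boolean mask of hidden tokens; the helper name and keep ratio are illustrative, not the paper's implementation:

```python
import torch

def partial_reconstruction_loss(pred, target, masked, keep_ratio=0.5):
    """MSE computed only on a random subset of the masked tokens.

    pred, target: (B, N, D) decoder predictions and ground-truth patches.
    masked:       (B, N) boolean mask marking tokens hidden from the encoder.
    keep_ratio:   fraction of masked tokens the decoder actually reconstructs.
    """
    B, N, D = pred.shape
    # Random scores; non-masked tokens get -inf so they are never selected.
    scores = torch.rand(B, N, device=pred.device).masked_fill(~masked, float("-inf"))
    k = max(1, int(keep_ratio * int(masked.sum(dim=1).min())))
    keep_idx = scores.topk(k, dim=1).indices                          # (B, k)
    pred_k = torch.gather(pred, 1, keep_idx.unsqueeze(-1).expand(-1, -1, D))
    target_k = torch.gather(target, 1, keep_idx.unsqueeze(-1).expand(-1, -1, D))
    return ((pred_k - target_k) ** 2).mean()

# toy usage
pred, target = torch.randn(2, 16, 8), torch.randn(2, 16, 8)
masked = torch.rand(2, 16) > 0.25
loss = partial_reconstruction_loss(pred, target, masked)
```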
1 code implementation • 11 Nov 2024 • Taihang Hu, Linxuan Li, Joost Van de Weijer, Hongcheng Gao, Fahad Shahbaz Khan, Jian Yang, Ming-Ming Cheng, Kai Wang, Yaxing Wang
In this paper, we define semantic binding as the task of associating a given object with its attribute, termed attribute binding, or linking it to other related sub-objects, referred to as object binding.
2 code implementations • 29 Oct 2024 • Zhaochong An, Guolei Sun, Yun Liu, Runjia Li, Min Wu, Ming-Ming Cheng, Ender Konukoglu, Serge Belongie
Few-shot 3D point cloud segmentation (FS-PCS) aims at generalizing models to segment novel categories with minimal annotated support samples.
Few-shot 3D Point Cloud Semantic Segmentation, Point Cloud Segmentation, +1
no code implementations • 18 Oct 2024 • Yuhao Wan, Peng-Tao Jiang, Qibin Hou, Hao Zhang, Jinwei Chen, Ming-Ming Cheng, Bo Li
We show that the proper use of latent LR embeddings can produce higher-quality control signals, which enables the super-resolution results to be more consistent with the LR image and leads to clearer visual results.
1 code implementation • 14 Sep 2024 • Jiabao Wang, Zhaojiang Liu, Qiang Meng, Liujiang Yan, Ke Wang, Jie Yang, Wei Liu, Qibin Hou, Ming-Ming Cheng
Mainstream occupancy prediction methods first discretize the 3D environment into voxels and then perform classification on these dense grids.
no code implementations • 7 Aug 2024 • Penghai Zhao, Qinghua Xing, Kairan Dou, Jinyu Tian, Ying Tai, Jian Yang, Ming-Ming Cheng, Xiang Li
As the academic landscape expands, the challenge of efficiently identifying impactful newly published articles grows increasingly vital.
no code implementations • 5 Jul 2024 • Jiabao Wang, Qiang Meng, Guochao Liu, Liujiang Yan, Ke Wang, Ming-Ming Cheng, Qibin Hou
In autonomous driving, the temporal stability of 3D object detection greatly impacts driving safety.
1 code implementation • 2 Jun 2024 • Yunheng Li, Zhongyu Li, Quansheng Zeng, Qibin Hou, Ming-Ming Cheng
Our Cascade-CLIP is flexible and can be easily applied to existing zero-shot semantic segmentation methods.
1 code implementation • 2 May 2024 • Yupeng Zhou, Daquan Zhou, Ming-Ming Cheng, Jiashi Feng, Qibin Hou
This module converts the generated sequence of images into videos with smooth transitions and consistent subjects that are significantly more stable than the modules based on latent spaces only, especially in the context of long video generation.
1 code implementation • 27 Mar 2024 • Xusheng Cao, Haori Lu, Linlan Huang, Xialei Liu, Ming-Ming Cheng
In class-incremental learning (CIL) scenarios, the phenomenon of catastrophic forgetting caused by the classifier's bias towards the current task has long posed a significant challenge.
2 code implementations • 18 Mar 2024 • YuXuan Li, Xiang Li, Yimian Dai, Qibin Hou, Li Liu, Yongxiang Liu, Ming-Ming Cheng, Jian Yang
While a considerable amount of research has been dedicated to remote sensing classification, object detection and semantic segmentation, most of these studies have overlooked the valuable prior knowledge embedded within remote sensing scenarios.
Ranked #1 on Semantic Segmentation on ISPRS Vaihingen
1 code implementation • 15 Mar 2024 • Enguang Wang, Zhimao Peng, Zhengyuan Xie, Fei Yang, Xialei Liu, Ming-Ming Cheng
Specifically, our TES leverages the property that CLIP can generate aligned vision-language features, converting visual embeddings into tokens of CLIP's text encoder to generate pseudo text embeddings.
1 code implementation • 11 Mar 2024 • YuXuan Li, Xiang Li, Weijie Li, Qibin Hou, Li Liu, Ming-Ming Cheng, Jian Yang
To the best of our knowledge, SARDet-100K is the first COCO-level large-scale multi-class SAR object detection dataset ever created.
Ranked #2 on 2D Object Detection on SARDet-100K (using extra training data)
1 code implementation • 4 Mar 2024 • Huali Xu, Li Liu, Shuaifeng Zhi, Shaojing Fu, Zhuo Su, Ming-Ming Cheng, Yongxiang Liu
For this reason, this paper explores a Source-Free CDFSL (SF-CDFSL) problem, in which CDFSL is addressed through the use of existing pretrained models instead of training a model with source data, thus avoiding access to the source data.
no code implementations • 27 Feb 2024 • XuanYi Li, Daquan Zhou, Chenxu Zhang, Shaodong Wei, Qibin Hou, Ming-Ming Cheng
We employ a method that transforms the generated videos into 3D models, leveraging the premise that the accuracy of 3D reconstruction is heavily contingent on the video quality.
no code implementations • 20 Feb 2024 • Penghai Zhao, Xin Zhang, Jiayue Cao, Ming-Ming Cheng, Jian Yang, Xiang Li
This paper presents a thorough analysis of these literature reviews within the PAMI field, and tries to address three core research questions: (1) What are the prevalent structural and statistical characteristics of PAMI literature reviews?
1 code implementation • CVPR 2024 • Xusheng Cao, Haori Lu, Linlan Huang, Xialei Liu, Ming-Ming Cheng
In class-incremental learning (CIL) scenarios, the phenomenon of catastrophic forgetting caused by the classifier's bias towards the current task has long posed a significant challenge.
1 code implementation • 20 Dec 2023 • Jiang-Tian Zhai, Xialei Liu, Lu Yu, Ming-Ming Cheng
Considering this challenge, we propose a novel framework of fine-grained knowledge selection and restoration.
1 code implementation • 15 Dec 2023 • Senmao Li, Taihang Hu, Joost Van de Weijer, Fahad Shahbaz Khan, Tao Liu, Linxuan Li, Shiqi Yang, Yaxing Wang, Ming-Ming Cheng, Jian Yang
This insight motivates us to omit encoder computation at certain adjacent time-steps and reuse encoder features of previous time-steps as input to the decoder in multiple time-steps.
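A schematic sketch of this caching idea follows, assuming the denoising network is split into `encoder` and `decoder` callables; the step schedule, reuse interval, and placeholder update rule are illustrative rather than the paper's actual sampler:

```python
import torch

def sample_with_encoder_reuse(encoder, decoder, scheduler_steps, x, reuse_every=2):
    """Denoising loop that recomputes encoder features only every `reuse_every`
    steps and reuses the cached features at the steps in between."""
    cached = None
    for i, t in enumerate(scheduler_steps):
        if cached is None or i % reuse_every == 0:
            cached = encoder(x, t)          # expensive: full encoder pass
        eps = decoder(x, t, cached)         # cheap: decoder reuses cached features
        x = x - 0.1 * eps                   # placeholder update; a real sampler
                                            # would apply its own update rule
    return x

# toy stand-ins for the encoder/decoder
encoder = lambda x, t: x.mean(dim=1, keepdim=True)
decoder = lambda x, t, feats: x * 0.01 + feats
out = sample_with_encoder_reuse(encoder, decoder, range(10), torch.randn(1, 4, 8, 8))
```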
1 code implementation • 12 Dec 2023 • Kangneng Zhou, Daiheng Gao, Xuan Wang, Jie Zhang, Peng Zhang, Xusen Sun, Longhao Zhang, Shiqi Yang, Bang Zhang, Liefeng Bo, Yaxing Wang, Ming-Ming Cheng
This enhances masked-based editing in local areas; second, we present a novel distillation strategy: Conditional Distillation on Geometry and Texture (CDGT).
1 code implementation • 10 Dec 2023 • Yunheng Li, Zhongyu Li, ShangHua Gao, Qilong Wang, Qibin Hou, Ming-Ming Cheng
Effectively modeling discriminative spatio-temporal information is essential for segmenting activities in long action sequences.
1 code implementation • CVPR 2024 • Zhen Li, Mingdeng Cao, Xintao Wang, Zhongang Qi, Ming-Ming Cheng, Ying Shan
Recent advances in text-to-image generation have made remarkable progress in synthesizing realistic human photos conditioned on given text prompts.
Ranked #6 on Diffusion Personalization Tuning Free on AgeDB
Diffusion Personalization Tuning Free, Text-to-Image Generation
no code implementations • CVPR 2024 • Xuying Zhang, Bo-Wen Yin, Yuming Chen, Zheng Lin, Yunheng Li, Qibin Hou, Ming-Ming Cheng
In particular, a cross-modal graph is constructed to accurately align the object points with the noun phrases decoupled from the 3D mesh and the textual description.
no code implementations • 31 Oct 2023 • Xialei Liu, Xusheng Cao, Haori Lu, Jia-Wen Xiao, Andrew D. Bagdanov, Ming-Ming Cheng
We also propose a method for parameter retention in the adapter layers that uses a measure of parameter importance to better maintain stability and plasticity during incremental learning.
1 code implementation • 20 Oct 2023 • Zhaohui Zheng, Yuming Chen, Qibin Hou, Xiang Li, Ping Wang, Ming-Ming Cheng
A fundamental limitation of object detectors is that they suffer from "spatial bias", and in particular perform less satisfactorily when detecting objects near image borders.
1 code implementation • 8 Oct 2023 • Yu-Huan Wu, Shi-Chen Zhang, Yun Liu, Le Zhang, Xin Zhan, Daquan Zhou, Jiashi Feng, Ming-Ming Cheng, Liangli Zhen
Semantic segmentation tasks naturally require high-resolution information for pixel-wise segmentation and global context information for class prediction.
no code implementations • 8 Oct 2023 • Zhong-Yu Li, Bo-Wen Yin, Yongxiang Liu, Li Liu, Ming-Ming Cheng
Thus, we propose Heterogeneous Self-Supervised Learning (HSSL), which enforces a base model to learn from an auxiliary head whose architecture is heterogeneous from the base model.
1 code implementation • 18 Sep 2023 • Bowen Yin, Xuying Zhang, Zhongyu Li, Li Liu, Ming-Ming Cheng, Qibin Hou
We present DFormer, a novel RGB-D pretraining framework to learn transferable representations for RGB-D segmentation tasks.
Ranked #1 on RGB-D Salient Object Detection on DES
1 code implementation • ICCV 2023 • Jiang-Tian Zhai, Xialei Liu, Andrew D. Bagdanov, Ke Li, Ming-Ming Cheng
Moreover, MAEs can reliably reconstruct original input images from randomly selected patches, which we use to store exemplars from past tasks more efficiently for CIL.
1 code implementation • 10 Aug 2023 • Yuming Chen, Xinbin Yuan, Ruiqi Wu, Jiabao Wang, Qibin Hou, Ming-Ming Cheng
We aim at providing the object detection community with an efficient and performant object detector, termed YOLO-MS.
1 code implementation • ICCV 2023 • Xin Jin, Jia-Wen Xiao, Ling-Hao Han, Chunle Guo, Xialei Liu, Chongyi Li, Ming-Ming Cheng
However, these methods are impeded by several critical limitations: a) the explicit calibration process is both labor- and time-intensive, b) challenge exists in transferring denoisers across different camera models, and c) the disparity between synthetic and real noise is exacerbated by digital gain.
Ranked #1 on Image Denoising on SID SonyA7S2 x300
1 code implementation • 6 Jul 2023 • Yun Liu, Yu-Huan Wu, Shi-Chen Zhang, Li Liu, Min Wu, Ming-Ming Cheng
This dataset enables the training of sophisticated detectors for high-quality CTD.
1 code implementation • CVPR 2024 • Jiabao Wang, Yuming Chen, Zhaohui Zheng, Xiang Li, Ming-Ming Cheng, Qibin Hou
Moreover, as mimicking the teacher's predictions is the target of KD, CrossKD offers more task-oriented information in contrast with feature imitation.
1 code implementation • 13 Jun 2023 • Xuying Zhang, Bowen Yin, Zheng Lin, Qibin Hou, Deng-Ping Fan, Ming-Ming Cheng
We consider the problem of referring camouflaged object detection (Ref-COD), a new task that aims to segment specified camouflaged objects based on a small set of referring images with salient target objects.
1 code implementation • CVPR 2024 • Boyuan Sun, YuQi Yang, Le Zhang, Ming-Ming Cheng, Qibin Hou
Motivated by these, we aim to improve the use efficiency of unlabeled data by designing two novel label propagation strategies.
no code implementations • 24 May 2023 • Cheng-Ze Lu, Xiaojie Jin, Qibin Hou, Jun Hao Liew, Ming-Ming Cheng, Jiashi Feng
The study reveals that: 1) MIM can be viewed as an effective method to improve the model capacity when the scale of the training data is relatively small; 2) Strong reconstruction targets can endow the models with increased capacities on downstream tasks; 3) MIM pre-training is data-agnostic under most scenarios, which means that the strategy of sampling pre-training data is non-critical.
no code implementations • CVPR 2023 • Ze-Xin Yin, Jiaxiong Qiu, Ming-Ming Cheng, Bo Ren
Existing Neural Radiance Fields (NeRF) methods suffer from the existence of reflective objects, often resulting in blurry or distorted rendering.
1 code implementation • CVPR 2023 • Guolei Sun, Xiaogang Cheng, Zhaochong An, Xiaokang Wang, Yun Liu, Deng-Ping Fan, Ming-Ming Cheng, Luc van Gool
We further advance the frontier of this field by systematically studying a new challenge named indiscernible object counting (IOC), the goal of which is to count objects that are blended with respect to their surroundings.
1 code implementation • 21 Apr 2023 • Deng-Ping Fan, Ge-Peng Ji, Peng Xu, Ming-Ming Cheng, Christos Sakaridis, Luc van Gool
Concealed scene understanding (CSU) is a hot computer vision topic aiming to perceive objects exhibiting camouflage.
3 code implementations • CVPR 2023 • Zhen Li, Zuo-Liang Zhu, Ling-Hao Han, Qibin Hou, Chun-Le Guo, Ming-Ming Cheng
It is based on two essential designs.
1 code implementation • CVPR 2023 • Jiaxiong Qiu, Peng-Tao Jiang, Yifan Zhu, Ze-Xin Yin, Ming-Ming Cheng, Bo Ren
To remedy this issue, we present a novel surface reconstruction framework, NeuS-HSR, based on implicit neural rendering.
no code implementations • 12 Apr 2023 • Ge-Peng Ji, Deng-Ping Fan, Peng Xu, Ming-Ming Cheng, BoWen Zhou, Luc van Gool
Segmenting anything is a ground-breaking step toward artificial general intelligence, and the Segment Anything Model (SAM) greatly fosters the foundation models for computer vision.
1 code implementation • 28 Mar 2023 • Senmao Li, Joost Van de Weijer, Taihang Hu, Fahad Shahbaz Khan, Qibin Hou, Yaxing Wang, Jian Yang, Ming-Ming Cheng
A significant research effort is focused on exploiting the capabilities of pretrained diffusion models for image editing. Existing methods either fine-tune the model or invert the image into the latent space of the pretrained model.
Ranked #9 on Text-based Image Editing on PIE-Bench
1 code implementation • ICCV 2023 • ShangHua Gao, Pan Zhou, Ming-Ming Cheng, Shuicheng Yan
To solve this issue, we propose a Masked Diffusion Transformer (MDT) that introduces a mask latent modeling scheme to explicitly enhance the DPMs' ability to learn contextual relations among object semantic parts in an image.
Ranked #16 on Image Generation on ImageNet 256x256
1 code implementation • ICCV 2023 • Yupeng Zhou, Zhen Li, Chun-Le Guo, Li Liu, Ming-Ming Cheng, Qibin Hou
Without any bells and whistles, we show that our SRFormer achieves a 33.86dB PSNR score on the Urban100 dataset, which is 0.46dB higher than that of SwinIR but uses fewer parameters and computations.
1 code implementation • ICCV 2023 • YuXuan Li, Qibin Hou, Zhaohui Zheng, Ming-Ming Cheng, Jian Yang, Xiang Li
To the best of our knowledge, this is the first time that large and selective kernel mechanisms have been explored in the field of remote sensing object detection.
Ranked #2 on Oriented Object Detection on DOTA 1.0
1 code implementation • 14 Mar 2023 • Ziyue Zhu, Zhao Zhang, Zheng Lin, Xing Sun, Ming-Ming Cheng
Such irrelevant information in the co-representation interferes with its locating of co-salient objects.
1 code implementation • CVPR 2024 • Peng-Tao Jiang, YuQi Yang, Yang Cao, Qibin Hou, Ming-Ming Cheng, Chunhua Shen
To date, most existing datasets focus on autonomous driving scenes.
no code implementations • 2 Feb 2023 • Weimin Shi, Mingchen Zhuge, Dehong Gao, Zhong Zhou, Ming-Ming Cheng, Deng-Ping Fan
Daily images may convey abstract meanings that require us to memorize and infer profound information from them.
no code implementations • 15 Jan 2023 • Cheng-Ze Lu, Xiaojie Jin, Zhicheng Huang, Qibin Hou, Ming-Ming Cheng, Jiashi Feng
Contrastive Masked Autoencoder (CMAE), as a new self-supervised framework, has shown its potential of learning expressive feature representations in visual image recognition.
1 code implementation • 14 Jan 2023 • Zhaohui Zheng, Yuming Chen, Qibin Hou, Xiang Li, Ming-Ming Cheng
In this paper, we study the spatial disequilibrium problem of modern object detectors and propose to quantify this ``spatial bias'' by measuring the detection performance over zones.
no code implementations • ICCV 2023 • Jiang-Tian Zhai, Qi Zhang, Tong Wu, Xing-Yu Chen, Jiang-Jiang Liu, Ming-Ming Cheng
By aggregating vision-language information, the region filter selects key regions and the region adaptor updates their coordinates with text guidance.
no code implementations • CVPR 2023 • Jia-Wen Xiao, Chang-Bin Zhang, Jiekang Feng, Xialei Liu, Joost Van de Weijer, Ming-Ming Cheng
In our method, the model containing old knowledge is fused with the model retaining new knowledge in a dynamic fusion manner, strengthening the memory of old classes in ever-changing distributions.
class-incremental learning, Class-Incremental Semantic Segmentation, +2
1 code implementation • CVPR 2024 • Xialei Liu, Jiang-Tian Zhai, Andrew D. Bagdanov, Ke Li, Ming-Ming Cheng
EFCIL is of interest because it mitigates concerns about privacy and long-term storage of data, while at the same time alleviating the problem of catastrophic forgetting in incremental learning.
no code implementations • 28 Nov 2022 • Jiang-Tian Zhai, Qi Zhang, Tong Wu, Xing-Yu Chen, Jiang-Jiang Liu, Bo Ren, Ming-Ming Cheng
By aggregating cross-modal information, the region filter selects key regions and the region adaptor updates their coordinates with text guidance.
2 code implementations • 22 Nov 2022 • Qibin Hou, Cheng-Ze Lu, Ming-Ming Cheng, Jiashi Feng
This paper does not attempt to design a state-of-the-art method for visual recognition but investigates a more efficient way to make use of convolutions to encode spatial features.
1 code implementation • 20 Oct 2022 • ShangHua Gao, Pan Zhou, Ming-Ming Cheng, Shuicheng Yan
In this work, we explore a sustainable SSL framework with two major challenges: i) learning a stronger new SSL model based on the existing pretrained SSL model, also called the "base" model, in a cost-friendly manner, and ii) allowing the training of the new model to be compatible with various base models.
Ranked #1 on Semantic Segmentation on ImageNet-S
1 code implementation • 8 Oct 2022 • Shijie Li, Ming-Ming Cheng, Juergen Gall
The goal of semantic image synthesis is to generate photo-realistic images from semantic label maps.
1 code implementation • 1 Oct 2022 • Xialei Liu, Yu-Song Hu, Xu-Sheng Cao, Andrew D. Bagdanov, Ke Li, Ming-Ming Cheng
However, conventional CIL methods consider a balanced distribution for each new task, which ignores the prevalence of long-tailed distributions in the real world.
3 code implementations • 18 Sep 2022 • Meng-Hao Guo, Cheng-Ze Lu, Qibin Hou, ZhengNing Liu, Ming-Ming Cheng, Shi-Min Hu
Notably, SegNeXt outperforms EfficientNet-L2 w/ NAS-FPN and achieves 90.6% mIoU on the Pascal VOC 2012 test leaderboard using only 1/10 of its parameters.
Ranked #1 on Semantic Segmentation on iSAID
no code implementations • 18 Aug 2022 • Yu-Huan Wu, Da Zhang, Le Zhang, Xin Zhan, Dengxin Dai, Yun Liu, Ming-Ming Cheng
Current efficient LiDAR-based detection frameworks fall short in exploiting object relations, which are naturally present in both the spatial and temporal domains.
1 code implementation • 27 Jul 2022 • Zhicheng Huang, Xiaojie Jin, Chengze Lu, Qibin Hou, Ming-Ming Cheng, Dongmei Fu, Xiaohui Shen, Jiashi Feng
The momentum encoder, fed with the full images, enhances the feature discriminability via contrastive learning with its online counterpart.
1 code implementation • 21 Jul 2022 • Zuo-Liang Zhu, Zhen Li, Rui-Xun Zhang, Chun-Le Guo, Ming-Ming Cheng
Lighting is a determining factor in photography that affects the style, expression of emotion, and even quality of images.
no code implementations • 5 Jul 2022 • Hongzhi Huang, Yu Wang, QinGhua Hu, Ming-Ming Cheng
In this study, we propose a novel method, called Class-Specific Semantic Reconstruction (CSSR), that integrates the power of AE and prototype learning.
2 code implementations • 14 Jun 2022 • ShangHua Gao, Zhong-Yu Li, Qi Han, Ming-Ming Cheng, Liang Wang
Our search scheme exploits both global search to find the coarse combinations and local search to get the refined receptive field combinations further.
Ranked #2 on Instance Segmentation on COCO 2017 val (AP metric)
1 code implementation • 10 Jun 2022 • Zhong-Yu Li, ShangHua Gao, Ming-Ming Cheng
Specifically, instead of conducting self-supervised learning solely on feature embeddings from multiple views, we utilize the feature self-relations, i.e., spatial/channel self-relations, for self-supervised learning.
Ranked #2 on Semantic Segmentation on ImageNet-S
1 code implementation • 13 May 2022 • YuChao Gu, Xintao Wang, Liangbin Xie, Chao Dong, Gen Li, Ying Shan, Ming-Ming Cheng
Equipped with the VQ codebook as a facial detail dictionary and the parallel decoder design, the proposed VQFR can largely enhance the restored quality of facial details while keeping the fidelity to previous methods.
1 code implementation • 12 Apr 2022 • Zhaohui Zheng, Rongguang Ye, Qibin Hou, Dongwei Ren, Ping Wang, WangMeng Zuo, Ming-Ming Cheng
Combining these two new components, we show for the first time that logit mimicking can outperform feature imitation, and that the absence of localization distillation is a critical reason why logit mimicking has underperformed for years.
2 code implementations • CVPR 2022 • Zhen Li, Cheng-Ze Lu, Jianhua Qin, Chun-Le Guo, Ming-Ming Cheng
Optical flow, which captures motion information across frames, is exploited in recent video inpainting methods through propagating pixels along its trajectories.
Ranked #2 on Seeing Beyond the Visible on KITTI360-EX
no code implementations • 25 Mar 2022 • Zheng Lin, Zhao Zhang, Kang-Rui Zhang, Bo Ren, Ming-Ming Cheng
Our IST method can serve as a brush: dip style from anywhere and paint it onto any region of the target content image.
1 code implementation • CVPR 2022 • Chang-Bin Zhang, Jia-Wen Xiao, Xialei Liu, Ying-Cong Chen, Ming-Ming Cheng
In this work, we study the continual semantic segmentation problem, where the deep neural networks are required to incorporate new classes continually without catastrophic forgetting.
Ranked #1 on Domain 1-1 on Cityscapes
Class Incremental Learning, Continual Semantic Segmentation, +16
21 code implementations • 20 Feb 2022 • Meng-Hao Guo, Cheng-Ze Lu, Zheng-Ning Liu, Ming-Ming Cheng, Shi-Min Hu
In this paper, we propose a novel linear attention named large kernel attention (LKA) to enable self-adaptive and long-range correlations in self-attention while avoiding its shortcomings.
Ranked #1 on Panoptic Segmentation on COCO panoptic
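One common way to realize such a large-kernel attention in this line of work is to decompose it into a depth-wise convolution, a depth-wise dilated convolution, and a point-wise convolution whose output re-weights the input. A minimal sketch, with kernel sizes chosen for illustration:

```python
import torch
import torch.nn as nn

class LargeKernelAttention(nn.Module):
    """Sketch of a large-kernel attention: depth-wise conv, depth-wise dilated
    conv, and point-wise conv produce an attention map that re-weights the
    input (kernel sizes are illustrative)."""
    def __init__(self, dim):
        super().__init__()
        self.dw = nn.Conv2d(dim, dim, 5, padding=2, groups=dim)
        self.dw_dilated = nn.Conv2d(dim, dim, 7, padding=9, dilation=3, groups=dim)
        self.pw = nn.Conv2d(dim, dim, 1)

    def forward(self, x):
        attn = self.pw(self.dw_dilated(self.dw(x)))
        return x * attn                       # element-wise re-weighting

y = LargeKernelAttention(32)(torch.randn(1, 32, 28, 28))
```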
no code implementations • 23 Jan 2022 • Ming-Ming Cheng, Peng-Tao Jiang, Ling-Hao Han, Liang Wang, Philip Torr
The proposed framework can generate a deep hierarchy of strongly associated supporting evidence for the network decision, which provides insight into the decision-making process.
2 code implementations • CVPR 2022 • Zheng Lin, Zheng-Peng Duan, Zhao Zhang, Chun-Le Guo, Ming-Ming Cheng
However, the global view makes the model lose focus on later clicks and is not in line with user intentions.
Ranked #5 on Interactive Segmentation on SBD
no code implementations • 17 Dec 2021 • Dingwen Zhang, Wenyuan Zeng, Guangyu Guo, Chaowei Fang, Lechao Cheng, Ming-Ming Cheng, Junwei Han
Current weakly supervised semantic segmentation (WSSS) frameworks usually contain the separated mask-refinement model and the main semantic region mining model.
Knowledge Distillation, Weakly supervised Semantic Segmentation, +1
1 code implementation • 17 Dec 2021 • Guangyu Guo, Dingwen Zhang, Longfei Han, Nian Liu, Ming-Ming Cheng, Junwei Han
Then, a Teacher-Assistant-Student (TAS) framework is further established to disentangle pixel distillation into the model compression stage and input compression stage, which significantly reduces the overall complexity of pixel distillation and the difficulty of distilling intermediate knowledge.
1 code implementation • 15 Nov 2021 • Meng-Hao Guo, Tian-Xing Xu, Jiang-Jiang Liu, Zheng-Ning Liu, Peng-Tao Jiang, Tai-Jiang Mu, Song-Hai Zhang, Ralph R. Martin, Ming-Ming Cheng, Shi-Min Hu
Humans can naturally and effectively find salient regions in complex scenes.
1 code implementation • ICCV 2021 • Yu Zhang, Chang-Bin Zhang, Peng-Tao Jiang, Ming-Ming Cheng, Feng Mao
In this paper, we address the problem of personalized image segmentation.
3 code implementations • 23 Jun 2021 • Qibin Hou, Zihang Jiang, Li Yuan, Ming-Ming Cheng, Shuicheng Yan, Jiashi Feng
By realizing the importance of the positional information carried by 2D feature representations, unlike recent MLP-like models that encode the spatial information along the flattened spatial dimensions, Vision Permutator separately encodes the feature representations along the height and width dimensions with linear projections.
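A toy sketch of encoding along height and width separately with linear projections (shapes and module names are illustrative, not the official implementation):

```python
import torch
import torch.nn as nn

class PermuteMLP(nn.Module):
    """Sketch of mixing features separately along the height, width, and
    channel dimensions with linear projections."""
    def __init__(self, dim, height, width):
        super().__init__()
        self.proj_h = nn.Linear(height, height)
        self.proj_w = nn.Linear(width, width)
        self.proj_c = nn.Linear(dim, dim)

    def forward(self, x):                     # x: (B, H, W, C)
        xh = self.proj_h(x.permute(0, 3, 2, 1)).permute(0, 3, 2, 1)  # mix along H
        xw = self.proj_w(x.permute(0, 1, 3, 2)).permute(0, 1, 3, 2)  # mix along W
        xc = self.proj_c(x)                                          # mix along C
        return xh + xw + xc

y = PermuteMLP(dim=32, height=14, width=14)(torch.randn(2, 14, 14, 32))
```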
4 code implementations • 22 Jun 2021 • Yu-Huan Wu, Yun Liu, Xin Zhan, Ming-Ming Cheng
A popular solution to this problem is to use a single pooling operation to reduce the sequence length.
Ranked #7 on RGB Salient Object Detection on DUTS-TE (max F-measure metric)
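For reference, a generic sketch of the single-pooling solution mentioned above, where keys and values come from a spatially pooled copy of the token sequence (the pooling ratio and single-head formulation are illustrative):

```python
import torch
import torch.nn.functional as F

def pooled_attention(x, h, w, pool=4):
    """Self-attention where keys and values come from a spatially pooled copy
    of the tokens, shrinking the sequence length (a generic single-head sketch)."""
    B, N, C = x.shape                                        # N == h * w
    kv = x.transpose(1, 2).reshape(B, C, h, w)
    kv = F.avg_pool2d(kv, pool).flatten(2).transpose(1, 2)   # (B, N / pool**2, C)
    attn = torch.softmax(x @ kv.transpose(1, 2) / C ** 0.5, dim=-1)
    return attn @ kv                                         # (B, N, C)

y = pooled_attention(torch.randn(2, 16 * 16, 64), h=16, w=16)
```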
3 code implementations • IEEE 2021 • Peng-Tao Jiang, Chang-Bin Zhang, Qibin Hou, Ming-Ming Cheng, Yunchao Wei
To evaluate the quality of the class activation maps produced by LayerCAM, we apply them to weakly-supervised object localization and semantic segmentation.
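A minimal sketch of this style of class activation mapping, assuming the feature maps and their gradients with respect to the target class score are already available (the normalization step is an illustrative choice):

```python
import torch
import torch.nn.functional as F

def layer_cam(activations, gradients):
    """Class activation map obtained by weighting each spatial location of each
    channel with its positive gradient and summing over channels.

    activations, gradients: (B, C, H, W) feature maps and their gradients
    w.r.t. the target class score.
    """
    weights = F.relu(gradients)                 # element-wise positive gradients
    cam = F.relu((weights * activations).sum(dim=1))   # (B, H, W)
    # normalize to [0, 1] per image for visualization
    cam = cam - cam.amin(dim=(1, 2), keepdim=True)
    cam = cam / (cam.amax(dim=(1, 2), keepdim=True) + 1e-6)
    return cam

cam = layer_cam(torch.randn(1, 256, 14, 14), torch.randn(1, 256, 14, 14))
```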
no code implementations • CVPR 2021 • Shang-Hua Gao, Qi Han, Duo Li, Ming-Ming Cheng, Pai Peng
We propose to add a simple yet effective feature calibration scheme into the centering and scaling operations of BatchNorm, enhancing the instance-specific representations with the negligible computational cost.
1 code implementation • ICLR 2022 • Qi Han, Zejia Fan, Qi Dai, Lei Sun, Ming-Ming Cheng, Jiaying Liu, Jingdong Wang
Sparse connectivity: there is no connection across channels, and each position is connected to the positions within a small local window.
3 code implementations • 6 Jun 2021 • ShangHua Gao, Zhong-Yu Li, Ming-Hsuan Yang, Ming-Ming Cheng, Junwei Han, Philip Torr
In this work, we propose a new problem of large-scale unsupervised semantic segmentation (LUSS) with a newly created benchmark dataset to help the research progress.
Ranked #1 on Unsupervised Semantic Segmentation on ImageNet-S-300
2 code implementations • 25 May 2021 • Jia-Wang Bian, Huangying Zhan, Naiyan Wang, Zhichao Li, Le Zhang, Chunhua Shen, Ming-Ming Cheng, Ian Reid
We propose a monocular depth estimator SC-Depth, which requires only unlabelled videos for training and enables the scale-consistent prediction at inference time.
2 code implementations • 7 May 2021 • Deng-Ping Fan, Jing Zhang, Gang Xu, Ming-Ming Cheng, Ling Shao
This design bias has led to a saturation in performance for state-of-the-art SOD models when evaluated on existing datasets.
1 code implementation • CVPR 2021 • Gang Xu, Jun Xu, Zhen Li, Liang Wang, Xing Sun, Ming-Ming Cheng
To well exploit the temporal information, we propose a Locally-temporal Feature Comparison (LFC) module, along with the Bi-directional Deformable ConvLSTM, to extract short-term and long-term motion cues in videos.
3 code implementations • 21 Apr 2021 • Chongyi Li, Chunle Guo, Linghao Han, Jun Jiang, Ming-Ming Cheng, Jinwei Gu, Chen Change Loy
Low-light image enhancement (LLIE) aims at improving the perception or interpretability of an image captured in an environment with poor illumination.
2 code implementations • CVPR 2022 • Zhaohui Zheng, Rongguang Ye, Ping Wang, Dongwei Ren, WangMeng Zuo, Qibin Hou, Ming-Ming Cheng
Previous KD methods for object detection mostly focus on imitating deep features within the imitation regions instead of mimicking classification logit due to its inefficiency in distilling localization information and trivial improvement.
1 code implementation • 20 Feb 2021 • Deng-Ping Fan, Ge-Peng Ji, Ming-Ming Cheng, Ling Shao
We present the first systematic study on concealed object detection (COD), which aims to identify objects that are "perfectly" embedded in their background.
Ranked #5 on Camouflaged Object Segmentation on CHAMELEON
Camouflaged Object Segmentation, Dichotomous Image Segmentation, +2
2 code implementations • CVPR 2021 • Shang-Hua Gao, Qi Han, Zhong-Yu Li, Pai Peng, Liang Wang, Ming-Ming Cheng
Our search scheme exploits both global search to find the coarse combinations and local search to get the refined receptive field combination patterns further.
Ranked #16 on Action Segmentation on Breakfast
no code implementations • ICCV 2021 • Yu-Chao Gu, Shang-Hua Gao, Xu-Sheng Cao, Peng Du, Shao-Ping Lu, Ming-Ming Cheng
Existing salient object detection (SOD) models usually focus on either backbone feature extractors or saliency heads, ignoring their relations.
1 code implementation • 24 Dec 2020 • Yu-Huan Wu, Yun Liu, Le Zhang, Ming-Ming Cheng, Bo Ren
In this paper, we tap into this gap and show that enhancing high-level features is essential for SOD as well.
1 code implementation • 24 Dec 2020 • Yu-Huan Wu, Yun Liu, Jun Xu, Jia-Wang Bian, Yu-Chao Gu, Ming-Ming Cheng
Therefore, we propose an implicit depth restoration (IDR) technique to strengthen the mobile networks' feature representation capability for RGB-D SOD.
no code implementations • 21 Dec 2020 • Jiang-Jiang Liu, Zhi-Ang Liu, Ming-Ming Cheng
Our approach can cooperate with various existing U-shape-based salient object detection methods by substituting the connections between the bottom-up and top-down pathways.
4 code implementations • NeurIPS 2020 • Wen-Da Jin, Jun Xu, Ming-Ming Cheng, Yi Zhang, Wei Guo
Intra-saliency and inter-saliency cues have been extensively studied for co-saliency detection (Co-SOD).
3 code implementations • 26 Nov 2020 • Qijian Zhang, Runmin Cong, Chongyi Li, Ming-Ming Cheng, Yuming Fang, Xiaochun Cao, Yao Zhao, Sam Kwong
Despite the remarkable advances in visual saliency analysis for natural scene images (NSIs), salient object detection (SOD) for optical remote sensing images (RSIs) still remains an open and challenging problem.
2 code implementations • 25 Nov 2020 • Chang-Bin Zhang, Peng-Tao Jiang, Qibin Hou, Yunchao Wei, Qi Han, Zhen Li, Ming-Ming Cheng
Experiments demonstrate that based on the same classification models, the proposed approach can effectively improve the classification performance on CIFAR-100, ImageNet, and fine-grained datasets.
no code implementations • 17 Oct 2020 • Yunchao Wei, Shuai Zheng, Ming-Ming Cheng, Hang Zhao, LiWei Wang, Errui Ding, Yi Yang, Antonio Torralba, Ting Liu, Guolei Sun, Wenguan Wang, Luc van Gool, Wonho Bae, Junhyug Noh, Jinhwan Seo, Gunhee Kim, Hao Zhao, Ming Lu, Anbang Yao, Yiwen Guo, Yurong Chen, Li Zhang, Chuangchuang Tan, Tao Ruan, Guanghua Gu, Shikui Wei, Yao Zhao, Mariia Dobko, Ostap Viniavskyi, Oles Dobosevych, Zhendong Wang, Zhenyuan Chen, Chen Gong, Huanqing Yan, Jun He
The purpose of the Learning from Imperfect Data (LID) workshop is to inspire and facilitate the research in developing novel approaches that would harness the imperfect data and improve the data-efficiency during training.
no code implementations • CVPR 2021 • Yu-Chao Gu, Li-Juan Wang, Yun Liu, Yi Yang, Yu-Huan Wu, Shao-Ping Lu, Ming-Ming Cheng
DARTS mainly focuses on the operation search and derives the cell topology from the operation weights.
1 code implementation • 10 Sep 2020 • Yun Liu, Yu-Huan Wu, Pei-Song Wen, Yu-Jun Shi, Yu Qiu, Ming-Ming Cheng
For each proposal, this MIL framework can simultaneously compute probability distributions and category-aware semantic features, with which we can formulate a large undirected graph.
Image-level Supervised Instance Segmentation, Multiple Instance Learning, +3
1 code implementation • 1 Sep 2020 • Yu-Chao Gu, Le Zhang, Yun Liu, Shao-Ping Lu, Ming-Ming Cheng
Recent generative methods formulate GZSL as a missing data problem, which mainly adopts GANs or VAEs to generate visual features for unseen classes.
1 code implementation • 28 Aug 2020 • Yu-Huan Wu, Yun Liu, Le Zhang, Wang Gao, Ming-Ming Cheng
Much of the recent efforts on salient object detection (SOD) have been devoted to producing accurate saliency maps without being aware of their instance labels.
9 code implementations • 1 Aug 2020 • Tao Zhou, Deng-Ping Fan, Ming-Ming Cheng, Jianbing Shen, Ling Shao
Further, considering that the light field can also provide depth maps, we review SOD models and popular benchmark datasets from this domain as well.
no code implementations • 10 Jul 2020 • Xiao-Chang Liu, Xuan-Yi Li, Ming-Ming Cheng, Peter Hall
Our contribution is to introduce a neural architecture that supports transfer of geometric style.
1 code implementation • 8 Jul 2020 • Xin-Yu Zhang, Taihong Xiao, HaoLin Jia, Ming-Ming Cheng, Ming-Hsuan Yang
In this work, we propose a simple yet effective meta-learning algorithm in semi-supervised learning.
2 code implementations • 7 Jul 2020 • Deng-Ping Fan, Tengpeng Li, Zheng Lin, Ge-Peng Ji, Dingwen Zhang, Ming-Ming Cheng, Huazhu Fu, Jianbing Shen
CoSOD is an emerging and rapidly growing extension of salient object detection (SOD), which aims to detect the co-occurring salient objects in a group of images.
Ranked #7 on Co-Salient Object Detection on CoCA
no code implementations • 3 Jul 2020 • Shipeng Fu, Zhen Li, Jun Xu, Ming-Ming Cheng, Zitao Liu, Xiaomin Yang
Knowledge distillation is a standard teacher-student learning framework to train a light-weight student network under the guidance of a well-trained large teacher network.
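As a reminder of the standard formulation, a generic teacher-student distillation loss might look like the following sketch (temperature and weighting are illustrative defaults, not this paper's settings):

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Standard teacher-student distillation: cross-entropy on the labels plus
    KL divergence between temperature-softened teacher and student distributions."""
    ce = F.cross_entropy(student_logits, labels)
    kl = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    return alpha * ce + (1 - alpha) * kl

loss = kd_loss(torch.randn(8, 10), torch.randn(8, 10), torch.randint(0, 10, (8,)))
```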
1 code implementation • 16 Jun 2020 • Shijie Li, Yazan Abu Farha, Yun Liu, Ming-Ming Cheng, Juergen Gall
Despite the capabilities of these approaches in capturing temporal dependencies, their predictions suffer from over-segmentation errors.
Ranked #6 on Action Segmentation on Assembly101
no code implementations • 6 May 2020 • Kai Zhao, Xin-Yu Zhang, Qi Han, Ming-Ming Cheng
Convolutional neural networks (CNNs) are typically over-parameterized, bringing considerable computational overhead and memory footprint in inference.
1 code implementation • ECCV 2020 • Zhao Zhang, Wenda Jin, Jun Xu, Ming-Ming Cheng
Co-saliency detection (Co-SOD) aims to segment the common salient foreground in a group of relevant images.
Ranked #7 on Co-Salient Object Detection on CoSOD3k
1 code implementation • 23 Apr 2020 • Ying-Jun Du, Jun Xu, Xian-Tong Zhen, Ming-Ming Cheng, Ling Shao
In this paper, we propose a Conditional Variational Image Deraining (CVID) network for better deraining performance, leveraging the exclusive generative ability of Conditional Variational Auto-Encoder (CVAE) on providing diverse predictions for the rainy image.
no code implementations • 18 Apr 2020 • Jiang-Jiang Liu, Qibin Hou, Ming-Ming Cheng
To evaluate the performance of our proposed network on these tasks, we conduct exhaustive experiments on multiple representative datasets.
1 code implementation • 15 Apr 2020 • Yu-Huan Wu, Shang-Hua Gao, Jie Mei, Jun Xu, Deng-Ping Fan, Rong-Guo Zhang, Ming-Ming Cheng
The chest CT scan test provides a valuable complementary tool to the RT-PCR test, and it can identify the patients in the early-stage with high sensitivity.
1 code implementation • 9 Apr 2020 • Lin-Zhuo Chen, Zheng Lin, Ziqin Wang, Yong-Liang Yang, Ming-Ming Cheng
S-Conv can infer the sampling offset of the convolution kernel guided by the 3D spatial information, helping the convolutional layer adjust its receptive field and adapt to geometric transformations.
Ranked #23 on Semantic Segmentation on SUN-RGBD (using extra training data)
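A rough sketch of a depth-guided, spatially adaptive convolution in this spirit, using torchvision's deformable convolution as a stand-in (the offset predictor and module name are hypothetical, not the S-Conv implementation):

```python
import torch
import torch.nn as nn
from torchvision.ops import deform_conv2d

class DepthGuidedConv(nn.Module):
    """Sketch: a small conv predicts per-pixel sampling offsets from the depth
    input, which then steer a deformable convolution over the RGB features."""
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        self.offset_pred = nn.Conv2d(1, 2 * k * k, 3, padding=1)   # offsets from depth
        self.weight = nn.Parameter(torch.randn(out_ch, in_ch, k, k) * 0.01)
        self.k = k

    def forward(self, feat, depth):
        offset = self.offset_pred(depth)                           # (B, 2*k*k, H, W)
        return deform_conv2d(feat, offset, self.weight, padding=self.k // 2)

y = DepthGuidedConv(16, 32)(torch.randn(1, 16, 24, 24), torch.randn(1, 1, 24, 24))
```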
2 code implementations • CVPR 2020 • Qibin Hou, Li Zhang, Ming-Ming Cheng, Jiashi Feng
Spatial pooling has been proven highly effective in capturing long-range contextual information for pixel-wise prediction tasks, such as scene parsing.
Ranked #32 on Semantic Segmentation on Cityscapes test
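One way this line of work captures such long-range context is to pool with long, narrow (strip-shaped) windows along each spatial dimension and broadcast the result back. A minimal sketch (the 1D convolutions and sigmoid gating are illustrative choices):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StripPooling(nn.Module):
    """Sketch of strip pooling: pool along one spatial dimension with a long,
    narrow window, then broadcast back and fuse."""
    def __init__(self, channels):
        super().__init__()
        self.conv_h = nn.Conv2d(channels, channels, (3, 1), padding=(1, 0))
        self.conv_w = nn.Conv2d(channels, channels, (1, 3), padding=(0, 1))
        self.fuse = nn.Conv2d(channels, channels, 1)

    def forward(self, x):
        h, w = x.shape[-2:]
        xh = F.adaptive_avg_pool2d(x, (h, 1))            # H x 1 strip: pool over width
        xw = F.adaptive_avg_pool2d(x, (1, w))            # 1 x W strip: pool over height
        xh = self.conv_h(xh).expand(-1, -1, h, w)
        xw = self.conv_w(xw).expand(-1, -1, h, w)
        return x * torch.sigmoid(self.fuse(xh + xw))     # gated fusion

y = StripPooling(32)(torch.randn(1, 32, 16, 24))
```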
1 code implementation • ECCV 2020 • Shang-Hua Gao, Yong-Qiang Tan, Ming-Ming Cheng, Chengze Lu, Yunpeng Chen, Shuicheng Yan
Salient object detection models often demand a considerable amount of computation cost to make precise prediction for each pixel, making them hardly applicable on low-power devices.
2 code implementations • ECCV 2020 • Kai Zhao, Qi Han, Chang-Bin Zhang, Jun Xu, Ming-Ming Cheng
In addition to the proposed method, we design an evaluation metric to assess the quality of line detection and construct a large scale dataset for the line detection task.
Ranked #2 on Line Detection on NKL
1 code implementation • 19 Feb 2020 • Xin-Yu Zhang, Kai Zhao, Taihong Xiao, Ming-Ming Cheng, Ming-Hsuan Yang
Recent advances in convolutional neural networks(CNNs) usually come with the expense of excessive computational overhead and memory footprint.
no code implementations • 24 Dec 2019 • Le Zhang, Zenglin Shi, Joey Tianyi Zhou, Ming-Ming Cheng, Yun Liu, Jia-Wang Bian, Zeng Zeng, Chunhua Shen
Specifically, with a diagnostic analysis, we show that the recurrent structure may not be as effective at learning temporal dependencies as we expected and implicitly yields an orderless representation.
no code implementations • 27 Nov 2019 • Xin-Yu Zhang, Le Zhang, Zao-Yi Zheng, Yun Liu, Jia-Wang Bian, Ming-Ming Cheng
The effectiveness of the triplet loss heavily relies on the triplet selection, in which a common practice is to first sample intra-class patches (positives) from the dataset for batch construction and then mine in-batch negatives to form triplets.
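A compact sketch of that practice, mining the hardest in-batch negative for each anchor-positive pair (assuming each batch index corresponds to a distinct class; the distance metric and margin are illustrative):

```python
import torch
import torch.nn.functional as F

def hardest_in_batch_triplet_loss(anchors, positives, margin=1.0):
    """Triplet loss where, for each (anchor, positive) pair, the negative is the
    closest non-matching positive in the batch (one pair per class assumed)."""
    d_ap = F.pairwise_distance(anchors, positives)       # (B,)
    d_an = torch.cdist(anchors, positives)               # (B, B) anchor-to-all distances
    d_an.fill_diagonal_(float("inf"))                    # exclude the true positive
    hardest_neg = d_an.min(dim=1).values                 # closest wrong match
    return F.relu(d_ap - hardest_neg + margin).mean()

loss = hardest_in_batch_triplet_loss(torch.randn(16, 128), torch.randn(16, 128))
```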
no code implementations • 25 Sep 2019 • Yujun Shi, Benben Liao, Guangyong Chen, Yun Liu, Ming-Ming Cheng, Jiashi Feng
Then, we show by experiments that DNNs under standard training rely heavily on optimizing the non-robust component in achieving decent performance.
1 code implementation • ICCV 2019 • Chaohao Xie, Shaohui Liu, Chao Li, Ming-Ming Cheng, WangMeng Zuo, Xiao Liu, Shilei Wen, Errui Ding
Most convolutional network (CNN)-based inpainting methods adopt standard convolution to indistinguishably treat valid pixels and holes, making them limited in handling irregular holes and more likely to generate inpainting results with color discrepancy and blurriness.
Ranked #2 on Image Inpainting on Paris StreetView
2 code implementations • NeurIPS 2019 • Jia-Wang Bian, Zhichao Li, Naiyan Wang, Huangying Zhan, Chunhua Shen, Ming-Ming Cheng, Ian Reid
To the best of our knowledge, this is the first work to show that deep networks trained using unlabelled monocular videos can predict globally scale-consistent camera trajectories over a long video sequence.
Ranked #4 on Camera Pose Estimation on KITTI Odometry Benchmark
no code implementations • 26 Aug 2019 • Jia-Wang Bian, Yu-Huan Wu, Ji Zhao, Yun Liu, Le Zhang, Ming-Ming Cheng, Ian Reid
According to this, we propose three high-quality matching systems and a Coarse-to-Fine RANSAC estimator.
no code implementations • 24 Aug 2019 • Le Zhang, Zenglin Shi, Ming-Ming Cheng, Yun Liu, Jia-Wang Bian, Joey Tianyi Zhou, Guoyan Zheng, Zeng Zeng
Nonlinear regression has been extensively employed in many computer vision problems (e. g., crowd counting, age estimation, affective computing).
3 code implementations • 22 Aug 2019 • Jia-Xing Zhao, Jiang-Jiang Liu, Deng-Ping Fan, Yang Cao, Jufeng Yang, Ming-Ming Cheng
In the second step, we integrate the local edge information and global location information to obtain the salient edge features.
1 code implementation • ICCV 2019 • Deng-Ping Fan, Shengchuan Zhang, Yu-Huan Wu, Yun Liu, Ming-Ming Cheng, Bo Ren, Paul L. Rosin, Rongrong Ji
In this paper, we design a perceptual metric, called Structure Co-Occurrence Texture (Scoot), which simultaneously considers the block-level spatial structure and co-occurrence texture statistics.
1 code implementation • 18 Aug 2019 • Jinshan Pan, Yang Liu, Deqing Sun, Jimmy Ren, Ming-Ming Cheng, Jian Yang, Jinhui Tang
We present a simple and effective image super-resolution algorithm that imposes an image formation constraint on the deep neural networks via pixel substitution.
2 code implementations • 15 Jul 2019 • Deng-Ping Fan, Zheng Lin, Jia-Xing Zhao, Yun Liu, Zhao Zhang, Qibin Hou, Menglong Zhu, Ming-Ming Cheng
The use of RGB-D information for salient object detection has been extensively explored in recent years.
Ranked #4 on RGB-D Salient Object Detection on RGBD135
1 code implementation • 17 Jun 2019 • Jun Xu, Yuan Huang, Ming-Ming Cheng, Li Liu, Fan Zhu, Zhou Xu, Ling Shao
A simple but useful observation about our NAC is that, as long as the noise is weak, it is feasible to learn a self-supervised network only with the corrupted image, approximating the optimal parameters of a supervised network learned with pairs of noisy and clean images.
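A minimal sketch of a training step in this spirit: add a second, simulated noise to the already-noisy observation and regress back to it (the noise model, noise level, and tiny network are assumptions for illustration):

```python
import torch
import torch.nn as nn

def nac_training_step(model, noisy, optimizer, sigma=0.05):
    """One 'noisy-as-clean'-style step: map a doubly-noisy input back to the
    original noisy observation, which is treated as the training target."""
    doubly_noisy = noisy + sigma * torch.randn_like(noisy)
    loss = ((model(doubly_noisy) - noisy) ** 2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.Conv2d(16, 3, 3, padding=1))
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
nac_training_step(model, torch.rand(4, 3, 32, 32), opt)
```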
no code implementations • 6 Jun 2019 • Yujun Shi, Benben Liao, Guangyong Chen, Yun Liu, Ming-Ming Cheng, Jiashi Feng
Despite many previous works studying the reason behind such adversarial behavior, the relationship between the generalization performance and adversarial behavior of DNNs is still little understood.
1 code implementation • 14 May 2019 • Lin-Zhuo Chen, Xuan-Yi Li, Deng-Ping Fan, Kai Wang, Shao-Ping Lu, Ming-Ming Cheng
We design a novel Local Spatial Aware (LSA) layer, which learns to generate Spatial Distribution Weights (SDWs) hierarchically, based on the spatial relationships in a local region, for spatially independent operations. This establishes the relationship between these operations and the spatial distribution, thereby capturing the local geometric structure sensitively. We further propose LSANet, which builds on the LSA layer to better aggregate spatial information with the associated features in each layer of the network. Experiments show that LSANet achieves performance on par with or better than state-of-the-art methods on challenging benchmark datasets.
5 code implementations • CVPR 2019 • Jiang-Jiang Liu, Qibin Hou, Ming-Ming Cheng, Jiashi Feng, Jianmin Jiang
We further design a feature aggregation module (FAM) to make the coarse-level semantic information well fused with the fine-level features from the top-down pathway.
Ranked #1 on RGB Salient Object Detection on SOD
32 code implementations • 2 Apr 2019 • Shang-Hua Gao, Ming-Ming Cheng, Kai Zhao, Xin-Yu Zhang, Ming-Hsuan Yang, Philip Torr
We evaluate the Res2Net block on all these models and demonstrate consistent performance gains over baseline models on widely-used datasets, e.g., CIFAR-100 and ImageNet.
Ranked #2 on Image Classification on GasHisSDB
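The Res2Net idea of hierarchical, residual-like connections between channel splits inside a single block can be sketched as follows (the number of scales and the plain 3x3 convolutions are illustrative):

```python
import torch
import torch.nn as nn

class Res2NetSplitConv(nn.Module):
    """Sketch of hierarchical multi-scale 3x3 convolutions inside one block:
    each channel split receives the previous split's output before its conv."""
    def __init__(self, channels, scales=4):
        super().__init__()
        assert channels % scales == 0
        self.scales = scales
        width = channels // scales
        self.convs = nn.ModuleList(
            nn.Conv2d(width, width, 3, padding=1) for _ in range(scales - 1)
        )

    def forward(self, x):
        splits = torch.chunk(x, self.scales, dim=1)
        out = [splits[0]]                        # first split: identity path
        prev = None
        for i, conv in enumerate(self.convs):
            inp = splits[i + 1] if prev is None else splits[i + 1] + prev
            prev = conv(inp)
            out.append(prev)                     # each split sees a larger receptive field
        return torch.cat(out, dim=1)

y = Res2NetSplitConv(64)(torch.randn(1, 64, 32, 32))   # same shape as the input
```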
1 code implementation • 28 Mar 2019 • Yun Liu, Ming-Ming Cheng, Xin-Yu Zhang, Guang-Yu Nie, Meng Wang
Recent progress on salient object detection mainly aims at exploiting how to effectively integrate multi-scale convolutional features in convolutional neural networks (CNNs).
no code implementations • 23 Jan 2019 • Jie Liang, Jufeng Yang, Ming-Ming Cheng, Paul L. Rosin, Liang Wang
In this paper we propose a unified framework to simultaneously discover the number of clusters and group the data points into them using subspace clustering.
no code implementations • 28 Dec 2018 • Yun Liu, Yu Qiu, Le Zhang, Jia-Wang Bian, Guang-Yu Nie, Ming-Ming Cheng
In this paper, we observe that the contexts of a natural image can be well expressed by a high-to-low self-learning of side-output convolutional features.
no code implementations • NeurIPS 2018 • Qibin Hou, Peng-Tao Jiang, Yunchao Wei, Ming-Ming Cheng
To test the quality of the generated attention maps, we employ the mined object regions as heuristic cues for learning semantic segmentation models.
no code implementations • ECCV 2018 • Ruochen Fan, Qibin Hou, Ming-Ming Cheng, Gang Yu, Ralph R. Martin, Shi-Min Hu
We also combine our method with Mask R-CNN for instance segmentation, and demonstrate for the first time the ability to perform weakly supervised instance segmentation using only keyword annotations.
Ranked #6 on Image-level Supervised Instance Segmentation on COCO test-dev (using extra training data)
no code implementations • 7 Aug 2018 • Jia-Wang Bian, Ruihan Yang, Yun Liu, Le Zhang, Ming-Ming Cheng, Ian Reid, WenHai Wu
This leads to a critical gap in this field: there are no standard datasets or evaluation metrics for fairly evaluating different feature matchers.
no code implementations • 1 Jul 2018 • Kai Zhao, Wei Shen, ShangHua Gao, Dandan Li, Ming-Ming Cheng
In natural images, the scales (thickness) of object skeletons may dramatically vary among objects and object parts.
1 code implementation • CVPR 2018 • Zenglin Shi, Le Zhang, Yun Liu, Xiaofeng Cao, Yangdong Ye, Ming-Ming Cheng, Guoyan Zheng
Deep convolutional networks (ConvNets) have achieved unprecedented performances on many computer vision tasks.
Ranked #9 on Crowd Counting on WorldExpo’10
2 code implementations • 26 May 2018 • Deng-Ping Fan, Cheng Gong, Yang Cao, Bo Ren, Ming-Ming Cheng, Ali Borji
The existing binary foreground map (FM) measures address various types of errors in either pixel-wise or structural ways.
no code implementations • 19 May 2018 • Yun Liu, Yujun Shi, Jia-Wang Bian, Le Zhang, Ming-Ming Cheng, Jiashi Feng
Collecting sufficient annotated data is very expensive in many applications, especially for pixel-level prediction tasks such as semantic segmentation.
no code implementations • ICCV 2019 • Kai Zhao, Shang-Hua Gao, Wenguan Wang, Ming-Ming Cheng
By reformulating the standard F-measure, we propose the relaxed F-measure, which is differentiable w.r.t. the posterior and can be easily appended to the back of CNNs as the loss function.
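A minimal sketch of a differentiable F-measure surrogate computed directly from soft predictions (β² = 0.3 is the common choice in saliency evaluation; the exact relaxation here is illustrative, not necessarily the paper's):

```python
import torch

def relaxed_f_measure_loss(pred, target, beta2=0.3, eps=1e-6):
    """Differentiable F-measure surrogate on soft predictions in [0, 1]."""
    pred = pred.flatten(1)              # (B, H*W)
    target = target.flatten(1)
    tp = (pred * target).sum(dim=1)     # soft true positives
    precision = tp / (pred.sum(dim=1) + eps)
    recall = tp / (target.sum(dim=1) + eps)
    f = (1 + beta2) * precision * recall / (beta2 * precision + recall + eps)
    return 1 - f.mean()                 # minimize 1 - F

loss = relaxed_f_measure_loss(torch.rand(2, 1, 64, 64),
                              (torch.rand(2, 1, 64, 64) > 0.5).float())
```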
1 code implementation • 9 Apr 2018 • Yun Liu, Ming-Ming Cheng, Deng-Ping Fan, Le Zhang, Jiawang Bian, DaCheng Tao
Semantic edge detection (SED), which aims at jointly extracting edges as well as their category information, has far-reaching applications in domains such as semantic segmentation, object proposal generation, and object recognition.
1 code implementation • 9 Apr 2018 • Deng-Ping Fan, Shengchuan Zhang, Yu-Huan Wu, Ming-Ming Cheng, Bo Ren, Rongrong Ji, Paul L. Rosin
However, human perception of the similarity of two sketches will consider both structure and texture as essential factors and is not sensitive to slight ("pixel-level") mismatches.
no code implementations • 27 Mar 2018 • Qibin Hou, Ming-Ming Cheng, Jiang-Jiang Liu, Philip H. S. Torr
In this paper, we improve semantic segmentation by automatically learning from Flickr images associated with a particular keyword, without relying on any explicit user annotations, thus substantially alleviating the dependence on accurate annotations when compared to previous weakly supervised methods.
no code implementations • 27 Mar 2018 • Qibin Hou, Jiang-Jiang Liu, Ming-Ming Cheng, Ali Borji, Philip H. S. Torr
Although these tasks are inherently very different, we show that our unified approach performs very well on all of them and works far better than current single-purpose state-of-the-art methods.
no code implementations • ECCV 2018 • Deng-Ping Fan, Ming-Ming Cheng, Jiang-Jiang Liu, Shang-Hua Gao, Qibin Hou, Ali Borji
Our analysis identifies a serious design bias of existing SOD datasets which assumes that each image contains at least one clearly outstanding salient object in low clutter.
no code implementations • 9 Mar 2018 • Runmin Cong, Jianjun Lei, Huazhu Fu, Ming-Ming Cheng, Weisi Lin, Qingming Huang
With the acquisition technology development, more comprehensive information, such as depth cue, inter-image correspondence, or temporal relationship, is available to extend image saliency detection to RGBD saliency detection, co-saliency detection, or video saliency detection.
1 code implementation • CVPR 2018 • Wenguan Wang, Jianbing Shen, Fang Guo, Ming-Ming Cheng, Ali Borji
Existing video saliency datasets lack variety and generality of common dynamic scenes and fall short in covering challenging situations in unconstrained environments.
no code implementations • 5 Jan 2018 • Kai Zhao, Wei Shen, Shang-Hua Gao, Dandan Li, Ming-Ming Cheng
In natural images, the scales (thickness) of object skeletons may dramatically vary among objects and object parts, making object skeleton detection a challenging problem.
Ranked #2 on Object Skeleton Detection on SK-LARGE
1 code implementation • CVPR 2019 • Ruochen Fan, Ming-Ming Cheng, Qibin Hou, Tai-Jiang Mu, Jingdong Wang, Shi-Min Hu
Taking into account the category-independent property of each target, we design a single stage salient instance segmentation framework, with a novel segmentation branch.
no code implementations • 12 Sep 2017 • Jia-Wang Bian, Le Zhang, Yun Liu, Wen-Yan Lin, Ming-Ming Cheng, Ian D. Reid
To this end, we present a uniform benchmark with novel evaluation metrics and a large-scale dataset for evaluating the overall performance of image matching methods.
1 code implementation • ICCV 2017 • Deng-Ping Fan, Ming-Ming Cheng, Yun Liu, Tao Li, Ali Borji
Our new measure simultaneously evaluates region-aware and object-aware structural similarity between a SM and a GT map.
1 code implementation • CVPR 2017 • Jia-Wang Bian, Wen-Yan Lin, Yasuyuki Matsushita, Sai-Kit Yeung, Tan-Dat Nguyen, Ming-Ming Cheng
Incorporating smoothness constraints into feature matching is known to enable ultra-robust matching.
no code implementations • CVPR 2017 • Yunchao Wei, Jiashi Feng, Xiaodan Liang, Ming-Ming Cheng, Yao Zhao, Shuicheng Yan
We investigate a principled way to progressively mine discriminative object regions using classification networks to address weakly-supervised semantic segmentation problems.
no code implementations • 7 Dec 2016 • Qinbin Hou, Puneet Kumar Dokania, Daniela Massiceti, Yunchao Wei, Ming-Ming Cheng, Philip Torr
We focus on the following three aspects of EM: (i) initialization; (ii) latent posterior estimation (E-step) and (iii) the parameter update (M-step).
Weakly-Supervised Semantic Segmentation
3 code implementations • CVPR 2017 • Yun Liu, Ming-Ming Cheng, Xiao-Wei Hu, Kai Wang, Xiang Bai
Using the VGG16 network, we achieve state-of-the-art results on several available datasets.
Ranked #5 on Edge Detection on BIPED
no code implementations • 6 Dec 2016 • Jia-Xing Zhao, Ren Bo, Qibin Hou, Ming-Ming Cheng, Paul L. Rosin
It also has drawbacks in convergence rate as a result of both the fixed search region and performing the assignment step and the update step separately.
4 code implementations • CVPR 2017 • Qibin Hou, Ming-Ming Cheng, Xiao-Wei Hu, Ali Borji, Zhuowen Tu, Philip Torr
Recent progress on saliency detection is substantial, benefiting mostly from the explosive development of Convolutional Neural Networks (CNNs).
Ranked #4 on RGB Salient Object Detection on SBU / SBU-Refine
no code implementations • 14 Nov 2015 • Ziming Zhang, Yun Liu, Xi Chen, Yanjun Zhu, Ming-Ming Cheng, Venkatesh Saligrama, Philip H. S. Torr
We propose a novel object proposal algorithm, BING++, which inherits the virtue of good computational efficiency of BING but significantly improves its proposal localization quality.
no code implementations • 13 Oct 2015 • Stuart Golodetz, Michael Sapienza, Julien P. C. Valentin, Vibhav Vineet, Ming-Ming Cheng, Anurag Arnab, Victor A. Prisacariu, Olaf Kähler, Carl Yuheng Ren, David W. Murray, Shahram Izadi, Philip H. S. Torr
We present an open-source, real-time implementation of SemanticPaint, a system for geometric reconstruction, object-class segmentation and learning of 3D scenes.
1 code implementation • 10 Sep 2015 • Yunchao Wei, Xiaodan Liang, Yunpeng Chen, Xiaohui Shen, Ming-Ming Cheng, Jiashi Feng, Yao Zhao, Shuicheng Yan
Then, a better network called Enhanced-DCNN is learned with supervision from the predicted segmentation masks of simple images based on the Initial-DCNN as well as the image-level annotations.
no code implementations • 5 Jan 2015 • Ali Borji, Ming-Ming Cheng, Huaizu Jiang, Jia Li
We extensively compare, qualitatively and quantitatively, 40 state-of-the-art models (28 salient object detection, 10 fixation prediction, 1 objectness, and 1 baseline) over 6 challenging datasets for the purpose of benchmarking salient object detection and segmentation methods.
no code implementations • 18 Nov 2014 • Ali Borji, Ming-Ming Cheng, Qibin Hou, Huaizu Jiang, Jia Li
Detecting and segmenting salient objects from natural scenes, often referred to as salient object detection, has attracted great interest in computer vision.
no code implementations • CVPR 2013 • Huaizu Jiang, Zejian yuan, Ming-Ming Cheng, Yihong Gong, Nanning Zheng, Jingdong Wang
Our method, which is based on multi-level image segmentation, utilizes the supervised learning approach to map the regional feature vector to a saliency score.
no code implementations • CVPR 2014 • Ming-Ming Cheng, Ziming Zhang, Wen-Yan Lin, Philip Torr
Training a generic objectness measure to produce a small set of candidate object windows has been shown to speed up the classical sliding window object detection paradigm.
no code implementations • CVPR 2014 • Shuai Zheng, Ming-Ming Cheng, Jonathan Warrell, Paul Sturgess, Vibhav Vineet, Carsten Rother, Philip H. S. Torr
The concepts of objects and attributes are both important for describing images precisely, since verbal descriptions often contain both adjectives and nouns (e.g., "I see a shiny red chair").
no code implementations • 16 Oct 2013 • Ming-Ming Cheng, Shuai Zheng, Wen-Yan Lin, Jonathan Warrell, Vibhav Vineet, Paul Sturgess, Nigel Crook, Niloy Mitra, Philip Torr
This allows us to formulate the image parsing problem as one of jointly estimating per-pixel object and attribute labels from a set of training images.
no code implementations • ACM Transactions on Graphics 2009 • Tao Chen, Ming-Ming Cheng, Ping Tan, Ariel Shamir, Shi-Min Hu
The composed picture is generated by seamlessly stitching several photographs in agreement with the sketch and text labels; these are found by searching the Internet.