Search Results for author: Qibin Hou

Found 65 papers, 44 papers with code

Multi-Task Dense Prediction via Mixture of Low-Rank Experts

1 code implementation 26 Mar 2024 YuQi Yang, Peng-Tao Jiang, Qibin Hou, Hao Zhang, Jinwei Chen, Bo Li

Furthermore, to control the parameters and computational cost brought by the increase in the number of experts, we take inspiration from LoRA and propose to leverage the low-rank format of a vanilla convolution in the expert network.
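As a rough illustration of why a low-rank format keeps the cost of additional experts down, the sketch below compares the weight count of a vanilla k×k convolution with one possible low-rank factorization. The factorization used here (a k×k convolution into `rank` channels followed by a 1×1 convolution), the function names, and the channel sizes are assumptions chosen for illustration, not the paper's implementation.

```python
def conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weight count of a vanilla k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def low_rank_conv_params(c_in: int, c_out: int, k: int, rank: int) -> int:
    """Weight count of an assumed low-rank form:
    a k x k convolution c_in -> rank, then a 1 x 1 convolution rank -> c_out."""
    return c_in * rank * k * k + rank * c_out


if __name__ == "__main__":
    # Hypothetical sizes: 256 input/output channels, 3 x 3 kernel, rank 16.
    full = conv_params(256, 256, 3)
    low = low_rank_conv_params(256, 256, 3, 16)
    print(full, low, round(low / full, 4))
```

With these hypothetical sizes, each low-rank expert needs well under a tenth of the weights of a full convolution, which is the kind of saving that makes adding more experts affordable.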

LSKNet: A Foundation Lightweight Backbone for Remote Sensing

1 code implementation 18 Mar 2024 YuXuan Li, Xiang Li, Yimian Dai, Qibin Hou, Li Liu, Yongxiang Liu, Ming-Ming Cheng, Jian Yang

While a considerable amount of research has been dedicated to remote sensing classification, object detection and semantic segmentation, most of these studies have overlooked the valuable prior knowledge embedded within remote sensing scenarios.

object-detection Object Detection +1

SARDet-100K: Towards Open-Source Benchmark and ToolKit for Large-Scale SAR Object Detection

1 code implementation 11 Mar 2024 YuXuan Li, Xiang Li, Weijie Li, Qibin Hou, Li Liu, Ming-Ming Cheng, Jian Yang

To the best of our knowledge, SARDet-100K is the first COCO-level large-scale multi-class SAR object detection dataset ever created.

 Ranked #1 on 2D Object Detection on SARDet-100K (using extra training data)

2k Object +2

Sora Generates Videos with Stunning Geometrical Consistency

no code implementations 27 Feb 2024 XuanYi Li, Daquan Zhou, Chenxu Zhang, Shaodong Wei, Qibin Hou, Ming-Ming Cheng

We employ a method that transforms the generated videos into 3D models, leveraging the premise that the accuracy of 3D reconstruction is heavily contingent on the video quality.

3D Reconstruction Video Generation

Fast Window-Based Event Denoising with Spatiotemporal Correlation Enhancement

no code implementations 14 Feb 2024 Huachen Fang, Jinjian Wu, Qibin Hou, Weisheng Dong, Guangming Shi

Previous deep learning-based event denoising methods mostly suffer from poor interpretability and difficulty in real-time processing due to their complex architecture designs.

Denoising

Get What You Want, Not What You Don't: Image Content Suppression for Text-to-Image Diffusion Models

1 code implementation 8 Feb 2024 Senmao Li, Joost Van de Weijer, Taihang Hu, Fahad Shahbaz Khan, Qibin Hou, Yaxing Wang, Jian Yang

However, these models struggle to effectively suppress the generation of undesired content, which is explicitly requested to be omitted from the generated image in the prompt.

Polyper: Boundary Sensitive Polyp Segmentation

1 code implementation 14 Dec 2023 Hao Shao, Yang Zhang, Qibin Hou

We present a new boundary sensitive framework for polyp segmentation, called Polyper.

Segmentation

MCANet: Medical Image Segmentation with Multi-Scale Cross-Axis Attention

1 code implementation 14 Dec 2023 Hao Shao, Quansheng Zeng, Qibin Hou, Jufeng Yang

To process the significant variations of lesion regions or organs in individual sizes and shapes, we also use multiple convolutions of strip-shape kernels with different kernel sizes in each axial attention path to improve the efficiency of the proposed MCA in encoding spatial information.

Image Segmentation Lesion Segmentation +4

A Decoupled Spatio-Temporal Framework for Skeleton-based Action Segmentation

1 code implementation 10 Dec 2023 Yunheng Li, Zhongyu Li, ShangHua Gao, Qilong Wang, Qibin Hou, Ming-Ming Cheng

Effectively modeling discriminative spatio-temporal information is essential for segmenting activities in long action sequences.

Action Segmentation

TeMO: Towards Text-Driven 3D Stylization for Multi-Object Meshes

no code implementations 7 Dec 2023 Xuying Zhang, Bo-Wen Yin, Yuming Chen, Zheng Lin, Yunheng Li, Qibin Hou, Ming-Ming Cheng

Particularly, a cross-modal graph is constructed to align the object points accurately and noun phrases decoupled from the 3D mesh and textual description.

Graph Attention Object

ChatAnything: Facetime Chat with LLM-Enhanced Personas

no code implementations 12 Nov 2023 Yilin Zhao, Xinbin Yuan, ShangHua Gao, Zhijie Lin, Qibin Hou, Jiashi Feng, Daquan Zhou

For MoV, we utilize the text-to-speech (TTS) algorithms with a variety of pre-defined tones and select the most matching one based on the user-provided text description automatically.

In-Context Learning Novel Concepts +2

Zone Evaluation: Revealing Spatial Bias in Object Detection

1 code implementation 20 Oct 2023 Zhaohui Zheng, Yuming Chen, Qibin Hou, Xiang Li, Ping Wang, Ming-Ming Cheng

A fundamental limitation of object detectors is that they suffer from "spatial bias", and in particular perform less satisfactorily when detecting objects near image borders.

Object object-detection +1

MaskDiffusion: Boosting Text-to-Image Consistency with Conditional Mask

no code implementations 8 Sep 2023 Yupeng Zhou, Daquan Zhou, Zuo-Liang Zhu, Yaxing Wang, Qibin Hou, Jiashi Feng

In this work, we identify that a crucial factor leading to the text-image mismatch issue is the inadequate cross-modality relation learning between the prompt and the output image.

YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-time Object Detection

1 code implementation 10 Aug 2023 Yuming Chen, Xinbin Yuan, Ruiqi Wu, Jiabao Wang, Qibin Hou, Ming-Ming Cheng

We aim at providing the object detection community with an efficient and performant object detector, termed YOLO-MS.

Object object-detection +2

CrossKD: Cross-Head Knowledge Distillation for Object Detection

1 code implementation 20 Jun 2023 Jiabao Wang, Yuming Chen, Zhaohui Zheng, Xiang Li, Ming-Ming Cheng, Qibin Hou

Moreover, as mimicking the teacher's predictions is the target of KD, CrossKD offers more task-oriented information in contrast with feature imitation.

Dense Object Detection Knowledge Distillation +3

Referring Camouflaged Object Detection

1 code implementation 13 Jun 2023 Xuying Zhang, Bowen Yin, Zheng Lin, Qibin Hou, Deng-Ping Fan, Ming-Ming Cheng

We consider the problem of referring camouflaged object detection (Ref-COD), a new task that aims to segment specified camouflaged objects based on a small set of referring images with salient target objects.

Object object-detection +1

CorrMatch: Label Propagation via Correlation Matching for Semi-Supervised Semantic Segmentation

1 code implementation 7 Jun 2023 Boyuan Sun, YuQi Yang, Le Zhang, Ming-Ming Cheng, Qibin Hou

Motivated by these, we aim to improve the use efficiency of unlabeled data by designing two novel label propagation strategies.

Segmentation Semi-Supervised Semantic Segmentation

Delving Deeper into Data Scaling in Masked Image Modeling

no code implementations 24 May 2023 Cheng-Ze Lu, Xiaojie Jin, Qibin Hou, Jun Hao Liew, Ming-Ming Cheng, Jiashi Feng

The study reveals that: 1) MIM can be viewed as an effective method to improve the model capacity when the scale of the training data is relatively small; 2) Strong reconstruction targets can endow the models with increased capacities on downstream tasks; 3) MIM pre-training is data-agnostic under most scenarios, which means that the strategy of sampling pre-training data is non-critical.

Self-Supervised Learning

StyleDiffusion: Prompt-Embedding Inversion for Text-Based Editing

1 code implementation 28 Mar 2023 Senmao Li, Joost Van de Weijer, Taihang Hu, Fahad Shahbaz Khan, Qibin Hou, Yaxing Wang, Jian Yang

A significant research effort is focused on exploiting the amazing capacities of pretrained diffusion models for the editing of images.

Text-based Image Editing

SRFormer: Permuted Self-Attention for Single Image Super-Resolution

1 code implementation ICCV 2023 Yupeng Zhou, Zhen Li, Chun-Le Guo, Song Bai, Ming-Ming Cheng, Qibin Hou

Previous works have shown that increasing the window size for Transformer-based image super-resolution models (e.g., SwinIR) can significantly improve the model performance but the computation overhead is also considerable.

Image Super-Resolution

Large Selective Kernel Network for Remote Sensing Object Detection

1 code implementation ICCV 2023 YuXuan Li, Qibin Hou, Zhaohui Zheng, Ming-Ming Cheng, Jian Yang, Xiang Li

To the best of our knowledge, this is the first time that large and selective kernel mechanisms have been explored in the field of remote sensing object detection.

Object object-detection +3

CMAE-V: Contrastive Masked Autoencoders for Video Action Recognition

no code implementations 15 Jan 2023 Cheng-Ze Lu, Xiaojie Jin, Zhicheng Huang, Qibin Hou, Ming-Ming Cheng, Jiashi Feng

Contrastive Masked Autoencoder (CMAE), as a new self-supervised framework, has shown its potential of learning expressive feature representations in visual image recognition.

Action Recognition Temporal Action Localization

Towards Spatial Equilibrium Object Detection

1 code implementation 14 Jan 2023 Zhaohui Zheng, Yuming Chen, Qibin Hou, Xiang Li, Ming-Ming Cheng

In this paper, we study the spatial disequilibrium problem of modern object detectors and propose to quantify this "spatial bias" by measuring the detection performance over zones.

Object object-detection +1

Deep Negative Correlation Classification

no code implementations 14 Dec 2022 Le Zhang, Qibin Hou, Yun Liu, Jia-Wang Bian, Xun Xu, Joey Tianyi Zhou, Ce Zhu

Ensemble learning serves as a straightforward way to improve the performance of almost any machine learning algorithm.

Classification Ensemble Learning

Conv2Former: A Simple Transformer-Style ConvNet for Visual Recognition

1 code implementation 22 Nov 2022 Qibin Hou, Cheng-Ze Lu, Ming-Ming Cheng, Jiashi Feng

This paper does not attempt to design a state-of-the-art method for visual recognition but investigates a more efficient way to make use of convolutions to encode spatial features.

object-detection Object Detection +1

SegNeXt: Rethinking Convolutional Attention Design for Semantic Segmentation

3 code implementations 18 Sep 2022 Meng-Hao Guo, Cheng-Ze Lu, Qibin Hou, ZhengNing Liu, Ming-Ming Cheng, Shi-Min Hu

Notably, SegNeXt outperforms EfficientNet-L2 w/ NAS-FPN and achieves 90.6% mIoU on the Pascal VOC 2012 test leaderboard using only 1/10 parameters of it.

Segmentation Semantic Segmentation

Contrastive Masked Autoencoders are Stronger Vision Learners

1 code implementation 27 Jul 2022 Zhicheng Huang, Xiaojie Jin, Chengze Lu, Qibin Hou, Ming-Ming Cheng, Dongmei Fu, Xiaohui Shen, Jiashi Feng

The momentum encoder, fed with the full images, enhances the feature discriminability via contrastive learning with its online counterpart.

Contrastive Learning Image Classification +3

Localization Distillation for Object Detection

1 code implementation 12 Apr 2022 Zhaohui Zheng, Rongguang Ye, Qibin Hou, Dongwei Ren, Ping Wang, WangMeng Zuo, Ming-Ming Cheng

Combining these two new components, for the first time, we show that logit mimicking can outperform feature imitation and that the absence of localization distillation is a critical reason why logit mimicking has underperformed for years.

Knowledge Distillation Object +2

L2G: A Simple Local-to-Global Knowledge Transfer Framework for Weakly Supervised Semantic Segmentation

1 code implementation CVPR 2022 Peng-Tao Jiang, YuQi Yang, Qibin Hou, Yunchao Wei

Our framework conducts the global network to learn the captured rich object detail knowledge from a global view and thereby produces high-quality attention maps that can be directly used as pseudo annotations for semantic segmentation networks.

Object Transfer Learning +2

VOLO: Vision Outlooker for Visual Recognition

7 code implementations 24 Jun 2021 Li Yuan, Qibin Hou, Zihang Jiang, Jiashi Feng, Shuicheng Yan

Though recently the prevailing vision transformers (ViTs) have shown great potential of self-attention based models in ImageNet classification, their performance is still inferior to that of the latest SOTA CNNs if no extra data are provided.

Domain Generalization Image Classification +1

Vision Permutator: A Permutable MLP-Like Architecture for Visual Recognition

4 code implementations 23 Jun 2021 Qibin Hou, Zihang Jiang, Li Yuan, Ming-Ming Cheng, Shuicheng Yan, Jiashi Feng

By realizing the importance of the positional information carried by 2D feature representations, unlike recent MLP-like models that encode the spatial information along the flattened spatial dimensions, Vision Permutator separately encodes the feature representations along the height and width dimensions with linear projections.
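The idea of encoding height and width separately with linear projections can be sketched on a single-channel feature map as follows. This is a deliberately simplified illustration: the function names, the single-channel input, and the plain summation of the two branches are assumptions, and the sketch omits the channel branch and other details of the actual model.

```python
from typing import List

Matrix = List[List[float]]


def mix_height(x: Matrix, w_h: Matrix) -> Matrix:
    """Linear projection along the height axis: out[i][j] = sum_k w_h[i][k] * x[k][j]."""
    h, w = len(x), len(x[0])
    return [[sum(w_h[i][k] * x[k][j] for k in range(h)) for j in range(w)]
            for i in range(h)]


def mix_width(x: Matrix, w_w: Matrix) -> Matrix:
    """Linear projection along the width axis: out[i][j] = sum_k w_w[j][k] * x[i][k]."""
    h, w = len(x), len(x[0])
    return [[sum(w_w[j][k] * x[i][k] for k in range(w)) for j in range(w)]
            for i in range(h)]


def mix_height_and_width(x: Matrix, w_h: Matrix, w_w: Matrix) -> Matrix:
    """Encode each axis separately, then sum the two branches (an assumed fusion rule)."""
    a, b = mix_height(x, w_h), mix_width(x, w_w)
    return [[a[i][j] + b[i][j] for j in range(len(x[0]))] for i in range(len(x))]


if __name__ == "__main__":
    x = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]                 # 2 x 3 feature map
    w_h = [[0.0, 1.0], [1.0, 0.0]]                         # swaps the two rows
    w_w = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # identity along width
    print(mix_height_and_width(x, w_h, w_w))
```

Because each projection acts on one axis only, the 2D layout of the feature map is preserved, in contrast to flattening both spatial dimensions into one before mixing.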

LV-BERT: Exploiting Layer Variety for BERT

1 code implementation Findings (ACL) 2021 Weihao Yu, Zihang Jiang, Fei Chen, Qibin Hou, Jiashi Feng

In this paper, beyond this stereotyped layer pattern, we aim to improve pre-trained models by exploiting layer variety from two aspects: the layer type set and the layer order.

LayerCAM: Exploring Hierarchical Class Activation Maps for Localization

3 code implementations IEEE 2021 Peng-Tao Jiang, Chang-Bin Zhang, Qibin Hou, Ming-Ming Cheng, Yunchao Wei

To evaluate the quality of the class activation maps produced by LayerCAM, we apply them to weakly-supervised object localization and semantic segmentation.

Object Semantic Segmentation +1

Refiner: Refining Self-attention for Vision Transformers

1 code implementation 7 Jun 2021 Daquan Zhou, Yujun Shi, Bingyi Kang, Weihao Yu, Zihang Jiang, Yuan Li, Xiaojie Jin, Qibin Hou, Jiashi Feng

Vision Transformers (ViTs) have shown competitive accuracy in image classification tasks compared with CNNs.

Image Classification

FakeMix Augmentation Improves Transparent Object Detection

1 code implementation 24 Mar 2021 Yang Cao, Zhengqiang Zhang, Enze Xie, Qibin Hou, Kai Zhao, Xiangui Luo, Jian Tuo

However, these methods usually encounter a boundary-related imbalance problem, leading to limited generation capability.

Data Augmentation Object +3

DeepViT: Towards Deeper Vision Transformer

5 code implementations 22 Mar 2021 Daquan Zhou, Bingyi Kang, Xiaojie Jin, Linjie Yang, Xiaochen Lian, Zihang Jiang, Qibin Hou, Jiashi Feng

In this paper, we show that, unlike convolution neural networks (CNNs) that can be improved by stacking more convolutional layers, the performance of ViTs saturates fast when scaled to be deeper.

Image Classification Representation Learning

AutoSpace: Neural Architecture Search with Less Human Interference

1 code implementation ICCV 2021 Daquan Zhou, Xiaojie Jin, Xiaochen Lian, Linjie Yang, Yujing Xue, Qibin Hou, Jiashi Feng

Current neural architecture search (NAS) algorithms still require expert knowledge and effort to design a search space for network construction.

Neural Architecture Search

Coordinate Attention for Efficient Mobile Network Design

2 code implementations CVPR 2021 Qibin Hou, Daquan Zhou, Jiashi Feng

Recent studies on mobile network design have demonstrated the remarkable effectiveness of channel attention (e.g., the Squeeze-and-Excitation attention) for lifting model performance, but they generally neglect the positional information, which is important for generating spatially selective attention maps.

object-detection Object Detection +1

Localization Distillation for Dense Object Detection

2 code implementations CVPR 2022 Zhaohui Zheng, Rongguang Ye, Ping Wang, Dongwei Ren, WangMeng Zuo, Qibin Hou, Ming-Ming Cheng

Previous KD methods for object detection mostly focus on imitating deep features within the imitation regions instead of mimicking classification logit due to its inefficiency in distilling localization information and trivial improvement.

Dense Object Detection Knowledge Distillation +2

Delving Deep into Label Smoothing

2 code implementations 25 Nov 2020 Chang-Bin Zhang, Peng-Tao Jiang, Qibin Hou, Yunchao Wei, Qi Han, Zhen Li, Ming-Ming Cheng

Experiments demonstrate that based on the same classification models, the proposed approach can effectively improve the classification performance on CIFAR-100, ImageNet, and fine-grained datasets.

Classification General Classification

Rotate to Attend: Convolutional Triplet Attention Module

5 code implementations 6 Oct 2020 Diganta Misra, Trikay Nalamada, Ajay Uppili Arasanipalai, Qibin Hou

In this paper, we investigate light-weight but effective attention mechanisms and present triplet attention, a novel method for computing attention weights by capturing cross-dimension interaction using a three-branch structure.

Image Classification Instance Segmentation +3

Rethinking Bottleneck Structure for Efficient Mobile Network Design

4 code implementations ECCV 2020 Zhou Daquan, Qibin Hou, Yunpeng Chen, Jiashi Feng, Shuicheng Yan

In this paper, we rethink the necessity of such design changes and find it may bring risks of information loss and gradient confusion.

General Classification Neural Architecture Search +2

Multi-Miner: Object-Adaptive Region Mining for Weakly-Supervised Semantic Segmentation

no code implementations 14 Jun 2020 Kuangqi Zhou, Qibin Hou, Zun Li, Jiashi Feng

In this paper, we propose a novel multi-miner framework to perform a region mining process that adapts to diverse object sizes and is thus able to mine more integral and finer object regions.

Object Segmentation +2

Dynamic Feature Integration for Simultaneous Detection of Salient Object, Edge and Skeleton

no code implementations 18 Apr 2020 Jiang-Jiang Liu, Qibin Hou, Ming-Ming Cheng

To evaluate the performance of our proposed network on these tasks, we conduct exhaustive experiments on multiple representative datasets.

Edge Detection Semantic Segmentation

Strip Pooling: Rethinking Spatial Pooling for Scene Parsing

2 code implementations CVPR 2020 Qibin Hou, Li Zhang, Ming-Ming Cheng, Jiashi Feng

Spatial pooling has been proven highly effective in capturing long-range contextual information for pixel-wise prediction tasks, such as scene parsing.

Scene Parsing Semantic Segmentation

Semantic Domain Adversarial Networks for Unsupervised Domain Adaptation

no code implementations 30 Mar 2020 Dapeng Hu, Jian Liang, Qibin Hou, Hanshu Yan, Yunpeng Chen, Shuicheng Yan, Jiashi Feng

To successfully align the multi-modal data structures across domains, the following works exploit discriminative information in the adversarial training process, e.g., using multiple class-wise discriminators and introducing conditional information in input or output of the domain discriminator.

Object Recognition Semantic Segmentation +1

Cross-layer Feature Pyramid Network for Salient Object Detection

no code implementations 25 Feb 2020 Zun Li, Congyan Lang, Junhao Liew, Qibin Hou, Yidong Li, Jiashi Feng

Feature pyramid network (FPN) based models, which fuse the semantics and salient details in a progressive manner, have been proven highly effective in salient object detection.

Object object-detection +2

Prototype-Assisted Adversarial Learning for Unsupervised Domain Adaptation

no code implementations 25 Sep 2019 Dapeng Hu, Jian Liang, Qibin Hou, Hanshu Yan, Jiashi Feng

Previous adversarial learning methods condition domain alignment only on pseudo labels, but noisy and inaccurate pseudo labels may perturb the multi-class distribution embedded in probabilistic predictions, hence bringing insufficient alleviation to the latent mismatch problem.

Object Recognition Semantic Segmentation +1

Neural Epitome Search for Architecture-Agnostic Network Compression

no code implementations ICLR 2020 Daquan Zhou, Xiaojie Jin, Qibin Hou, Kaixin Wang, Jianchao Yang, Jiashi Feng

The recent WSNet [1] is a new model compression method through sampling filter weights from a compact set and has been demonstrated to be effective for 1D convolution neural networks (CNNs).

Model Compression Neural Architecture Search

A Simple Pooling-Based Design for Real-Time Salient Object Detection

5 code implementations CVPR 2019 Jiang-Jiang Liu, Qibin Hou, Ming-Ming Cheng, Jiashi Feng, Jianmin Jiang

We further design a feature aggregation module (FAM) to make the coarse-level semantic information well fused with the fine-level features from the top-down pathway.

object-detection RGB Salient Object Detection +1

Self-Erasing Network for Integral Object Attention

no code implementations NeurIPS 2018 Qibin Hou, Peng-Tao Jiang, Yunchao Wei, Ming-Ming Cheng

To test the quality of the generated attention maps, we employ the mined object regions as heuristic cues for learning semantic segmentation models.

Object Semantic Segmentation

Associating Inter-Image Salient Instances for Weakly Supervised Semantic Segmentation

no code implementations ECCV 2018 Ruochen Fan, Qibin Hou, Ming-Ming Cheng, Gang Yu, Ralph R. Martin, Shi-Min Hu

We also combine our method with Mask R-CNN for instance segmentation, and demonstrate for the first time the ability of weakly supervised instance segmentation using only keyword annotations.

Clustering graph partitioning +6

Three Birds One Stone: A General Architecture for Salient Object Segmentation, Edge Detection and Skeleton Extraction

no code implementations 27 Mar 2018 Qibin Hou, Jiang-Jiang Liu, Ming-Ming Cheng, Ali Borji, Philip H. S. Torr

Although these tasks are inherently very different, we show that our unified approach performs very well on all of them and works far better than current single-purpose state-of-the-art methods.

Edge Detection Semantic Segmentation

WebSeg: Learning Semantic Segmentation from Web Searches

no code implementations 27 Mar 2018 Qibin Hou, Ming-Ming Cheng, Jiang-Jiang Liu, Philip H. S. Torr

In this paper, we improve semantic segmentation by automatically learning from Flickr images associated with a particular keyword, without relying on any explicit user annotations, thus substantially alleviating the dependence on accurate annotations when compared to previous weakly supervised methods.

Segmentation Semantic Segmentation

Salient Objects in Clutter: Bringing Salient Object Detection to the Foreground

no code implementations ECCV 2018 Deng-Ping Fan, Ming-Ming Cheng, Jiang-Jiang Liu, Shang-Hua Gao, Qibin Hou, Ali Borji

Our analysis identifies a serious design bias of existing SOD datasets which assumes that each image contains at least one clearly outstanding salient object in low clutter.

Attribute Object +3

S4Net: Single Stage Salient-Instance Segmentation

1 code implementation CVPR 2019 Ruochen Fan, Ming-Ming Cheng, Qibin Hou, Tai-Jiang Mu, Jingdong Wang, Shi-Min Hu

Taking into account the category-independent property of each target, we design a single stage salient instance segmentation framework, with a novel segmentation branch.

Instance Segmentation Segmentation +1

FLIC: Fast Linear Iterative Clustering with Active Search

no code implementations 6 Dec 2016 Jia-Xing Zhao, Ren Bo, Qibin Hou, Ming-Ming Cheng, Paul L. Rosin

It also has drawbacks on convergence rate as a result of both the fixed search region and separately doing the assignment step and the update step.

Clustering Segmentation

Deeply Supervised Salient Object Detection with Short Connections

4 code implementations CVPR 2017 Qibin Hou, Ming-Ming Cheng, Xiao-Wei Hu, Ali Borji, Zhuowen Tu, Philip Torr

Recent progress on saliency detection is substantial, benefiting mostly from the explosive development of Convolutional Neural Networks (CNNs).

Boundary Detection Object +5

Salient Object Detection: A Survey

no code implementations 18 Nov 2014 Ali Borji, Ming-Ming Cheng, Qibin Hou, Huaizu Jiang, Jia Li

Detecting and segmenting salient objects from natural scenes, often referred to as salient object detection, has attracted great interest in computer vision.

Object object-detection +4
