Search Results for author: Xuming He

Found 66 papers, 27 papers with code

Dynamic Grained Encoder for Vision Transformers

1 code implementation NeurIPS 2021 Lin Song, Songyang Zhang, Songtao Liu, Zeming Li, Xuming He, Hongbin Sun, Jian Sun, Nanning Zheng

Specifically, we propose a Dynamic Grained Encoder for vision transformers, which can adaptively assign a suitable number of queries to each spatial region.

Image Classification Language Modelling +2

Modeling Multimodal Aleatoric Uncertainty in Segmentation with Mixture of Stochastic Expert

no code implementations 14 Dec 2022 Zhitong Gao, Yucong Chen, Chuyu Zhang, Xuming He

In this work, we focus on capturing the data-inherent uncertainty (aka aleatoric uncertainty) in segmentation, typically when ambiguities exist in input images.

CALIP: Zero-Shot Enhancement of CLIP with Parameter-free Attention

1 code implementation 28 Sep 2022 Ziyu Guo, Renrui Zhang, Longtian Qiu, Xianzheng Ma, Xupeng Miao, Xuming He, Bin Cui

Contrastive Language-Image Pre-training (CLIP) has been shown to learn visual representations with great transferability, which achieves promising accuracy for zero-shot classification.

Transfer Learning Zero-Shot Learning

Part-aware Prototypical Graph Network for One-shot Skeleton-based Action Recognition

no code implementations 19 Aug 2022 Tailin Chen, Desen Zhou, Jian Wang, Shidong Wang, Qian He, Chuanyang Hu, Errui Ding, Yu Guan, Xuming He

In this paper, we study the problem of one-shot skeleton-based action recognition, which poses unique challenges in learning transferable representation from base classes to novel classes, particularly for fine-grained actions.

Action Recognition Meta-Learning +1

Learning Semantic Correspondence with Sparse Annotations

no code implementations 15 Aug 2022 Shuaiyi Huang, Luyu Yang, Bo He, Songyang Zhang, Xuming He, Abhinav Shrivastava

In this paper, we aim to address the challenge of label sparsity in semantic correspondence by enriching supervision signals from sparse keypoint annotations.

Denoising Semantic correspondence

A Novel Unified Conditional Score-based Generative Framework for Multi-modal Medical Image Completion

no code implementations 7 Jul 2022 Xiangxi Meng, Yuning Gu, Yongsheng Pan, Nizhuan Wang, Peng Xue, Mengkang Lu, Xuming He, Yiqiang Zhan, Dinggang Shen

Multi-modal medical image completion has been extensively applied to alleviate the missing modality issue in a wealth of multi-modal diagnostic tasks.

Mutual Information-guided Knowledge Transfer for Novel Class Discovery

no code implementations 24 Jun 2022 Chuyu Zhang, Chuanyang Hu, Ruijie Xu, Zhitong Gao, Qian He, Xuming He

Our insight is to utilize mutual information to measure the relation between seen classes and unseen classes in a restricted label space; maximizing this mutual information promotes the transfer of semantic knowledge.

Novel Class Discovery Transfer Learning

Automatic spinal curvature measurement on ultrasound spine images using Faster R-CNN

no code implementations 17 Apr 2022 Zhichao Liu, Liyue Qian, Wenke Jing, Desen Zhou, Xuming He, Edmond Lou, Rui Zheng

The framework consisted of two closely linked modules: 1) the lamina detector for identifying and locating each lamina pair on ultrasound coronal images, and 2) the spinal curvature estimator for calculating the scoliotic angles based on the chain of detected laminae.

General Incremental Learning with Domain-aware Categorical Representations

no code implementations CVPR 2022 Jiangwei Xie, Shipeng Yan, Xuming He

Continual learning is an important problem for achieving human-level intelligence in real-world applications as an agent must continuously accumulate knowledge in response to streaming data/tasks.

class-incremental learning Incremental Learning

Intention-aware Feature Propagation Network for Interactive Segmentation

no code implementations 10 Mar 2022 Chuyu Zhang, Chuanyang Hu, Yongfei Liu, Xuming He

We aim to tackle the problem of point-based interactive segmentation, in which two key challenges are to infer the user's intention correctly and to propagate the user-provided annotations to unlabeled regions efficiently.

Interactive Segmentation

Weakly Supervised Nuclei Segmentation via Instance Learning

1 code implementation 3 Feb 2022 Weizhen Liu, Qian He, Xuming He

Weakly supervised nuclei segmentation is a critical problem for pathological image analysis and greatly benefits the community due to the significant reduction of labeling cost.

Instance Segmentation Representation Learning +1

Budget-aware Few-shot Learning via Graph Convolutional Network

no code implementations 7 Jan 2022 Shipeng Yan, Songyang Zhang, Xuming He

In this work, we introduce a new budget-aware few-shot learning problem that not only aims to learn novel object categories, but also needs to select informative examples to annotate in order to achieve data efficiency.

Few-Shot Learning Informativeness

SGTR: End-to-end Scene Graph Generation with Transformer

1 code implementation CVPR 2022 Rongjie Li, Songyang Zhang, Xuming He

Scene Graph Generation (SGG) remains a challenging visual understanding task due to its compositional property.

graph construction Graph Generation +1

KD-VLP: Improving End-to-End Vision-and-Language Pretraining with Object Knowledge Distillation

1 code implementation Findings (NAACL) 2022 Yongfei Liu, Chenfei Wu, Shao-Yen Tseng, Vasudev Lal, Xuming He, Nan Duan

Self-supervised vision-and-language pretraining (VLP) aims to learn transferable multi-modal representations from large-scale image-text data and to achieve strong performances on a broad scope of vision-language tasks after finetuning.

Knowledge Distillation Representation Learning

Single Image 3D Object Estimation with Primitive Graph Networks

1 code implementation 9 Sep 2021 Qian He, Desen Zhou, Bo Wan, Xuming He

To address those challenges, we adopt a primitive-based representation for 3D objects, and propose a two-stage graph network for primitive-based 3D object estimation, which consists of a sequential proposal module and a graph reasoning module.

Scene Understanding

Learning Multi-Granular Spatio-Temporal Graph Network for Skeleton-based Action Recognition

1 code implementation 10 Aug 2021 Tailin Chen, Desen Zhou, Jian Wang, Shidong Wang, Yu Guan, Xuming He, Errui Ding

The task of skeleton-based action recognition remains a core challenge in human-centred scene understanding due to the multiple granularities and large variation in human motion.

Action Recognition Scene Understanding +1

An EM Framework for Online Incremental Learning of Semantic Segmentation

1 code implementation 8 Aug 2021 Shipeng Yan, Jiale Zhou, Jiangwei Xie, Songyang Zhang, Xuming He

Incremental learning of semantic segmentation has emerged as a promising strategy for visual scene interpretation in the open-world setting.

Incremental Learning Semantic Segmentation

Superpixel-guided Iterative Learning from Noisy Labels for Medical Image Segmentation

1 code implementation 21 Jul 2021 Shuailin Li, Zhitong Gao, Xuming He

Learning segmentation from noisy labels is an important task for medical image analysis due to the difficulty in acquiring high-quality annotations.

Image Segmentation Medical Image Segmentation +2

Learning Implicit Temporal Alignment for Few-shot Video Classification

1 code implementation 11 May 2021 Songyang Zhang, Jiale Zhou, Xuming He

Few-shot video classification aims to learn new video categories with only a few labeled examples, alleviating the burden of costly annotation in real-world applications.

Action Recognition In Videos Classification +2

Weakly Supervised Volumetric Segmentation via Self-taught Shape Denoising Model

1 code implementation 27 Apr 2021 Qian He, Shuailin Li, Xuming He

Moreover, we introduce a weak annotation scheme with a hybrid label design for volumetric images, which improves model learning without increasing the overall annotation cost.

Denoising Weakly supervised segmentation

GNeRF: GAN-based Neural Radiance Field without Posed Camera

1 code implementation ICCV 2021 Quan Meng, Anpei Chen, Haimin Luo, Minye Wu, Hao Su, Lan Xu, Xuming He, Jingyi Yu

We introduce GNeRF, a framework to marry Generative Adversarial Networks (GAN) with Neural Radiance Field (NeRF) reconstruction for the complex scenarios with unknown and even randomly initialized camera poses.

Novel View Synthesis

Relation-aware Instance Refinement for Weakly Supervised Visual Grounding

1 code implementation CVPR 2021 Yongfei Liu, Bo Wan, Lin Ma, Xuming He

Visual grounding, which aims to build a correspondence between visual objects and their language entities, plays a key role in cross-modal scene understanding.

Scene Understanding Visual Grounding

Smoothed Quantile Regression with Large-Scale Inference

1 code implementation 9 Dec 2020 Xuming He, Xiaoou Pan, Kean Ming Tan, Wen-Xin Zhou

Our numerical studies confirm the conquer estimator as a practical and reliable approach to large-scale inference for quantile regression.

Statistics Theory Methodology

Confidence-aware Adversarial Learning for Self-supervised Semantic Matching

no code implementations 25 Aug 2020 Shuaiyi Huang, Qiuyue Wang, Xuming He

We are the first to exploit confidence during refinement to improve semantic matching accuracy and to develop an end-to-end self-supervised adversarial learning procedure for the entire matching network.

Self-Supervised Learning Semantic correspondence

LGNN: A Context-aware Line Segment Detector

no code implementations 13 Aug 2020 Quan Meng, Jiakai Zhang, Qiang Hu, Xuming He, Jingyi Yu

We present a novel real-time line segment detection scheme called Line Graph Neural Network (LGNN).

Line Segment Detection

Towards Purely Unsupervised Disentanglement of Appearance and Shape for Person Images Generation

no code implementations 26 Jul 2020 Hongtao Yang, Tong Zhang, Wenbing Huang, Xuming He, Fatih Porikli

To be clear, in this paper, we refer to unsupervised learning as learning without task-specific human annotations, pairs or any form of weak supervision.

Disentanglement

Shape-aware Semi-supervised 3D Semantic Segmentation for Medical Images

4 code implementations 21 Jul 2020 Shuailin Li, Chuyu Zhang, Xuming He

Semi-supervised learning has attracted much attention in medical image segmentation due to challenges in acquiring pixel-wise image annotations, which is a crucial step for building high-performance deep learning methods.

3D Semantic Segmentation Image Segmentation +1

Learning Context-aware Task Reasoning for Efficient Meta-reinforcement Learning

no code implementations 3 Mar 2020 Haozhe Wang, Jiale Zhou, Xuming He

Despite recent success of deep network-based Reinforcement Learning (RL), it remains elusive to achieve human-level efficiency in learning novel tasks.

Meta-Learning Meta Reinforcement Learning +2

Learning a Layout Transfer Network for Context Aware Object Detection

no code implementations 9 Dec 2019 Tao Wang, Xuming He, Yuanzheng Cai, Guobao Xiao

We present a context aware object detection method based on a retrieve-and-transform scene layout model.

Autonomous Driving object-detection +1

Learning Cross-modal Context Graph for Visual Grounding

2 code implementations 20 Nov 2019 Yongfei Liu, Bo Wan, Xiaodan Zhu, Xuming He

To address their limitations, this paper proposes a language-guided graph representation to capture the global context of grounding entities and their relations, and develops a cross-modal graph matching strategy for the multiple-phrase visual grounding task.

Graph Matching Visual Grounding

Pose-aware Multi-level Feature Network for Human Object Interaction Detection

1 code implementation ICCV 2019 Bo Wan, Desen Zhou, Yongfei Liu, Rongjie Li, Xuming He

Reasoning about human-object interactions is a core problem in human-centric scene understanding, and detecting such relations poses a unique challenge to vision systems due to large variations in human-object configurations, multiple co-occurring relation instances and subtle visual differences between relation categories.

Human-Object Interaction Detection Scene Understanding

Dynamic Context Correspondence Network for Semantic Alignment

1 code implementation ICCV 2019 Shuaiyi Huang, Qiuyue Wang, Songyang Zhang, Shipeng Yan, Xuming He

We instantiate our strategy by designing an end-to-end learnable deep network, named as Dynamic Context Correspondence Network (DCCNet).

Semantic correspondence

LatentGNN: Learning Efficient Non-local Relations for Visual Recognition

1 code implementation 28 May 2019 Songyang Zhang, Shipeng Yan, Xuming He

A promising strategy is to model the feature context by a fully-connected graph neural network (GNN), which augments traditional convolutional features with an estimated non-local context representation.

Fixed-price Diffusion Mechanism Design

no code implementations 14 May 2019 Tianyi Zhang, Dengji Zhao, Wen Zhang, Xuming He

We consider a fixed-price mechanism design setting where a seller sells one item via a social network, but the seller can only directly communicate with her neighbours initially.

SemStyle: Learning to Generate Stylised Image Captions using Unaligned Text

1 code implementation CVPR 2018 Alexander Mathews, Lexing Xie, Xuming He

We develop a model that learns to generate visually relevant styled captions from a large corpus of styled text without aligned images.

Image Captioning Language Modelling

Simplifying Sentences with Sequence to Sequence Models

no code implementations 15 May 2018 Alexander Mathews, Lexing Xie, Xuming He

We simplify sentences with an attentive neural network sequence to sequence model, dubbed S4.

Style Transfer Text Generation +1

Geometry-aware Deep Network for Single-Image Novel View Synthesis

no code implementations CVPR 2018 Miaomiao Liu, Xuming He, Mathieu Salzmann

By contrast, in this paper, we propose to exploit the 3D geometry of the scene to synthesize a novel view.

Novel View Synthesis

Deep Free-Form Deformation Network for Object-Mask Registration

no code implementations ICCV 2017 Haoyang Zhang, Xuming He

In this work, we take a transformation based approach that predicts a 2D non-rigid spatial transform and warps the shape mask onto the target object.

Semantic Segmentation

Indoor Scene Parsing With Instance Segmentation, Semantic Labeling and Support Relationship Inference

no code implementations CVPR 2017 Wei Zhuo, Mathieu Salzmann, Xuming He, Miaomiao Liu

In particular, while some of them aim at segmenting the image into regions, such as object or surface instances, others aim at inferring the semantic labels of given regions, or their support relationships.

Instance Segmentation Scene Parsing +1

Predicting Salient Face in Multiple-Face Videos

1 code implementation CVPR 2017 Yufan Liu, Songyang Zhang, Mai Xu, Xuming He

On the other hand, we find that the attention of different subjects consistently focuses on a single face in each frame of videos involving multiple faces.

Saliency Prediction

Boundary-aware Instance Segmentation

no code implementations CVPR 2017 Zeeshan Hayder, Xuming He, Mathieu Salzmann

In this context, existing methods typically propose candidate objects, usually as bounding boxes, and directly predict a binary mask within each such proposal.

Instance Segmentation Object Proposal Generation +1

Learning Dynamic Hierarchical Models for Anytime Scene Labeling

no code implementations 11 Aug 2016 Buyu Liu, Xuming He

With increasing demand for efficient image and video analysis, test-time cost of scene parsing becomes critical for many large-scale or time-sensitive vision applications.

Model Selection Representation Learning +2

Learning deep structured network for weakly supervised change detection

no code implementations 7 Jun 2016 Salman H. Khan, Xuming He, Fatih Porikli, Mohammed Bennamoun, Ferdous Sohel, Roberto Togneri

We apply a constrained mean-field algorithm to estimate the pixel-level labels, and use the estimated labels to update the parameters of the CNN in an iterative EM framework.

Change Detection

Learning to Co-Generate Object Proposals With a Deep Structured Network

no code implementations CVPR 2016 Zeeshan Hayder, Xuming He, Mathieu Salzmann

In particular, we introduce a deep structured network that jointly predicts the objectness scores and the bounding box locations of multiple object candidates.

Object Detection

Semantic-Aware Depth Super-Resolution in Outdoor Scenes

no code implementations 31 May 2016 Miaomiao Liu, Mathieu Salzmann, Xuming He

Despite much progress, state-of-the-art techniques suffer from two drawbacks: (i) they rely on the assumption that intensity edges coincide with depth discontinuities, which, unfortunately, is only true in controlled environments; and (ii) they typically exploit the availability of high-resolution training depth maps, which can often not be acquired in practice due to the sensors' limitations.

Super-Resolution

Structural Kernel Learning for Large Scale Multiclass Object Co-Detection

no code implementations ICCV 2015 Zeeshan Hayder, Xuming He, Mathieu Salzmann

To exploit the correlations between objects, we build a fully-connected CRF on the candidates, which explicitly incorporates both geometric layout relations across object classes and similarity relations across multiple images.

Object Detection

SentiCap: Generating Image Descriptions with Sentiments

no code implementations 6 Oct 2015 Alexander Mathews, Lexing Xie, Xuming He

We design a system to describe an image with emotions, and present a model that automatically generates captions with positive or negative sentiments.

Decision Making Image Captioning +1

Indoor Scene Structure Analysis for Single Image Depth Estimation

no code implementations CVPR 2015 Wei Zhuo, Mathieu Salzmann, Xuming He, Miaomiao Liu

We tackle the problem of single image depth estimation, which, without additional knowledge, suffers from many ambiguities.

Depth Estimation

Separating Objects and Clutter in Indoor Scenes

no code implementations CVPR 2015 Salman H. Khan, Xuming He, Mohammed Bennamoun, Ferdous Sohel, Roberto Togneri

Objects' spatial layout estimation and clutter identification are two important tasks to understand indoor scenes.

Multiclass Semantic Video Segmentation With Object-Level Active Inference

no code implementations CVPR 2015 Buyu Liu, Xuming He

To scale up our method, we adopt an active inference strategy to improve the efficiency, which adaptively selects object subgraphs in the object-augmented dense CRF.

Semantic Segmentation Video Segmentation +1

An Exemplar-based CRF for Multi-instance Object Segmentation

no code implementations CVPR 2014 Xuming He, Stephen Gould

We address the problem of joint detection and segmentation of multiple object instances in an image, a key step towards scene understanding.

Instance Segmentation Scene Understanding +1

Winding Number for Region-Boundary Consistent Salient Contour Extraction

no code implementations CVPR 2013 Yansheng Ming, Hongdong Li, Xuming He

The main focus is given to how to maintain the consistency (compatibility) between the region cues and the boundary cues.

Boundary Detection

A unified model of short-range and long-range motion perception

no code implementations NeurIPS 2010 Shuang Wu, Xuming He, Hongjing Lu, Alan L. Yuille

The human vision system is able to effortlessly perceive both short-range and long-range motion patterns in complex dynamic scenes.

Learning Hybrid Models for Image Annotation with Partially Labeled Data

no code implementations NeurIPS 2008 Xuming He, Richard S. Zemel

Extensive labeled data for image annotation systems, which learn to assign class labels to image regions, is difficult to obtain.
