Search Results for author: Xuming He

Found 50 papers, 21 papers with code

KD-VLP: Improving End-to-End Vision-and-Language Pretraining with Object Knowledge Distillation

no code implementations 22 Sep 2021 Yongfei Liu, Chenfei Wu, Shao-Yen Tseng, Vasudev Lal, Xuming He, Nan Duan

Phrase-region alignment task aims to improve cross-modal alignment by utilizing the similarities between noun phrases and object labels in the linguistic space.

Knowledge Distillation Representation Learning

Single Image 3D Object Estimation with Primitive Graph Networks

1 code implementation 9 Sep 2021 Qian He, Desen Zhou, Bo Wan, Xuming He

To address those challenges, we adopt a primitive-based representation for 3D object, and propose a two-stage graph network for primitive-based 3D object estimation, which consists of a sequential proposal module and a graph reasoning module.

Scene Understanding

Learning Multi-Granular Spatio-Temporal Graph Network for Skeleton-based Action Recognition

1 code implementation 10 Aug 2021 Tailin Chen, Desen Zhou, Jian Wang, Shidong Wang, Yu Guan, Xuming He, Errui Ding

The task of skeleton-based action recognition remains a core challenge in human-centred scene understanding due to the multiple granularities and large variation in human motion.

Action Recognition Scene Understanding +1

An EM Framework for Online Incremental Learning of Semantic Segmentation

1 code implementation 8 Aug 2021 Shipeng Yan, Jiale Zhou, Jiangwei Xie, Songyang Zhang, Xuming He

Incremental learning of semantic segmentation has emerged as a promising strategy for visual scene interpretation in the open-world setting.

Incremental Learning Semantic Segmentation

Superpixel-guided Iterative Learning from Noisy Labels for Medical Image Segmentation

1 code implementation 21 Jul 2021 Shuailin Li, Zhitong Gao, Xuming He

Learning segmentation from noisy labels is an important task for medical image analysis due to the difficulty in acquiring high-quality annotations.

Medical Image Segmentation

Learning Implicit Temporal Alignment for Few-shot Video Classification

1 code implementation 11 May 2021 Songyang Zhang, Jiale Zhou, Xuming He

Few-shot video classification aims to learn new video categories with only a few labeled examples, alleviating the burden of costly annotation in real-world applications.

Action Recognition In Videos Classification +2

Weakly Supervised Volumetric Segmentation via Self-taught Shape Denoising Model

1 code implementation 27 Apr 2021 Qian He, Shuailin Li, Xuming He

Moreover, we introduce a weak annotation scheme with a hybrid label design for volumetric images, which improves model learning without increasing the overall annotation cost.

Denoising Weakly supervised segmentation

GNeRF: GAN-based Neural Radiance Field without Posed Camera

1 code implementation ICCV 2021 Quan Meng, Anpei Chen, Haimin Luo, Minye Wu, Hao Su, Lan Xu, Xuming He, Jingyi Yu

We introduce GNeRF, a framework to marry Generative Adversarial Networks (GAN) with Neural Radiance Field (NeRF) reconstruction for the complex scenarios with unknown and even randomly initialized camera poses.

Novel View Synthesis

Relation-aware Instance Refinement for Weakly Supervised Visual Grounding

1 code implementation CVPR 2021 Yongfei Liu, Bo Wan, Lin Ma, Xuming He

Visual grounding, which aims to build a correspondence between visual objects and their language entities, plays a key role in cross-modal scene understanding.

Scene Understanding Visual Grounding

Smoothed Quantile Regression with Large-Scale Inference

1 code implementation 9 Dec 2020 Xuming He, Xiaoou Pan, Kean Ming Tan, Wen-Xin Zhou

Our numerical studies confirm the conquer estimator as a practical and reliable approach to large-scale inference for quantile regression.

Statistics Theory Methodology Statistics Theory

Confidence-aware Adversarial Learning for Self-supervised Semantic Matching

no code implementations 25 Aug 2020 Shuaiyi Huang, Qiuyue Wang, Xuming He

We are the first to exploit confidence during refinement to improve semantic matching accuracy and develop an end-to-end self-supervised adversarial learning procedure for the entire matching network.

Self-Supervised Learning Semantic correspondence

LGNN: A Context-aware Line Segment Detector

no code implementations 13 Aug 2020 Quan Meng, Jiakai Zhang, Qiang Hu, Xuming He, Jingyi Yu

We present a novel real-time line segment detection scheme called Line Graph Neural Network (LGNN).

Line Segment Detection

Towards Purely Unsupervised Disentanglement of Appearance and Shape for Person Images Generation

no code implementations 26 Jul 2020 Hongtao Yang, Tong Zhang, Wenbing Huang, Xuming He, Fatih Porikli

To be clear, in this paper, we refer to unsupervised learning as learning without task-specific human annotations, pairs or any form of weak supervision.

Shape-aware Semi-supervised 3D Semantic Segmentation for Medical Images

3 code implementations 21 Jul 2020 Shuailin Li, Chuyu Zhang, Xuming He

Semi-supervised learning has attracted much attention in medical image segmentation due to challenges in acquiring pixel-wise image annotations, which is a crucial step for building high-performance deep learning methods.

3D Semantic Segmentation Medical Image Segmentation

Learning Context-aware Task Reasoning for Efficient Meta-reinforcement Learning

no code implementations 3 Mar 2020 Haozhe Wang, Jiale Zhou, Xuming He

Despite recent success of deep network-based Reinforcement Learning (RL), it remains elusive to achieve human-level efficiency in learning novel tasks.

Meta-Learning Meta Reinforcement Learning

Learning a Layout Transfer Network for Context Aware Object Detection

no code implementations 9 Dec 2019 Tao Wang, Xuming He, Yuanzheng Cai, Guobao Xiao

We present a context aware object detection method based on a retrieve-and-transform scene layout model.

Autonomous Driving Object Detection

Learning Cross-modal Context Graph for Visual Grounding

1 code implementation 20 Nov 2019 Yongfei Liu, Bo Wan, Xiaodan Zhu, Xuming He

To address their limitations, this paper proposes a language-guided graph representation to capture the global context of grounding entities and their relations, and develops a cross-modal graph matching strategy for the multiple-phrase visual grounding task.

Graph Matching Visual Grounding

Pose-aware Multi-level Feature Network for Human Object Interaction Detection

1 code implementation ICCV 2019 Bo Wan, Desen Zhou, Yongfei Liu, Rongjie Li, Xuming He

Reasoning about human-object interactions is a core problem in human-centric scene understanding, and detecting such relations poses a unique challenge to vision systems due to large variations in human-object configurations, multiple co-occurring relation instances and subtle visual differences between relation categories.

Human-Object Interaction Detection Scene Understanding

Dynamic Context Correspondence Network for Semantic Alignment

1 code implementation ICCV 2019 Shuaiyi Huang, Qiuyue Wang, Songyang Zhang, Shipeng Yan, Xuming He

We instantiate our strategy by designing an end-to-end learnable deep network, named Dynamic Context Correspondence Network (DCCNet).

Semantic correspondence

LatentGNN: Learning Efficient Non-local Relations for Visual Recognition

1 code implementation 28 May 2019 Songyang Zhang, Shipeng Yan, Xuming He

A promising strategy is to model the feature context by a fully-connected graph neural network (GNN), which augments traditional convolutional features with an estimated non-local context representation.

Fixed-price Diffusion Mechanism Design

no code implementations 14 May 2019 Tianyi Zhang, Dengji Zhao, Wen Zhang, Xuming He

We consider a fixed-price mechanism design setting where a seller sells one item via a social network, but the seller can only directly communicate with her neighbours initially.

SemStyle: Learning to Generate Stylised Image Captions using Unaligned Text

1 code implementation CVPR 2018 Alexander Mathews, Lexing Xie, Xuming He

We develop a model that learns to generate visually relevant styled captions from a large corpus of styled text without aligned images.

Image Captioning Language Modelling

Simplifying Sentences with Sequence to Sequence Models

no code implementations 15 May 2018 Alexander Mathews, Lexing Xie, Xuming He

We simplify sentences with an attentive neural network sequence to sequence model, dubbed S4.

Style Transfer Text Generation +1

Geometry-aware Deep Network for Single-Image Novel View Synthesis

no code implementations CVPR 2018 Miaomiao Liu, Xuming He, Mathieu Salzmann

By contrast, in this paper, we propose to exploit the 3D geometry of the scene to synthesize a novel view.

Novel View Synthesis

Deep Free-Form Deformation Network for Object-Mask Registration

no code implementations ICCV 2017 Haoyang Zhang, Xuming He

In this work, we take a transformation based approach that predicts a 2D non-rigid spatial transform and warps the shape mask onto the target object.

Semantic Segmentation

Indoor Scene Parsing With Instance Segmentation, Semantic Labeling and Support Relationship Inference

no code implementations CVPR 2017 Wei Zhuo, Mathieu Salzmann, Xuming He, Miaomiao Liu

In particular, while some of them aim at segmenting the image into regions, such as object or surface instances, others aim at inferring the semantic labels of given regions, or their support relationships.

Instance Segmentation Scene Parsing +1

Predicting Salient Face in Multiple-Face Videos

1 code implementation CVPR 2017 Yufan Liu, Songyang Zhang, Mai Xu, Xuming He

On the other hand, we find that the attention of different subjects consistently focuses on a single face in each frame of videos involving multiple faces.

Eye Tracking Saliency Prediction

Boundary-aware Instance Segmentation

no code implementations CVPR 2017 Zeeshan Hayder, Xuming He, Mathieu Salzmann

In this context, existing methods typically propose candidate objects, usually as bounding boxes, and directly predict a binary mask within each such proposal.

Instance Segmentation Object Proposal Generation +1

Learning Dynamic Hierarchical Models for Anytime Scene Labeling

no code implementations 11 Aug 2016 Buyu Liu, Xuming He

With increasing demand for efficient image and video analysis, test-time cost of scene parsing becomes critical for many large-scale or time-sensitive vision applications.

Model Selection Representation Learning +2

Learning deep structured network for weakly supervised change detection

no code implementations 7 Jun 2016 Salman H. Khan, Xuming He, Fatih Porikli, Mohammed Bennamoun, Ferdous Sohel, Roberto Togneri

We apply a constrained mean-field algorithm to estimate the pixel-level labels, and use the estimated labels to update the parameters of the CNN in an iterative EM framework.

Learning to Co-Generate Object Proposals With a Deep Structured Network

no code implementations CVPR 2016 Zeeshan Hayder, Xuming He, Mathieu Salzmann

In particular, we introduce a deep structured network that jointly predicts the objectness scores and the bounding box locations of multiple object candidates.

Object Detection

Semantic-Aware Depth Super-Resolution in Outdoor Scenes

no code implementations 31 May 2016 Miaomiao Liu, Mathieu Salzmann, Xuming He

Despite much progress, state-of-the-art techniques suffer from two drawbacks: (i) they rely on the assumption that intensity edges coincide with depth discontinuities, which, unfortunately, is only true in controlled environments; and (ii) they typically exploit the availability of high-resolution training depth maps, which can often not be acquired in practice due to the sensors' limitations.

Super-Resolution

Structural Kernel Learning for Large Scale Multiclass Object Co-Detection

no code implementations ICCV 2015 Zeeshan Hayder, Xuming He, Mathieu Salzmann

To exploit the correlations between objects, we build a fully-connected CRF on the candidates, which explicitly incorporates both geometric layout relations across object classes and similarity relations across multiple images.

Object Detection

SentiCap: Generating Image Descriptions with Sentiments

no code implementations 6 Oct 2015 Alexander Mathews, Lexing Xie, Xuming He

We design a system to describe an image with emotions, and present a model that automatically generates captions with positive or negative sentiments.

Decision Making Image Captioning +1

Indoor Scene Structure Analysis for Single Image Depth Estimation

no code implementations CVPR 2015 Wei Zhuo, Mathieu Salzmann, Xuming He, Miaomiao Liu

We tackle the problem of single image depth estimation, which, without additional knowledge, suffers from many ambiguities.

Depth Estimation

Separating Objects and Clutter in Indoor Scenes

no code implementations CVPR 2015 Salman H. Khan, Xuming He, Mohammed Bennamoun, Ferdous Sohel, Roberto Togneri

Objects' spatial layout estimation and clutter identification are two important tasks to understand indoor scenes.

Multiclass Semantic Video Segmentation With Object-Level Active Inference

no code implementations CVPR 2015 Buyu Liu, Xuming He

To scale up our method, we adopt an active inference strategy to improve the efficiency, which adaptively selects object subgraphs in the object-augmented dense CRF.

Semantic Segmentation Video Segmentation +1

An Exemplar-based CRF for Multi-instance Object Segmentation

no code implementations CVPR 2014 Xuming He, Stephen Gould

We address the problem of joint detection and segmentation of multiple object instances in an image, a key step towards scene understanding.

Instance Segmentation Scene Understanding +1

Discrete-Continuous Depth Estimation from a Single Image

no code implementations CVPR 2014 Miaomiao Liu, Mathieu Salzmann, Xuming He

The unary potentials in this graphical model are computed by making use of the images with known depth.

Monocular Depth Estimation

Winding Number for Region-Boundary Consistent Salient Contour Extraction

no code implementations CVPR 2013 Yansheng Ming, Hongdong Li, Xuming He

The main focus is given to how to maintain the consistency (compatibility) between the region cues and the boundary cues.

Boundary Detection

Learning Structured Hough Voting for Joint Object Detection and Occlusion Reasoning

no code implementations CVPR 2013 Tao Wang, Xuming He, Nick Barnes

We propose a structured Hough voting method for detecting objects with heavy occlusion in indoor environments.

Object Detection

A unified model of short-range and long-range motion perception

no code implementations NeurIPS 2010 Shuang Wu, Xuming He, Hongjing Lu, Alan L. Yuille

The human vision system is able to effortlessly perceive both short-range and long-range motion patterns in complex dynamic scenes.

Learning Hybrid Models for Image Annotation with Partially Labeled Data

no code implementations NeurIPS 2008 Xuming He, Richard S. Zemel

Extensive labeled data for image annotation systems, which learn to assign class labels to image regions, is difficult to obtain.
