Search Results for author: Xuming He

Found 86 papers, 41 papers with code

SP$^2$OT: Semantic-Regularized Progressive Partial Optimal Transport for Imbalanced Clustering

1 code implementation • 4 Apr 2024 • Chuyu Zhang, Hui Ren, Xuming He

To be more precise, we employ the strategy of majorization to reformulate the SP$^2$OT problem into a Progressive Partial Optimal Transport problem, which can be transformed into an unbalanced optimal transport problem with augmented constraints and can be solved efficiently by a fast matrix scaling algorithm.

Clustering Deep Clustering +1

From Pixels to Graphs: Open-Vocabulary Scene Graph Generation with Vision-Language Models

no code implementations • 1 Apr 2024 • Rongjie Li, Songyang Zhang, Dahua Lin, Kai Chen, Xuming He

Scene graph generation (SGG) aims to parse a visual scene into an intermediate graph representation for downstream reasoning tasks.

Graph Generation Relation +2

Learning by Correction: Efficient Tuning Task for Zero-Shot Generative Vision-Language Reasoning

no code implementations • 1 Apr 2024 • Rongjie Li, Yu Wu, Xuming He

Generative vision-language models (VLMs) have shown impressive performance in zero-shot vision-language tasks like image captioning and visual question answering.

Image Captioning Instruction Following +5

DSGG: Dense Relation Transformer for an End-to-end Scene Graph Generation

no code implementations • 21 Mar 2024 • Zeeshan Hayder, Xuming He

Scene graph generation aims to capture detailed spatial and semantic relationships between objects in an image, which is challenging due to incomplete labelling, long-tailed relationship categories, and relational semantic overlap.

Graph Generation Graph Matching +3

RealDex: Towards Human-like Grasping for Robotic Dexterous Hand

no code implementations • 21 Feb 2024 • Yumeng Liu, Yaxun Yang, Youzhuo Wang, Xiaofei Wu, Jiamin Wang, Yichen Yao, Sören Schwertfeger, Sibei Yang, Wenping Wang, Jingyi Yu, Xuming He, Yuexin Ma

In this paper, we introduce RealDex, a pioneering dataset capturing authentic dexterous hand grasping motions infused with human behavioral patterns, enriched by multi-view and multimodal visual data.

SGTR+: End-to-end Scene Graph Generation with Transformer

1 code implementation • 23 Jan 2024 • Rongjie Li, Songyang Zhang, Xuming He

Moreover, we design a graph assembling module to infer the connectivity of the bipartite scene graph based on our entity-aware structure, enabling us to generate the scene graph in an end-to-end manner.

graph construction Graph Generation +1

P$^2$OT: Progressive Partial Optimal Transport for Deep Imbalanced Clustering

1 code implementation • 17 Jan 2024 • Chuyu Zhang, Hui Ren, Xuming He

Deep clustering, which learns representation and semantic clustering without label information, poses a great challenge for deep learning-based approaches.

Clustering Deep Clustering +1

Mining Fine-Grained Image-Text Alignment for Zero-Shot Captioning via Text-Only Training

1 code implementation • 4 Jan 2024 • Longtian Qiu, Shan Ning, Xuming He

Firstly, we observe that CLIP's visual features of image subregions can achieve closer proximity to the paired caption due to the inherent information loss in text descriptions.

Descriptive Image Captioning +1

GenEM: Physics-Informed Generative Cryo-Electron Microscopy

no code implementations • 4 Dec 2023 • Jiakai Zhang, Qihe Chen, Yan Zeng, Wenyuan Gao, Xuming He, Zhijie Liu, Jingyi Yu

To address this, we introduce physics-informed generative cryo-electron microscopy (GenEM), which for the first time integrates physical-based cryo-EM simulation with a generative unpaired noise translation to generate physically correct synthetic cryo-EM datasets with realistic noises.

Contrastive Learning Pose Estimation +1

Gradient-Map-Guided Adaptive Domain Generalization for Cross Modality MRI Segmentation

1 code implementation • 16 Nov 2023 • Bingnan Li, Zhitong Gao, Xuming He

Cross-modal MRI segmentation is of great value for computer-aided medical diagnosis, enabling flexible data acquisition and model generalization.

Domain Generalization Medical Diagnosis +3

SPHINX: The Joint Mixing of Weights, Tasks, and Visual Embeddings for Multi-modal Large Language Models

1 code implementation • 13 Nov 2023 • Ziyi Lin, Chris Liu, Renrui Zhang, Peng Gao, Longtian Qiu, Han Xiao, Han Qiu, Chen Lin, Wenqi Shao, Keqin Chen, Jiaming Han, Siyuan Huang, Yichi Zhang, Xuming He, Hongsheng Li, Yu Qiao

We present SPHINX, a versatile multi-modal large language model (MLLM) with a joint mixing of model weights, tuning tasks, and visual embeddings.

Ranked #2 on Visual Question Answering on BenchLMM

Described Object Detection Language Modelling +4

The Robust Semantic Segmentation UNCV2023 Challenge Results

no code implementations • 27 Sep 2023 • Xuanlong Yu, Yi Zuo, Zitao Wang, Xiaowen Zhang, Jiaxuan Zhao, Yuting Yang, Licheng Jiao, Rui Peng, Xinyi Wang, Junpei Zhang, Kexin Zhang, Fang Liu, Roberto Alcover-Couso, Juan C. SanMiguel, Marcos Escudero-Viñolo, Hanlin Tian, Kenta Matsui, Tianhao Wang, Fahmy Adan, Zhitong Gao, Xuming He, Quentin Bouniot, Hossein Moghaddam, Shyam Nandan Rai, Fabio Cermelli, Carlo Masone, Andrea Pilzer, Elisa Ricci, Andrei Bursuc, Arno Solin, Martin Trapp, Rui Li, Angela Yao, Wenlong Chen, Ivor Simpson, Neill D. F. Campbell, Gianni Franchi

This paper outlines the winning solutions employed in addressing the MUAD uncertainty quantification challenge held at ICCV 2023.

Autonomous Driving Segmentation +2

Novel Class Discovery for Long-tailed Recognition

1 code implementation • 6 Aug 2023 • Chuyu Zhang, Ruijie Xu, Xuming He

In this paper, we consider a more realistic setting for novel class discovery where the distributions of novel and known classes are long-tailed.

Novel Class Discovery

Grounded Image Text Matching with Mismatched Relation Reasoning

no code implementations • ICCV 2023 • Yu Wu, Yana Wei, Haozhe Wang, Yongfei Liu, Sibei Yang, Xuming He

This paper introduces Grounded Image Text Matching with Mismatched Relation (GITM-MR), a novel visual-linguistic joint task that evaluates the relation understanding capabilities of transformer-based pre-trained models.

Image-text matching Relation +2

Human-centric Scene Understanding for 3D Large-scale Scenarios

1 code implementation • ICCV 2023 • Yiteng Xu, Peishan Cong, Yichen Yao, Runnan Chen, Yuenan Hou, Xinge Zhu, Xuming He, Jingyi Yu, Yuexin Ma

Human-centric scene understanding is significant for real-world applications, but it is extremely challenging due to the existence of diverse human poses and actions, complex human-environment interactions, severe occlusions in crowds, etc.

Action Recognition Scene Understanding +1

A Physics-Informed Data-Driven Fault Location Method for Transmission Lines Using Single-Ended Measurements with Field Data Validation

no code implementations • 19 Jul 2023 • Yiqi Xing, Yu Liu, Dayou Lu, Xinchen Zou, Xuming He

This procedure bridges the gap between simulation and practical power systems, and at the same time considers the uncertainty of system and fault parameters in practice.

Class-relation Knowledge Distillation for Novel Class Discovery

2 code implementations • ICCV 2023 • Peiyan Gu, Chuyu Zhang, Ruijie Xu, Xuming He

In addition, to enable a flexible knowledge distillation scheme for each data point in novel classes, we develop a learnable weighting function for the regularization, which adaptively promotes knowledge transfer based on the semantic similarity between the novel and known classes.

Knowledge Distillation Novel Class Discovery +4

MILD: Modeling the Instance Learning Dynamics for Learning with Noisy Labels

1 code implementation • 20 Jun 2023 • Chuanyang Hu, Shipeng Yan, Zhitong Gao, Xuming He

Although deep learning has achieved great success, it often relies on a large amount of training data with accurate labels, which are expensive and time-consuming to collect.

Learning with noisy labels Memorization

HOICLIP: Efficient Knowledge Transfer for HOI Detection with Vision-Language Models

1 code implementation • CVPR 2023 • Shan Ning, Longtian Qiu, Yongfei Liu, Xuming He

In detail, we first introduce a novel interaction decoder to extract informative regions in the visual feature map of CLIP via a cross-attention mechanism, which is then fused with the detection backbone by a knowledge integration block for more accurate human-object pair detection.

Ranked #8 on Human-Object Interaction Detection on V-COCO

Human-Object Interaction Detection Knowledge Distillation +2

Weakly-supervised HOI Detection via Prior-guided Bi-level Representation Learning

no code implementations • 2 Mar 2023 • Bo Wan, Yongfei Liu, Desen Zhou, Tinne Tuytelaars, Xuming He

Human object interaction (HOI) detection plays a crucial role in human-centric scene understanding and serves as a fundamental building-block for many vision tasks.

Human-Object Interaction Detection Knowledge Distillation +3

Dynamic Grained Encoder for Vision Transformers

1 code implementation • NeurIPS 2021 • Lin Song, Songyang Zhang, Songtao Liu, Zeming Li, Xuming He, Hongbin Sun, Jian Sun, Nanning Zheng

Specifically, we propose a Dynamic Grained Encoder for vision transformers, which can adaptively assign a suitable number of queries to each spatial region.

Image Classification Language Modelling +2

Modeling Multimodal Aleatoric Uncertainty in Segmentation with Mixture of Stochastic Experts

1 code implementation • 14 Dec 2022 • Zhitong Gao, Yucong Chen, Chuyu Zhang, Xuming He

In this work, we focus on capturing the data-inherent uncertainty (aka aleatoric uncertainty) in segmentation, typically when ambiguities exist in input images.

Segmentation

Generative Negative Text Replay for Continual Vision-Language Pretraining

no code implementations • 31 Oct 2022 • Shipeng Yan, Lanqing Hong, Hang Xu, Jianhua Han, Tinne Tuytelaars, Zhenguo Li, Xuming He

In this work, we focus on learning a VLP model with sequential chunks of image-text pair data.

Continual Learning Image Classification +5

CALIP: Zero-Shot Enhancement of CLIP with Parameter-free Attention

1 code implementation • 28 Sep 2022 • Ziyu Guo, Renrui Zhang, Longtian Qiu, Xianzheng Ma, Xupeng Miao, Xuming He, Bin Cui

Contrastive Language-Image Pre-training (CLIP) has been shown to learn visual representations with great transferability, which achieves promising accuracy for zero-shot classification.

Ranked #4 on Training-free 3D Point Cloud Classification on ScanObjectNN (using extra training data)

Training-free 3D Point Cloud Classification Transfer Learning +1

Part-aware Prototypical Graph Network for One-shot Skeleton-based Action Recognition

no code implementations • 19 Aug 2022 • Tailin Chen, Desen Zhou, Jian Wang, Shidong Wang, Qian He, Chuanyang Hu, Errui Ding, Yu Guan, Xuming He

In this paper, we study the problem of one-shot skeleton-based action recognition, which poses unique challenges in learning transferable representation from base classes to novel classes, particularly for fine-grained actions.

Action Recognition Meta-Learning +1

Learning Semantic Correspondence with Sparse Annotations

1 code implementation • 15 Aug 2022 • Shuaiyi Huang, Luyu Yang, Bo He, Songyang Zhang, Xuming He, Abhinav Shrivastava

In this paper, we aim to address the challenge of label sparsity in semantic correspondence by enriching supervision signals from sparse keypoint annotations.

Denoising Semantic correspondence

A Novel Unified Conditional Score-based Generative Framework for Multi-modal Medical Image Completion

no code implementations • 7 Jul 2022 • Xiangxi Meng, Yuning Gu, Yongsheng Pan, Nizhuan Wang, Peng Xue, Mengkang Lu, Xuming He, Yiqiang Zhan, Dinggang Shen

Multi-modal medical image completion has been extensively applied to alleviate the missing modality issue in a wealth of multi-modal diagnostic tasks.

Mutual Information-guided Knowledge Transfer for Novel Class Discovery

no code implementations • 24 Jun 2022 • Chuyu Zhang, Chuanyang Hu, Ruijie Xu, Zhitong Gao, Qian He, Xuming He

Our insight is to utilize mutual information to measure the relation between seen classes and unseen classes in a restricted label space; maximizing this mutual information promotes the transfer of semantic knowledge.

Novel Class Discovery Relation +1

ROI-Constrained Bidding via Curriculum-Guided Bayesian Reinforcement Learning

1 code implementation • 10 Jun 2022 • Haozhe Wang, Chao Du, Panyan Fang, Shuo Yuan, Xuming He, Liang Wang, Bo Zheng

Real-Time Bidding (RTB) is an important mechanism in modern online advertising systems.

reinforcement-learning Reinforcement Learning (RL)

Automatic spinal curvature measurement on ultrasound spine images using Faster R-CNN

no code implementations • 17 Apr 2022 • Zhichao Liu, Liyue Qian, Wenke Jing, Desen Zhou, Xuming He, Edmond Lou, Rui Zheng

The framework consisted of two closely linked modules: 1) the lamina detector for identifying and locating each lamina pair on ultrasound coronal images, and 2) the spinal curvature estimator for calculating the scoliotic angles based on the chain of detected laminae.

General Incremental Learning with Domain-aware Categorical Representations

no code implementations • CVPR 2022 • Jiangwei Xie, Shipeng Yan, Xuming He

Continual learning is an important problem for achieving human-level intelligence in real-world applications as an agent must continuously accumulate knowledge in response to streaming data/tasks.

Class Incremental Learning Incremental Learning

Cascaded Sparse Feature Propagation Network for Interactive Segmentation

1 code implementation • 10 Mar 2022 • Chuyu Zhang, Chuanyang Hu, Hui Ren, Yongfei Liu, Xuming He

We aim to tackle the problem of point-based interactive segmentation, in which the key challenge is to propagate the user-provided annotations to unlabeled regions efficiently.

Ranked #4 on Interactive Segmentation on SBD

Foreground Segmentation Interactive Segmentation +2

Weakly Supervised Nuclei Segmentation via Instance Learning

1 code implementation • 3 Feb 2022 • Weizhen Liu, Qian He, Xuming He

Weakly supervised nuclei segmentation is a critical problem for pathological image analysis and greatly benefits the community due to the significant reduction of labeling cost.

Instance Segmentation Representation Learning +2

Budget-aware Few-shot Learning via Graph Convolutional Network

no code implementations • 7 Jan 2022 • Shipeng Yan, Songyang Zhang, Xuming He

In this work, we introduce a new budget-aware few-shot learning problem that not only aims to learn novel object categories, but also needs to select informative examples to annotate in order to achieve data efficiency.

Few-Shot Learning Informativeness

SGTR: End-to-end Scene Graph Generation with Transformer

1 code implementation • CVPR 2022 • Rongjie Li, Songyang Zhang, Xuming He

Scene Graph Generation (SGG) remains a challenging visual understanding task due to its compositional property.

graph construction Graph Generation +1

SGTR: Generating Scene Graph by Learning Compositional Triplets with Transformer

no code implementations • 29 Sep 2021 • Rongjie Li, Songyang Zhang, Xuming He

We develop a decoding-and-assembling paradigm for the end-to-end scene graph generation.

Graph Generation Scene Graph Generation

KD-VLP: Improving End-to-End Vision-and-Language Pretraining with Object Knowledge Distillation

1 code implementation • Findings (NAACL) 2022 • Yongfei Liu, Chenfei Wu, Shao-Yen Tseng, Vasudev Lal, Xuming He, Nan Duan

Self-supervised vision-and-language pretraining (VLP) aims to learn transferable multi-modal representations from large-scale image-text data and to achieve strong performances on a broad scope of vision-language tasks after finetuning.

Knowledge Distillation Object +1

Single Image 3D Object Estimation with Primitive Graph Networks

1 code implementation • 9 Sep 2021 • Qian He, Desen Zhou, Bo Wan, Xuming He

To address those challenges, we adopt a primitive-based representation for 3D object, and propose a two-stage graph network for primitive-based 3D object estimation, which consists of a sequential proposal module and a graph reasoning module.

Object Scene Understanding

Learning Multi-Granular Spatio-Temporal Graph Network for Skeleton-based Action Recognition

1 code implementation • 10 Aug 2021 • Tailin Chen, Desen Zhou, Jian Wang, Shidong Wang, Yu Guan, Xuming He, Errui Ding

The task of skeleton-based action recognition remains a core challenge in human-centred scene understanding due to the multiple granularities and large variation in human motion.

Ranked #8 on Skeleton Based Action Recognition on Kinetics-Skeleton dataset

Action Classification Action Recognition +2

An EM Framework for Online Incremental Learning of Semantic Segmentation

1 code implementation • 8 Aug 2021 • Shipeng Yan, Jiale Zhou, Jiangwei Xie, Songyang Zhang, Xuming He

Incremental learning of semantic segmentation has emerged as a promising strategy for visual scene interpretation in the open-world setting.

Incremental Learning Missing Labels +2

Workshop on Autonomous Driving at CVPR 2021: Technical Report for Streaming Perception Challenge

1 code implementation • 27 Jul 2021 • Songyang Zhang, Lin Song, Songtao Liu, Zheng Ge, Zeming Li, Xuming He, Jian Sun

In this report, we introduce our real-time 2D object detection system for the realistic autonomous driving scenario.

Autonomous Driving object-detection +1

Superpixel-guided Iterative Learning from Noisy Labels for Medical Image Segmentation

1 code implementation • 21 Jul 2021 • Shuailin Li, Zhitong Gao, Xuming He

Learning segmentation from noisy labels is an important task for medical image analysis due to the difficulty in acquiring high-quality annotations.

Image Segmentation Medical Image Segmentation +3

Learning Implicit Temporal Alignment for Few-shot Video Classification

1 code implementation • 11 May 2021 • Songyang Zhang, Jiale Zhou, Xuming He

Few-shot video classification aims to learn new video categories with only a few labeled examples, alleviating the burden of costly annotation in real-world applications.

Ranked #1 on Action Recognition In Videos on FS-Something-Something V2-Small

Action Recognition In Videos Classification +2

Weakly Supervised Volumetric Segmentation via Self-taught Shape Denoising Model

1 code implementation • 27 Apr 2021 • Qian He, Shuailin Li, Xuming He

Moreover, we introduce a weak annotation scheme with a hybrid label design for volumetric images, which improves model learning without increasing the overall annotation cost.

Denoising Organ Segmentation +2

Bipartite Graph Network with Adaptive Message Passing for Unbiased Scene Graph Generation

3 code implementations • CVPR 2021 • Rongjie Li, Songyang Zhang, Bo Wan, Xuming He

Scene graph generation is an important visual understanding task with a broad range of vision applications.

Graph Generation Unbiased Scene Graph Generation

DER: Dynamically Expandable Representation for Class Incremental Learning

2 code implementations • CVPR 2021 • Shipeng Yan, Jiangwei Xie, Xuming He

We address the problem of class incremental learning, which is a core step towards achieving adaptive vision intelligence.

Ranked #1 on Incremental Learning on CIFAR100B050S(2ClassesPerStep)

Class Incremental Learning Incremental Learning +1

Distribution Alignment: A Unified Framework for Long-tail Visual Recognition

1 code implementation • CVPR 2021 • Songyang Zhang, Zeming Li, Shipeng Yan, Xuming He, Jian Sun

Motivated by our discovery, we propose a unified distribution alignment strategy for long-tail visual recognition.

Ranked #17 on Long-tail Learning on Places-LT

General Classification Image Classification +6

GNeRF: GAN-based Neural Radiance Field without Posed Camera

1 code implementation • ICCV 2021 • Quan Meng, Anpei Chen, Haimin Luo, Minye Wu, Hao Su, Lan Xu, Xuming He, Jingyi Yu

We introduce GNeRF, a framework to marry Generative Adversarial Networks (GAN) with Neural Radiance Field (NeRF) reconstruction for the complex scenarios with unknown and even randomly initialized camera poses.

Novel View Synthesis

Relation-aware Instance Refinement for Weakly Supervised Visual Grounding

1 code implementation • CVPR 2021 • Yongfei Liu, Bo Wan, Lin Ma, Xuming He

Visual grounding, which aims to build a correspondence between visual objects and their language entities, plays a key role in cross-modal scene understanding.

Object Relation +3

Smoothed Quantile Regression with Large-Scale Inference

1 code implementation • 9 Dec 2020 • Xuming He, Xiaoou Pan, Kean Ming Tan, Wen-Xin Zhou

Our numerical studies confirm the conquer estimator as a practical and reliable approach to large-scale inference for quantile regression.

Statistics Theory Methodology

Confidence-aware Adversarial Learning for Self-supervised Semantic Matching

no code implementations • 25 Aug 2020 • Shuaiyi Huang, Qiuyue Wang, Xuming He

We are the first to exploit confidence during refinement to improve semantic matching accuracy and develop an end-to-end self-supervised adversarial learning procedure for the entire matching network.

Self-Supervised Learning Semantic correspondence

LGNN: A Context-aware Line Segment Detector

no code implementations • 13 Aug 2020 • Quan Meng, Jiakai Zhang, Qiang Hu, Xuming He, Jingyi Yu

We present a novel real-time line segment detection scheme called Line Graph Neural Network (LGNN).

Line Segment Detection

Towards Purely Unsupervised Disentanglement of Appearance and Shape for Person Images Generation

no code implementations • 26 Jul 2020 • Hongtao Yang, Tong Zhang, Wenbing Huang, Xuming He, Fatih Porikli

To be clear, in this paper, we refer to unsupervised learning as learning without task-specific human annotations, pairs or any form of weak supervision.

Disentanglement

Shape-aware Semi-supervised 3D Semantic Segmentation for Medical Images

4 code implementations • 21 Jul 2020 • Shuailin Li, Chuyu Zhang, Xuming He

Semi-supervised learning has attracted much attention in medical image segmentation due to challenges in acquiring pixel-wise image annotations, which is a crucial step for building high-performance deep learning methods.

3D Semantic Segmentation Image Segmentation +3

Part-aware Prototype Network for Few-shot Semantic Segmentation

2 code implementations • ECCV 2020 • Yongfei Liu, Xiangyi Zhang, Songyang Zhang, Xuming He

In this paper, we propose a novel few-shot semantic segmentation framework based on the prototype representation.

Ranked #3 on Few-Shot Semantic Segmentation on Pascal5i

Few-Shot Semantic Segmentation Object +2

Learning Context-aware Task Reasoning for Efficient Meta-reinforcement Learning

no code implementations • 3 Mar 2020 • Haozhe Wang, Jiale Zhou, Xuming He

Despite recent success of deep network-based Reinforcement Learning (RL), it remains elusive to achieve human-level efficiency in learning novel tasks.

Meta-Learning Meta Reinforcement Learning +2

Learning a Layout Transfer Network for Context Aware Object Detection

no code implementations • 9 Dec 2019 • Tao Wang, Xuming He, Yuanzheng Cai, Guobao Xiao

We present a context aware object detection method based on a retrieve-and-transform scene layout model.

Autonomous Driving Object +2

Learning Cross-modal Context Graph for Visual Grounding

2 code implementations • 20 Nov 2019 • Yongfei Liu, Bo Wan, Xiaodan Zhu, Xuming He

To address their limitations, this paper proposes a language-guided graph representation to capture the global context of grounding entities and their relations, and develops a cross-modal graph matching strategy for the multiple-phrase visual grounding task.

Graph Matching Visual Grounding

Pose-aware Multi-level Feature Network for Human Object Interaction Detection

1 code implementation • ICCV 2019 • Bo Wan, Desen Zhou, Yongfei Liu, Rongjie Li, Xuming He

Reasoning about human-object interactions is a core problem in human-centric scene understanding, and detecting such relations poses a unique challenge to vision systems due to large variations in human-object configurations, multiple co-occurring relation instances and subtle visual differences between relation categories.

Human-Object Interaction Detection Object +2

Dynamic Context Correspondence Network for Semantic Alignment

1 code implementation • ICCV 2019 • Shuaiyi Huang, Qiuyue Wang, Songyang Zhang, Shipeng Yan, Xuming He

We instantiate our strategy by designing an end-to-end learnable deep network, named as Dynamic Context Correspondence Network (DCCNet).

Semantic correspondence Weakly-supervised Learning

LatentGNN: Learning Efficient Non-local Relations for Visual Recognition

1 code implementation • 28 May 2019 • Songyang Zhang, Shipeng Yan, Xuming He

A promising strategy is to model the feature context by a fully-connected graph neural network (GNN), which augments traditional convolutional features with an estimated non-local context representation.

Fixed-price Diffusion Mechanism Design

no code implementations • 14 May 2019 • Tianyi Zhang, Dengji Zhao, Wen Zhang, Xuming He

We consider a fixed-price mechanism design setting where a seller sells one item via a social network, but the seller can only directly communicate with her neighbours initially.

One-Shot Action Localization by Learning Sequence Matching Network

no code implementations • CVPR 2018 • Hongtao Yang, Xuming He, Fatih Porikli

Learning-based temporal action localization methods require vast amounts of training data.

Action Detection One-Shot Learning +1

SemStyle: Learning to Generate Stylised Image Captions using Unaligned Text

1 code implementation • CVPR 2018 • Alexander Mathews, Lexing Xie, Xuming He

We develop a model that learns to generate visually relevant styled captions from a large corpus of styled text without aligned images.

Descriptive Image Captioning +1

Simplifying Sentences with Sequence to Sequence Models

no code implementations • 15 May 2018 • Alexander Mathews, Lexing Xie, Xuming He

We simplify sentences with an attentive neural network sequence to sequence model, dubbed S4.

Style Transfer Text Generation +1

Geometry-aware Deep Network for Single-Image Novel View Synthesis

no code implementations • CVPR 2018 • Miaomiao Liu, Xuming He, Mathieu Salzmann

By contrast, in this paper, we propose to exploit the 3D geometry of the scene to synthesize a novel view.

Novel View Synthesis

Deep Free-Form Deformation Network for Object-Mask Registration

no code implementations • ICCV 2017 • Haoyang Zhang, Xuming He

In this work, we take a transformation based approach that predicts a 2D non-rigid spatial transform and warps the shape mask onto the target object.

Object Semantic Segmentation

Indoor Scene Parsing With Instance Segmentation, Semantic Labeling and Support Relationship Inference

no code implementations • CVPR 2017 • Wei Zhuo, Mathieu Salzmann, Xuming He, Miaomiao Liu

In particular, while some of them aim at segmenting the image into regions, such as object or surface instances, others aim at inferring the semantic labels of given regions, or their support relationships.

Instance Segmentation Scene Parsing +1

Predicting Salient Face in Multiple-Face Videos

1 code implementation • CVPR 2017 • Yufan Liu, Songyang Zhang, Mai Xu, Xuming He

On the other hand, we find that the attention of different subjects consistently focuses on a single face in each frame of videos involving multiple faces.

Saliency Prediction

Boundary-aware Instance Segmentation

no code implementations • CVPR 2017 • Zeeshan Hayder, Xuming He, Mathieu Salzmann

In this context, existing methods typically propose candidate objects, usually as bounding boxes, and directly predict a binary mask within each such proposal.

Instance Segmentation Object +3

Learning Dynamic Hierarchical Models for Anytime Scene Labeling

no code implementations • 11 Aug 2016 • Buyu Liu, Xuming He

With increasing demand for efficient image and video analysis, test-time cost of scene parsing becomes critical for many large-scale or time-sensitive vision applications.

Model Selection Representation Learning +2

Learning deep structured network for weakly supervised change detection

no code implementations • 7 Jun 2016 • Salman H. Khan, Xuming He, Fatih Porikli, Mohammed Bennamoun, Ferdous Sohel, Roberto Togneri

We apply a constrained mean-field algorithm to estimate the pixel-level labels, and use the estimated labels to update the parameters of the CNN in an iterative EM framework.

Change Detection

Learning to Co-Generate Object Proposals With a Deep Structured Network

no code implementations • CVPR 2016 • Zeeshan Hayder, Xuming He, Mathieu Salzmann

In particular, we introduce a deep structured network that jointly predicts the objectness scores and the bounding box locations of multiple object candidates.

Object object-detection +2

Semantic-Aware Depth Super-Resolution in Outdoor Scenes

no code implementations • 31 May 2016 • Miaomiao Liu, Mathieu Salzmann, Xuming He

Despite much progress, state-of-the-art techniques suffer from two drawbacks: (i) they rely on the assumption that intensity edges coincide with depth discontinuities, which, unfortunately, is only true in controlled environments; and (ii) they typically exploit the availability of high-resolution training depth maps, which can often not be acquired in practice due to the sensors' limitations.

Super-Resolution

Structural Kernel Learning for Large Scale Multiclass Object Co-Detection

no code implementations • ICCV 2015 • Zeeshan Hayder, Xuming He, Mathieu Salzmann

To exploit the correlations between objects, we build a fully-connected CRF on the candidates, which explicitly incorporates both geometric layout relations across object classes and similarity relations across multiple images.

Object object-detection +1

Structured Depth Prediction in Challenging Monocular Video Sequences

no code implementations • 19 Nov 2015 • Miaomiao Liu, Mathieu Salzmann, Xuming He

To this end, we first study the problem of depth estimation from a single image.

Depth Prediction Monocular Depth Estimation +1

SentiCap: Generating Image Descriptions with Sentiments

no code implementations • 6 Oct 2015 • Alexander Mathews, Lexing Xie, Xuming He

We design a system to describe an image with emotions, and present a model that automatically generates captions with positive or negative sentiments.

Decision Making Descriptive +2

Multiclass Semantic Video Segmentation With Object-Level Active Inference

no code implementations • CVPR 2015 • Buyu Liu, Xuming He

To scale up our method, we adopt an active inference strategy to improve the efficiency, which adaptively selects object subgraphs in the object-augmented dense CRF.

Object Segmentation +3

Separating Objects and Clutter in Indoor Scenes

no code implementations • CVPR 2015 • Salman H. Khan, Xuming He, Mohammed Bennamoun, Ferdous Sohel, Roberto Togneri

Objects' spatial layout estimation and clutter identification are two important tasks to understand indoor scenes.

Indoor Scene Structure Analysis for Single Image Depth Estimation

no code implementations • CVPR 2015 • Wei Zhuo, Mathieu Salzmann, Xuming He, Miaomiao Liu

We tackle the problem of single image depth estimation, which, without additional knowledge, suffers from many ambiguities.

Depth Estimation

An Exemplar-based CRF for Multi-instance Object Segmentation

no code implementations • CVPR 2014 • Xuming He, Stephen Gould

We address the problem of joint detection and segmentation of multiple object instances in an image, a key step towards scene understanding.

Instance Segmentation Object +3

Discrete-Continuous Depth Estimation from a Single Image

no code implementations • CVPR 2014 • Miaomiao Liu, Mathieu Salzmann, Xuming He

The unary potentials in this graphical model are computed by making use of the images with known depth.

Monocular Depth Estimation Superpixels

Learning Structured Hough Voting for Joint Object Detection and Occlusion Reasoning

no code implementations • CVPR 2013 • Tao Wang, Xuming He, Nick Barnes

We propose a structured Hough voting method for detecting objects with heavy occlusion in indoor environments.

Object object-detection +1

Winding Number for Region-Boundary Consistent Salient Contour Extraction

no code implementations • CVPR 2013 • Yansheng Ming, Hongdong Li, Xuming He

The main focus is given to how to maintain the consistency (compatibility) between the region cues and the boundary cues.

Boundary Detection Segmentation

A unified model of short-range and long-range motion perception

no code implementations • NeurIPS 2010 • Shuang Wu, Xuming He, Hongjing Lu, Alan L. Yuille

The human vision system is able to effortlessly perceive both short-range and long-range motion patterns in complex dynamic scenes.

Learning Hybrid Models for Image Annotation with Partially Labeled Data

no code implementations • NeurIPS 2008 • Xuming He, Richard S. Zemel

Extensive labeled data for image annotation systems, which learn to assign class labels to image regions, is difficult to obtain.
