Search Results for author: Xiaojuan Qi

Found 113 papers, 70 papers with code

CN: Channel Normalization For Point Cloud Recognition

no code implementations • ECCV 2020 • Zetong Yang, Yanan Sun, Shu Liu, Xiaojuan Qi, Jiaya Jia

In 3D recognition, to fuse multi-scale structure information, existing methods apply hierarchical frameworks stacked by multiple fusion layers for integrating current relative locations with structure information from the previous level.

Memory Selection Network for Video Propagation

no code implementations • ECCV 2020 • Ruizheng Wu, Huaijia Lin, Xiaojuan Qi, Jiaya Jia

Video propagation is a fundamental problem in video processing where guidance frame predictions are propagated to guide predictions of the target frame.

Colorization Semantic Segmentation +3

How Good Are Low-bit Quantized LLaMA3 Models? An Empirical Study

1 code implementation • 22 Apr 2024 • Wei Huang, Xudong Ma, Haotong Qin, Xingyu Zheng, Chengtao Lv, Hong Chen, Jie Luo, Xiaojuan Qi, Xianglong Liu, Michele Magno

This exploration holds the potential to unveil new insights and challenges for low-bit quantization of LLaMA3 and other forthcoming LLMs, especially in addressing the performance degradation that arises in LLM compression.

Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models

no code implementations • 19 Apr 2024 • Chuofan Ma, Yi Jiang, Jiannan Wu, Zehuan Yuan, Xiaojuan Qi

We introduce Groma, a Multimodal Large Language Model (MLLM) with grounded and fine-grained visual perception ability.

Efficient and accurate neural field reconstruction using resistive memory

no code implementations • 15 Apr 2024 • Yifei Yu, Shaocong Wang, Woyu Zhang, Xinyuan Zhang, Xiuzhe Wu, Yangu He, Jichang Yang, Yue Zhang, Ning Lin, Bo Wang, Xi Chen, Songqi Wang, Xumeng Zhang, Xiaojuan Qi, Zhongrui Wang, Dashan Shang, Qi Liu, Kwang-Ting Cheng, Ming Liu

The GE harnesses the intrinsic stochasticity of resistive memory for efficient input encoding, while the PE achieves precise weight mapping through a Hardware-Aware Quantization (HAQ) circuit.

Novel View Synthesis Quantization

Resistive Memory-based Neural Differential Equation Solver for Score-based Diffusion Model

no code implementations • 8 Apr 2024 • Jichang Yang, Hegan Chen, Jia Chen, Songqi Wang, Shaocong Wang, Yifei Yu, Xi Chen, Bo Wang, Xinyuan Zhang, Binbin Cui, Ning Lin, Meng Xu, Yi Li, Xiaoxin Xu, Xiaojuan Qi, Zhongrui Wang, Xumeng Zhang, Dashan Shang, Han Wang, Qi Liu, Kwang-Ting Cheng, Ming Liu

Demonstrating equivalent generative quality to the software baseline, our system achieved remarkable enhancements in generative speed for both unconditional and conditional generation tasks, by factors of 64.8 and 156.5, respectively.

Edge-computing

3DGSR: Implicit Surface Reconstruction with 3D Gaussian Splatting

no code implementations • 30 Mar 2024 • Xiaoyang Lyu, Yang-tian Sun, Yi-Hua Huang, Xiuzhe Wu, ZiYi Yang, Yilun Chen, Jiangmiao Pang, Xiaojuan Qi

In this paper, we present an implicit surface reconstruction method with 3D Gaussian Splatting (3DGS), namely 3DGSR, that allows for accurate 3D reconstruction with intricate details while inheriting the high efficiency and rendering quality of 3DGS.

3D Reconstruction Surface Reconstruction

Total-Decom: Decomposed 3D Scene Reconstruction with Minimal Interaction

1 code implementation • 28 Mar 2024 • Xiaoyang Lyu, Chirui Chang, Peng Dai, Yang-tian Sun, Xiaojuan Qi

Scene reconstruction from multi-view images is a fundamental problem in computer vision and graphics.

3D Reconstruction 3D Scene Reconstruction +2

Can 3D Vision-Language Models Truly Understand Natural Language?

1 code implementation • 21 Mar 2024 • Weipeng Deng, Runyu Ding, Jihan Yang, Jiahui Liu, Yijiang Li, Xiaojuan Qi, Edith Ngai

To test the language understandability of 3D-VL models, we first propose a language robustness task for systematically assessing 3D-VL models across various tasks, benchmarking their performance when presented with different language style variants.

Benchmarking

DO3D: Self-supervised Learning of Decomposed Object-aware 3D Motion and Depth from Monocular Videos

no code implementations • 9 Mar 2024 • Xiuzhe Wu, Xiaoyang Lyu, Qihao Huang, Yong Liu, Yang Wu, Ying Shan, Xiaojuan Qi

Our system contains a depth estimation module to predict depth, and a new decomposed object-wise 3D motion (DO3D) estimation module to predict ego-motion and 3D object motion.

Depth Estimation Disentanglement +5

Classes Are Not Equal: An Empirical Study on Image Recognition Fairness

1 code implementation • 28 Feb 2024 • Jiequan Cui, Beier Zhu, Xin Wen, Xiaojuan Qi, Bei Yu, Hanwang Zhang

Second, with the proposed concept of Model Prediction Bias, we investigate the origins of problematic representation during optimization.

Contrastive Learning Data Augmentation +3

Spec-Gaussian: Anisotropic View-Dependent Appearance for 3D Gaussian Splatting

no code implementations • 24 Feb 2024 • ZiYi Yang, Xinyu Gao, Yangtian Sun, Yihua Huang, Xiaoyang Lyu, Wen Zhou, Shaohui Jiao, Xiaojuan Qi, Xiaogang Jin

The recent advancements in 3D Gaussian splatting (3D-GS) have not only facilitated real-time rendering through modern GPU rasterization pipelines but have also attained state-of-the-art rendering quality.

Debiasing Text-to-Image Diffusion Models

no code implementations • 22 Feb 2024 • Ruifei He, Chuhui Xue, Haoru Tan, Wenqing Zhang, Yingchen Yu, Song Bai, Xiaojuan Qi

Despite its simplicity, we show that IDA is efficient and converges quickly in resolving the social bias in TTI diffusion models.

EscherNet: A Generative Model for Scalable View Synthesis

1 code implementation • 6 Feb 2024 • Xin Kong, Shikun Liu, Xiaoyang Lyu, Marwan Taher, Xiaojuan Qi, Andrew J. Davison

We introduce EscherNet, a multi-view conditioned diffusion model for view synthesis.

3D Reconstruction Novel View Synthesis

BiLLM: Pushing the Limit of Post-Training Quantization for LLMs

1 code implementation • 6 Feb 2024 • Wei Huang, Yangdong Liu, Haotong Qin, Ying Li, Shiming Zhang, Xianglong Liu, Michele Magno, Xiaojuan Qi

Pretrained large language models (LLMs) exhibit exceptional general language processing capabilities but come with significant demands on memory and computational resources.

Binarization Quantization

V-IRL: Grounding Virtual Intelligence in Real Life

1 code implementation • 5 Feb 2024 • Jihan Yang, Runyu Ding, Ellis Brown, Xiaojuan Qi, Saining Xie

There is a sensory gulf between the Earth that humans inhabit and the digital realms in which modern AI agents are created.

Decision Making

GO-NeRF: Generating Virtual Objects in Neural Radiance Fields

no code implementations • 11 Jan 2024 • Peng Dai, Feitong Tan, Xin Yu, Yinda Zhang, Xiaojuan Qi

To this end, we propose a new method, GO-NeRF, capable of utilizing scene context for high-quality and harmonious 3D object generation within an existing NeRF.

3D Generation Object

UniDream: Unifying Diffusion Priors for Relightable Text-to-3D Generation

no code implementations • 14 Dec 2023 • Zexiang Liu, Yangguang Li, Youtian Lin, Xin Yu, Sida Peng, Yan-Pei Cao, Xiaojuan Qi, Xiaoshui Huang, Ding Liang, Wanli Ouyang

Recent advancements in text-to-3D generation technology have significantly improved the conversion of textual descriptions into imaginative 3D objects with sound geometry and fine textures.

3D Generation Text to 3D

Random resistive memory-based deep extreme point learning machine for unified visual processing

no code implementations • 14 Dec 2023 • Shaocong Wang, Yizhao Gao, Yi Li, Woyu Zhang, Yifei Yu, Bo Wang, Ning Lin, Hegan Chen, Yue Zhang, Yang Jiang, Dingchen Wang, Jia Chen, Peng Dai, Hao Jiang, Peng Lin, Xumeng Zhang, Xiaojuan Qi, Xiaoxin Xu, Hayden So, Zhongrui Wang, Dashan Shang, Qi Liu, Kwang-Ting Cheng, Ming Liu

Our random resistive memory-based deep extreme point learning machine may pave the way for energy-efficient and training-friendly edge AI across various data modalities and tasks.

SC-GS: Sparse-Controlled Gaussian Splatting for Editable Dynamic Scenes

1 code implementation • 4 Dec 2023 • Yi-Hua Huang, Yang-tian Sun, ZiYi Yang, Xiaoyang Lyu, Yan-Pei Cao, Xiaojuan Qi

During learning, the location and number of control points are adaptively adjusted to accommodate varying motion complexities in different regions, and an ARAP loss following the principle of as rigid as possible is developed to enforce spatial continuity and local rigidity of learned motions.

Novel View Synthesis

Pruning random resistive memory for optimizing analogue AI

no code implementations • 13 Nov 2023 • Yi Li, Songqi Wang, Yaping Zhao, Shaocong Wang, Woyu Zhang, Yangu He, Ning Lin, Binbin Cui, Xi Chen, Shiming Zhang, Hao Jiang, Peng Lin, Xumeng Zhang, Xiaojuan Qi, Zhongrui Wang, Xiaoxin Xu, Dashan Shang, Qi Liu, Kwang-Ting Cheng, Ming Liu

Here, we report a universal solution, software-hardware co-design using structural plasticity-inspired edge pruning to optimize the topology of a randomly weighted analogue resistive memory neural network.

Audio Classification Image Segmentation +1

EXIM: A Hybrid Explicit-Implicit Representation for Text-Guided 3D Shape Generation

1 code implementation • 3 Nov 2023 • Zhengzhe Liu, Jingyu Hu, Ka-Hei Hui, Xiaojuan Qi, Daniel Cohen-Or, Chi-Wing Fu

This paper presents a new text-guided technique for generating 3D shapes.

3D Shape Generation 3D Shape Representation

Text-to-3D with Classifier Score Distillation

no code implementations • 30 Oct 2023 • Xin Yu, Yuan-Chen Guo, Yangguang Li, Ding Liang, Song-Hai Zhang, Xiaojuan Qi

In this paper, we re-evaluate the role of classifier-free guidance in score distillation and discover a surprising finding: the guidance alone is enough for effective text-to-3D generation tasks.

3D Generation Text to 3D +1

CoDet: Co-Occurrence Guided Region-Word Alignment for Open-Vocabulary Object Detection

1 code implementation • NeurIPS 2023 • Chuofan Ma, Yi Jiang, Xin Wen, Zehuan Yuan, Xiaojuan Qi

CoDet then leverages visual similarities to discover the co-occurring objects and align them with the shared concept.

Ranked #2 on Open Vocabulary Object Detection on LVIS v1.0 (using extra training data)

Object object-detection +3

SpikeMOT: Event-based Multi-Object Tracking with Sparse Motion Features

no code implementations • 29 Sep 2023 • Song Wang, Zhu Wang, Can Li, Xiaojuan Qi, Hayden Kwok-Hay So

In comparison to conventional RGB cameras, the superior temporal resolution of event cameras allows them to capture rich information between frames, making them prime candidates for object tracking.

Multi-Object Tracking Object

Speech2Lip: High-fidelity Speech to Lip Generation by Learning from a Short Video

1 code implementation • ICCV 2023 • Xiuzhe Wu, Pengfei Hu, Yang Wu, Xiaoyang Lyu, Yan-Pei Cao, Ying Shan, Wenming Yang, Zhongqian Sun, Xiaojuan Qi

Therefore, directly learning a mapping function from speech to the entire head image is prone to ambiguity, particularly when using a short video for training.

Image Generation

Texture Generation on 3D Meshes with Point-UV Diffusion

no code implementations • ICCV 2023 • Xin Yu, Peng Dai, Wenbo Li, Lan Ma, Zhengzhe Liu, Xiaojuan Qi

In this work, we focus on synthesizing high-quality textures on 3D meshes.

Denoising Texture Synthesis

Lowis3D: Language-Driven Open-World Instance-Level 3D Scene Understanding

no code implementations • 1 Aug 2023 • Runyu Ding, Jihan Yang, Chuhui Xue, Wenqing Zhang, Song Bai, Xiaojuan Qi

To address this challenge, we propose to harness pre-trained vision-language (VL) foundation models that encode extensive knowledge from image-text pairs to generate captions for multi-view images of 3D scenes.

Ranked #3 on 3D Open-Vocabulary Instance Segmentation on S3DIS

3D Open-Vocabulary Instance Segmentation Instance Segmentation +4

MarS3D: A Plug-and-Play Motion-Aware Model for Semantic Segmentation on Multi-Scan 3D Point Clouds

1 code implementation • CVPR 2023 • Jiahui Liu, Chirui Chang, Jianhui Liu, Xiaoyang Wu, Lan Ma, Xiaojuan Qi

Unlike the single-scan-based semantic segmentation task, this task requires distinguishing the motion states of points in addition to their semantic categories.

3D Semantic Segmentation Representation Learning +1

Decoupled Kullback-Leibler Divergence Loss

4 code implementations • 23 May 2023 • Jiequan Cui, Zhuotao Tian, Zhisheng Zhong, Xiaojuan Qi, Bei Yu, Hanwang Zhang

In this paper, we delve deeper into the Kullback-Leibler (KL) Divergence loss and observe that it is equivalent to the Decoupled Kullback-Leibler (DKL) Divergence loss that consists of 1) a weighted Mean Square Error (wMSE) loss and 2) a Cross-Entropy loss incorporating soft labels.

Adversarial Defense Adversarial Robustness +1

Hybrid Neural Rendering for Large-Scale Scenes with Motion Blur

1 code implementation • CVPR 2023 • Peng Dai, Yinda Zhang, Xin Yu, Xiaoyang Lyu, Xiaojuan Qi

Rendering novel view images is highly desirable for many applications.

Neural Rendering Novel View Synthesis

RegionPLC: Regional Point-Language Contrastive Learning for Open-World 3D Scene Understanding

no code implementations • 3 Apr 2023 • Jihan Yang, Runyu Ding, Zhe Wang, Xiaojuan Qi

Existing 3D scene understanding tasks have achieved high performance on close-set benchmarks but fail to handle novel categories in real-world applications.

Contrastive Learning Instance Segmentation +2

Context-Aware Transformer for 3D Point Cloud Automatic Annotation

no code implementations • 27 Mar 2023 • Xiaoyan Qian, Chang Liu, Xiaojuan Qi, Siew-Chong Tan, Edmund Lam, Ngai Wong

3D automatic annotation has received increased attention since manually annotating 3D point clouds is laborious.

Object

You Only Need One Thing One Click: Self-Training for Weakly Supervised 3D Scene Understanding

1 code implementation • 26 Mar 2023 • Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu

3D scene understanding, e.g., point cloud semantic and instance segmentation, often requires large-scale annotated training data, but clearly, point-wise labels are too tedious to prepare.

3D Instance Segmentation Pseudo Label +4

DreamStone: Image as Stepping Stone for Text-Guided 3D Shape Generation

2 code implementations • 24 Mar 2023 • Zhengzhe Liu, Peng Dai, Ruihui Li, Xiaojuan Qi, Chi-Wing Fu

The core of our approach is a two-stage feature-space alignment strategy that leverages a pre-trained single-view reconstruction (SVR) model to map CLIP features to shapes: to begin with, map the CLIP image feature to the detail-rich 3D shape space of the SVR model, then map the CLIP text feature to the 3D shape space through encouraging the CLIP-consistency between rendered images and the input text.

3D Shape Generation

IST-Net: Prior-free Category-level Pose Estimation with Implicit Space Transformation

1 code implementation • ICCV 2023 • Jianhui Liu, Yukang Chen, Xiaoqing Ye, Xiaojuan Qi

Category-level 6D pose estimation aims to predict the poses and sizes of unseen objects from a specific category.

6D Pose Estimation

Learning Context-aware Classifier for Semantic Segmentation

2 code implementations • 21 Mar 2023 • Zhuotao Tian, Jiequan Cui, Li Jiang, Xiaojuan Qi, Xin Lai, Yixin Chen, Shu Liu, Jiaya Jia

Semantic segmentation is still a challenging task because it must parse diverse contexts in different scenes, so a fixed classifier may not handle varying feature distributions well during testing.

Segmentation Semantic Segmentation

VoxelNeXt: Fully Sparse VoxelNet for 3D Object Detection and Tracking

2 code implementations • CVPR 2023 • Yukang Chen, Jianhui Liu, Xiangyu Zhang, Xiaojuan Qi, Jiaya Jia

Our core insight is to predict objects directly based on sparse voxel features, without relying on hand-crafted proxies.

Ranked #1 on 3D Object Detection on Argoverse2

3D Object Detection Object +1

Learning a Room with the Occ-SDF Hybrid: Signed Distance Function Mingled with Occupancy Aids Scene Representation

1 code implementation • ICCV 2023 • Xiaoyang Lyu, Peng Dai, Zizhang Li, Dongyu Yan, Yi Lin, Yifan Peng, Xiaojuan Qi

We found that the color rendering loss results in optimization bias against low-intensity areas, causing gradient vanishing and leaving these areas unoptimized.

Neural Rendering Surface Reconstruction

Understanding Imbalanced Semantic Segmentation Through Neural Collapse

2 code implementations • CVPR 2023 • Zhisheng Zhong, Jiequan Cui, Yibo Yang, Xiaoyang Wu, Xiaojuan Qi, Xiangyu Zhang, Jiaya Jia

Based on our empirical and theoretical analysis, we point out that semantic segmentation naturally brings contextual correlation and imbalanced distribution among classes, which breaks the equiangular and maximally separated structure of neural collapse for both feature centers and classifiers.

3D Semantic Segmentation Segmentation

Command-Driven Articulated Object Understanding and Manipulation

no code implementations • CVPR 2023 • Ruihang Chu, Zhengzhe Liu, Xiaoqing Ye, Xiao Tan, Xiaojuan Qi, Chi-Wing Fu, Jiaya Jia

The key to Cart is to utilize the prediction of object structures to connect visual observations with user commands for effective manipulations.

motion prediction Object +1

Vertical Layering of Quantized Neural Networks for Heterogeneous Inference

no code implementations • 10 Dec 2022 • Hai Wu, Ruifei He, Haoru Tan, Xiaojuan Qi, Kaibin Huang

Experiments show that the proposed vertical-layered representation and the developed once QAT scheme are effective in embodying multiple quantized networks in a single one with one-time training, and that they deliver performance comparable to that of quantized models tailored to any specific bit-width.

Quantization

PLA: Language-Driven Open-Vocabulary 3D Scene Understanding

1 code implementation • CVPR 2023 • Runyu Ding, Jihan Yang, Chuhui Xue, Wenqing Zhang, Song Bai, Xiaojuan Qi

Open-vocabulary scene understanding aims to localize and recognize unseen categories beyond the annotated label space.

Ranked #2 on 3D Open-Vocabulary Instance Segmentation on S3DIS

3D Open-Vocabulary Instance Segmentation Contrastive Learning +4

MGFN: Magnitude-Contrastive Glance-and-Focus Network for Weakly-Supervised Video Anomaly Detection

1 code implementation • 28 Nov 2022 • Yingxian Chen, Zhengzhe Liu, Baoheng Zhang, Wilton Fok, Xiaojuan Qi, Yik-Chung Wu

Weakly supervised detection of anomalies in surveillance videos is a challenging task.

Ranked #2 on Anomaly Detection In Surveillance Videos on UCF-Crime

Anomaly Detection In Surveillance Videos Video Anomaly Detection

Parametric Classification for Generalized Category Discovery: A Baseline Study

2 code implementations • ICCV 2023 • Xin Wen, Bingchen Zhao, Xiaojuan Qi

Generalized Category Discovery (GCD) aims to discover novel categories in unlabelled datasets using knowledge learned from labelled samples.

Ranked #1 on Open-World Semi-Supervised Learning on ImageNet-100

Classification Novel Class Discovery +2

SL3D: Self-supervised-Self-labeled 3D Recognition

1 code implementation • 30 Oct 2022 • Fernando Julio Cendra, Lan Ma, Jiajun Shen, Xiaojuan Qi

SL3D is a generic framework and can be applied to solve different 3D recognition tasks, including classification, object detection, and semantic segmentation.

Ranked #2 on Unsupervised 3D Semantic Segmentation on ScanNetV2

Clustering Object +5

Is synthetic data from generative models ready for image recognition?

1 code implementation • 14 Oct 2022 • Ruifei He, Shuyang Sun, Xin Yu, Chuhui Xue, Wenqing Zhang, Philip Torr, Song Bai, Xiaojuan Qi

Recent text-to-image generation models have shown promising results in generating high-fidelity photo-realistic images.

Text-to-Image Generation Transfer Learning

Prototypical VoteNet for Few-Shot 3D Point Cloud Object Detection

1 code implementation • 11 Oct 2022 • Shizhen Zhao, Xiaojuan Qi

Most existing 3D point cloud object detection approaches heavily rely on large amounts of labeled training data.

Object object-detection +2

In-situ Model Downloading to Realize Versatile Edge AI in 6G Mobile Networks

no code implementations • 7 Oct 2022 • Kaibin Huang, Hai Wu, Zhiyan Liu, Xiaojuan Qi

We further propose a virtualized 6G network architecture customized for deploying in-situ model downloading with the key feature of a three-tier (edge, local, and central) AI library.

Spatial Pruned Sparse Convolution for Efficient 3D Object Detection

no code implementations • 28 Sep 2022 • Jianhui Liu, Yukang Chen, Xiaoqing Ye, Zhuotao Tian, Xiao Tan, Xiaojuan Qi

3D scenes are dominated by a large number of background points, which are redundant for a detection task that mainly needs to focus on foreground objects.

3D Object Detection Object +1

Rethinking Resolution in the Context of Efficient Video Recognition

1 code implementation • 26 Sep 2022 • Chuofan Ma, Qiushan Guo, Yi Jiang, Zehuan Yuan, Ping Luo, Xiaojuan Qi

Our key finding is that the major cause of degradation is not information loss in the down-sampling process, but rather the mismatch between network architecture and input scale.

Knowledge Distillation Video Recognition

ISS: Image as Stepping Stone for Text-Guided 3D Shape Generation

2 code implementations • 9 Sep 2022 • Zhengzhe Liu, Peng Dai, Ruihui Li, Xiaojuan Qi, Chi-Wing Fu

Text-guided 3D shape generation remains challenging due to the absence of large paired text-shape data, the substantial semantic gap between these two modalities, and the structural complexity of 3D shapes.

3D Shape Generation

Towards Efficient and Scale-Robust Ultra-High-Definition Image Demoireing

1 code implementation • 20 Jul 2022 • Xin Yu, Peng Dai, Wenbo Li, Lan Ma, Jiajun Shen, Jia Li, Xiaojuan Qi

With the rapid development of mobile devices, modern widely-used mobile phones typically allow users to capture 4K resolution (i.e., ultra-high-definition) images.

Ranked #1 on Image Restoration on UHDM

4k Image Enhancement +2

Multimodal Transformer for Automatic 3D Annotation and Object Detection

1 code implementation • 20 Jul 2022 • Chang Liu, Xiaoyan Qian, Binxiao Huang, Xiaojuan Qi, Edmund Lam, Siew-Chong Tan, Ngai Wong

By enriching the sparse point clouds, our method achieves 4.48% and 4.03% better 3D AP on KITTI moderate and hard samples, respectively, versus the state-of-the-art autolabeler.

3D Object Detection Object +1

LargeKernel3D: Scaling up Kernels in 3D Sparse CNNs

2 code implementations • CVPR 2023 • Yukang Chen, Jianhui Liu, Xiangyu Zhang, Xiaojuan Qi, Jiaya Jia

Recent advances in 2D CNNs have revealed that large kernels are important.

3D Object Detection Object +3

Unifying Voxel-based Representation with Transformer for 3D Object Detection

1 code implementation • 1 Jun 2022 • Yanwei Li, Yilun Chen, Xiaojuan Qi, Zeming Li, Jian Sun, Jiaya Jia

To this end, the modality-specific space is first designed to represent different inputs in the voxel feature space.

3D Object Detection Object +3

Voxel Field Fusion for 3D Object Detection

1 code implementation • CVPR 2022 • Yanwei Li, Xiaojuan Qi, Yukang Chen, LiWei Wang, Zeming Li, Jian Sun, Jiaya Jia

In this work, we present a conceptually simple yet effective framework for cross-modality 3D object detection, named voxel field fusion.

3D Object Detection Data Augmentation +2

Towards Efficient 3D Object Detection with Knowledge Distillation

1 code implementation • 30 May 2022 • Jihan Yang, Shaoshuai Shi, Runyu Ding, Zhe Wang, Xiaojuan Qi

Then, we build a benchmark to assess existing KD methods developed in the 2D domain for 3D object detection upon six well-constructed teacher-student pairs.

3D Object Detection Knowledge Distillation +3

Self-Supervised Visual Representation Learning with Semantic Grouping

1 code implementation • 30 May 2022 • Xin Wen, Bingchen Zhao, Anlin Zheng, Xiangyu Zhang, Xiaojuan Qi

The semantic grouping is performed by assigning pixels to a set of learnable prototypes, which can adapt to each sample by attentive pooling over the feature and form new slots.

Ranked #15 on Unsupervised Semantic Segmentation on COCO-Stuff-27 (Accuracy metric)

Contrastive Learning Instance Segmentation +6

Video Demoireing with Relation-Based Temporal Consistency

1 code implementation • CVPR 2022 • Peng Dai, Xin Yu, Lan Ma, Baoheng Zhang, Jia Li, Wenbo Li, Jiajun Shen, Xiaojuan Qi

Moire patterns, appearing as color distortions, severely degrade image and video quality when filming a screen with digital cameras.

Relation

DODA: Data-oriented Sim-to-Real Domain Adaptation for 3D Semantic Segmentation

1 code implementation • 4 Apr 2022 • Runyu Ding, Jihan Yang, Li Jiang, Xiaojuan Qi

Deep learning approaches achieve prominent success in 3D semantic segmentation.

3D Semantic Segmentation Segmentation +1

MAP-Gen: An Automated 3D-Box Annotation Flow with Multimodal Attention Point Generator

no code implementations • 29 Mar 2022 • Chang Liu, Xiaoyan Qian, Xiaojuan Qi, Edmund Y. Lam, Siew-Chong Tan, Ngai Wong

While a few previous studies tried to automatically generate 3D bounding boxes from weak labels such as 2D boxes, the quality is sub-optimal compared to human annotators.

object-detection Object Detection

Stratified Transformer for 3D Point Cloud Segmentation

4 code implementations • CVPR 2022 • Xin Lai, Jianhui Liu, Li Jiang, LiWei Wang, Hengshuang Zhao, Shu Liu, Xiaojuan Qi, Jiaya Jia

In this paper, we propose Stratified Transformer that is able to capture long-range contexts and demonstrates strong generalization ability and high performance.

Ranked #14 on Semantic Segmentation on ScanNet

Point Cloud Segmentation Position +1

Towards Implicit Text-Guided 3D Shape Generation

1 code implementation • CVPR 2022 • Zhengzhe Liu, Yi Wang, Xiaojuan Qi, Chi-Wing Fu

In this work, we explore the challenging task of generating 3D shapes from text.

3D Shape Generation

HINT: Hierarchical Neuron Concept Explainer

1 code implementation • CVPR 2022 • Andong Wang, Wei-Ning Lee, Xiaojuan Qi

To this end, we propose HIerarchical Neuron concepT explainer (HINT) to effectively build bidirectional associations between neurons and hierarchical concepts in a low-cost and scalable manner.

Weakly-Supervised Object Localization

Progressive End-to-End Object Detection in Crowded Scenes

2 code implementations • CVPR 2022 • Anlin Zheng, Yuang Zhang, Xiangyu Zhang, Xiaojuan Qi, Jian Sun

Experiments show that our method can significantly boost the performance of query-based detectors in crowded scenes.

Ranked #1 on Object Detection on CrowdHuman

Object object-detection +1

Knowledge Distillation as Efficient Pre-training: Faster Convergence, Higher Data-efficiency, and Better Transferability

1 code implementation • CVPR 2022 • Ruifei He, Shuyang Sun, Jihan Yang, Song Bai, Xiaojuan Qi

Large-scale pre-training has been proven to be crucial for various computer vision tasks.

Knowledge Distillation

TWIST: Two-Way Inter-Label Self-Training for Semi-Supervised 3D Instance Segmentation

no code implementations • CVPR 2022 • Ruihang Chu, Xiaoqing Ye, Zhengzhe Liu, Xiao Tan, Xiaojuan Qi, Chi-Wing Fu, Jiaya Jia

We explore the way to alleviate the label-hungry problem in a semi-supervised setting for 3D instance segmentation.

3D Instance Segmentation Denoising +2

Recursive Least-Squares Estimator-Aided Online Learning for Visual Tracking

2 code implementations • CVPR 2020 • Jin Gao, Yan Lu, Xiaojuan Qi, Yutong Kou, Bing Li, Liang Li, Shan Yu, Weiming Hu

In this paper, we propose a simple yet effective recursive least-squares estimator-aided online learning approach for few-shot online adaptation without requiring offline training.

Continual Learning One-Shot Learning +1

Slot-VPS: Object-centric Representation Learning for Video Panoptic Segmentation

no code implementations • CVPR 2022 • Yi Zhou, Hui Zhang, Hana Lee, Shuyang Sun, Pingjun Li, Yangguang Zhu, ByungIn Yoo, Xiaojuan Qi, Jae-Joon Han

We encode all panoptic entities in a video, including both foreground instances and background semantics, with a unified representation called panoptic slots.

Object Representation Learning +1

Fully Convolutional Networks for Panoptic Segmentation with Point-based Supervision

1 code implementation • 17 Aug 2021 • Yanwei Li, Hengshuang Zhao, Xiaojuan Qi, Yukang Chen, Lu Qi, LiWei Wang, Zeming Li, Jian Sun, Jiaya Jia

In particular, Panoptic FCN encodes each object instance or stuff category with the proposed kernel generator and produces the prediction by convolving the high-resolution feature directly.

Panoptic Segmentation Segmentation +1

ST3D++: Denoised Self-training for Unsupervised Domain Adaptation on 3D Object Detection

no code implementations • 15 Aug 2021 • Jihan Yang, Shaoshuai Shi, Zhe Wang, Hongsheng Li, Xiaojuan Qi

These specific designs enable the detector to be trained on meticulously refined pseudo labeled target data with denoised training signals, and thus effectively facilitate adapting an object detector to a target domain without requiring annotations.

3D Object Detection Data Augmentation +5

Multilevel Knowledge Transfer for Cross-Domain Object Detection

no code implementations • 2 Aug 2021 • Botos Csaba, Xiaojuan Qi, Arslan Chaudhry, Puneet Dokania, Philip Torr

The key ingredients of our approach are: (a) mapping the source to the target domain at the pixel level; (b) training a teacher network on the mapped source and the unannotated target domain using adversarial feature alignment; and (c) finally training a student network using the pseudo-labels obtained from the teacher.

Object object-detection +2

Re-distributing Biased Pseudo Labels for Semi-supervised Semantic Segmentation: A Baseline Investigation

1 code implementation • ICCV 2021 • Ruifei He, Jihan Yang, Xiaojuan Qi

In this paper, we present a simple and yet effective Distribution Alignment and Random Sampling (DARS) method to produce unbiased pseudo labels that match the true class distribution estimated from the labeled data.

Data Augmentation Segmentation +1

3D-to-2D Distillation for Indoor Scene Parsing

1 code implementation • CVPR 2021 • Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu

First, we distill 3D knowledge from a pretrained 3D network to supervise a 2D network to learn simulated 3D features from 2D features during the training, so the 2D network can infer without requiring 3D data.

Scene Parsing Semantic Parsing +1

One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation

2 code implementations • CVPR 2021 • Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu

Point cloud semantic segmentation often requires large-scale annotated training data, but clearly, point-wise labels are too tedious to prepare.

3D Semantic Segmentation Relation Network +1

PAConv: Position Adaptive Convolution with Dynamic Kernel Assembling on Point Clouds

2 code implementations • CVPR 2021 • Mutian Xu, Runyu Ding, Hengshuang Zhao, Xiaojuan Qi

The key to PAConv is constructing the convolution kernel by dynamically assembling basic weight matrices stored in a Weight Bank, where the coefficients of these weight matrices are self-adaptively learned from point positions through ScoreNet.

Ranked #2 on Point Cloud Segmentation on PointCloud-C

3D Point Cloud Classification Point Cloud Classification +2

534

AET-EFN: A Versatile Design for Static and Dynamic Event-Based Vision

no code implementations • 22 Mar 2021 • Chang Liu, Xiaojuan Qi, Edmund Lam, Ngai Wong

The neuromorphic event cameras, which capture the optical changes of a scene, have drawn increasing attention due to their high speed and low power consumption.

Event-based vision

ST3D: Self-training for Unsupervised Domain Adaptation on 3D Object Detection

1 code implementation • CVPR 2021 • Jihan Yang, Shaoshuai Shi, Zhe Wang, Hongsheng Li, Xiaojuan Qi

Then, the detector is iteratively improved on the target domain by alternately conducting two steps: pseudo label updating with the developed quality-aware triplet memory bank, and model training with curriculum data augmentation.

3D Object Detection Data Augmentation +4

282

Aggregation With Feature Detection

no code implementations • ICCV 2021 • Shuyang Sun, Xiaoyu Yue, Xiaojuan Qi, Wanli Ouyang, Victor Adrian Prisacariu, Philip H.S. Torr

Aggregating features from different depths of a network is widely adopted to improve the network capability.

Instance Segmentation object-detection +2

Learning Geometry-Disentangled Representation for Complementary Understanding of 3D Object Point Cloud

3 code implementations • 20 Dec 2020 • Mutian Xu, Junhao Zhang, Zhipeng Zhou, Mingye Xu, Xiaojuan Qi, Yu Qiao

GDANet introduces a Geometry-Disentangle Module to dynamically disentangle point clouds into the contour and flat parts of 3D objects, respectively denoted by sharp and gentle variation components.

Ranked #1 on Point Cloud Segmentation on PointCloud-C

3D Object Classification 3D Part Segmentation +2

GeoNet++: Iterative Geometric Neural Network with Edge-Aware Refinement for Joint Depth and Surface Normal Estimation

2 code implementations • 13 Dec 2020 • Xiaojuan Qi, Zhengzhe Liu, Renjie Liao, Philip H. S. Torr, Raquel Urtasun, Jiaya Jia

Note that GeoNet++ is generic and can be used in other depth/normal prediction frameworks to improve the quality of 3D reconstruction and pixel-wise accuracy of depth and surface normals.

3D Reconstruction Depth Estimation +2

119

Fully Convolutional Networks for Panoptic Segmentation

6 code implementations • CVPR 2021 • Yanwei Li, Hengshuang Zhao, Xiaojuan Qi, Liwei Wang, Zeming Li, Jian Sun, Jiaya Jia

In this paper, we present a conceptually simple, strong, and efficient framework for panoptic segmentation, called Panoptic FCN.

Ranked #1 on Panoptic Segmentation on COCO minival (SQ metric)

Panoptic Segmentation Segmentation

388

Object-aware Feature Aggregation for Video Object Detection

no code implementations • 23 Oct 2020 • Qichuan Geng, Hong Zhang, Na Jiang, Xiaojuan Qi, Liangjun Zhang, Zhong Zhou

As a consequence, augmenting features with such prior knowledge can effectively improve the classification and localization performance.

Object object-detection +2

Lightweight Generative Adversarial Networks for Text-Guided Image Manipulation

1 code implementation • NeurIPS 2020 • Bowen Li, Xiaojuan Qi, Philip H. S. Torr, Thomas Lukasiewicz

To achieve this, a new word-level discriminator is proposed, which provides the generator with fine-grained training feedback at word-level, to facilitate training a lightweight generator that has a small number of parameters, but can still correctly focus on specific visual attributes of an image, and then edit them without affecting other contents that are not described in the text.

Generative Adversarial Network Image Manipulation +1

Edge Guided GANs with Contrastive Learning for Semantic Image Synthesis

2 code implementations • 31 Mar 2020 • Hao Tang, Xiaojuan Qi, Guolei Sun, Dan Xu, Nicu Sebe, Radu Timofte, Luc van Gool

We propose a novel ECGAN for the challenging semantic image synthesis task.

Contrastive Learning Image Generation

Image-to-Image Translation with Text Guidance

no code implementations • 12 Feb 2020 • Bowen Li, Xiaojuan Qi, Philip H. S. Torr, Thomas Lukasiewicz

The goal of this paper is to embed controllable factors, i.e., natural language descriptions, into image-to-image translation with generative adversarial networks, which allows text descriptions to determine the visual attributes of synthetic images.

Image-to-Image Translation Part-Of-Speech Tagging +1

Global Texture Enhancement for Fake Face Detection in the Wild

1 code implementation • CVPR 2020 • Zhengzhe Liu, Xiaojuan Qi, Philip Torr

In this paper, we conduct an empirical study on fake/real faces, and have two important observations: firstly, the texture of fake faces is substantially different from real ones; secondly, global texture statistics are more robust to image editing and transferable to fake faces from different GANs and datasets.

Face Detection Fake Image Detection

Gated Path Selection Network for Semantic Segmentation

no code implementations • 19 Jan 2020 • Qichuan Geng, Hong Zhang, Xiaojuan Qi, Ruigang Yang, Zhong Zhou, Gao Huang

Semantic segmentation is a challenging task that needs to handle large scale variations, deformations and different viewpoints.

Segmentation Semantic Segmentation

Unifying Training and Inference for Panoptic Segmentation

no code implementations • CVPR 2020 • Qizhu Li, Xiaojuan Qi, Philip H. S. Torr

This panoptic submodule gives rise to a novel propagation mechanism for panoptic logits and enables the network to output a coherent panoptic segmentation map for both "stuff" and "thing" classes, without any post-processing.

Panoptic Segmentation Segmentation

Few-shot Action Recognition with Permutation-invariant Attention

1 code implementation • ECCV 2020 • Hongguang Zhang, Li Zhang, Xiaojuan Qi, Hongdong Li, Philip H. S. Torr, Piotr Koniusz

Such encoded blocks are aggregated by permutation-invariant pooling to make our approach robust to varying action lengths and long-range temporal dependencies whose patterns are unlikely to repeat even in clips of the same class.

Ranked #6 on Few Shot Action Recognition on Kinetics-100

Few-Shot action recognition Few Shot Action Recognition +3

An Adversarial Perturbation Oriented Domain Adaptation Approach for Semantic Segmentation

no code implementations • 18 Dec 2019 • Jihan Yang, Ruijia Xu, Ruiyu Li, Xiaojuan Qi, Xiaoyong Shen, Guanbin Li, Liang Lin

In contrast to adversarial alignment, we propose to explicitly train a domain-invariant classifier by generating and defending against pointwise feature-space adversarial perturbations.

Position Segmentation +2

ManiGAN: Text-Guided Image Manipulation

3 code implementations • 12 Dec 2019 • Bowen Li, Xiaojuan Qi, Thomas Lukasiewicz, Philip H. S. Torr

The goal of our paper is to semantically edit parts of an image matching a given text that describes desired attributes (e.g., texture, colour, and background), while preserving other contents that are irrelevant to the text.

Generative Adversarial Network Image Manipulation +1

366

Domain-invariant Stereo Matching Networks

1 code implementation • ECCV 2020 • Feihu Zhang, Xiaojuan Qi, Ruigang Yang, Victor Prisacariu, Benjamin Wah, Philip Torr

State-of-the-art stereo matching networks have difficulties in generalizing to new unseen environments due to significant domain differences, such as color, illumination, contrast, and texture.

Stereo Matching

225

Controllable Text-to-Image Generation

2 code implementations • NeurIPS 2019 • Bowen Li, Xiaojuan Qi, Thomas Lukasiewicz, Philip H. S. Torr

In this paper, we propose a novel controllable text-to-image generative adversarial network (ControlGAN), which can effectively synthesise high-quality images and also control parts of the image generation according to natural language descriptions.

Ranked #7 on Text-to-Image Generation on Multi-Modal-CelebA-HQ

Generative Adversarial Network Text-to-Image Generation

163

Improved Techniques for Training Adaptive Deep Networks

2 code implementations • ICCV 2019 • Hao Li, Hong Zhang, Xiaojuan Qi, Ruigang Yang, Gao Huang

Adaptive inference is a promising technique to improve the computational efficiency of deep models at test time.

Computational Efficiency Knowledge Distillation

3D Motion Decomposition for RGBD Future Dynamic Scene Synthesis

no code implementations • CVPR 2019 • Xiaojuan Qi, Zhengzhe Liu, Qifeng Chen, Jiaya Jia

A future video is the 2D projection of a 3D scene with predicted camera and object motion.

Video Prediction

The Liver Tumor Segmentation Benchmark (LiTS)

6 code implementations • 13 Jan 2019 • Patrick Bilic, Patrick Christ, Hongwei Bran Li, Eugene Vorontsov, Avi Ben-Cohen, Georgios Kaissis, Adi Szeskin, Colin Jacobs, Gabriel Efrain Humpire Mamani, Gabriel Chartrand, Fabian Lohöfer, Julian Walter Holch, Wieland Sommer, Felix Hofmann, Alexandre Hostettler, Naama Lev-Cohain, Michal Drozdzal, Michal Marianne Amitai, Refael Vivantik, Jacob Sosna, Ivan Ezhov, Anjany Sekuboyina, Fernando Navarro, Florian Kofler, Johannes C. Paetzold, Suprosanna Shit, Xiaobin Hu, Jana Lipková, Markus Rempfler, Marie Piraud, Jan Kirschke, Benedikt Wiestler, Zhiheng Zhang, Christian Hülsemeyer, Marcel Beetz, Florian Ettlinger, Michela Antonelli, Woong Bae, Míriam Bellver, Lei Bi, Hao Chen, Grzegorz Chlebus, Erik B. Dam, Qi Dou, Chi-Wing Fu, Bogdan Georgescu, Xavier Giró-i-Nieto, Felix Gruen, Xu Han, Pheng-Ann Heng, Jürgen Hesser, Jan Hendrik Moltz, Christian Igel, Fabian Isensee, Paul Jäger, Fucang Jia, Krishna Chaitanya Kaluva, Mahendra Khened, Ildoo Kim, Jae-Hun Kim, Sungwoong Kim, Simon Kohl, Tomasz Konopczynski, Avinash Kori, Ganapathy Krishnamurthi, Fan Li, Hongchao Li, Junbo Li, Xiaomeng Li, John Lowengrub, Jun Ma, Klaus Maier-Hein, Kevis-Kokitsi Maninis, Hans Meine, Dorit Merhof, Akshay Pai, Mathias Perslev, Jens Petersen, Jordi Pont-Tuset, Jin Qi, Xiaojuan Qi, Oliver Rippel, Karsten Roth, Ignacio Sarasua, Andrea Schenk, Zengming Shen, Jordi Torres, Christian Wachinger, Chunliang Wang, Leon Weninger, Jianrong Wu, Daguang Xu, Xiaoping Yang, Simon Chun-Ho Yu, Yading Yuan, Miao Yu, Liping Zhang, Jorge Cardoso, Spyridon Bakas, Rickmer Braren, Volker Heinemann, Christopher Pal, An Tang, Samuel Kadoury, Luc Soler, Bram van Ginneken, Hayit Greenspan, Leo Joskowicz, Bjoern Menze

In this work, we report the set-up and results of the Liver Tumor Segmentation Benchmark (LiTS), which was organized in conjunction with the IEEE International Symposium on Biomedical Imaging (ISBI) 2017 and the International Conferences on Medical Image Computing and Computer-Assisted Intervention (MICCAI) 2017 and 2018.

Benchmarking Computed Tomography (CT) +3

457

Human Pose Estimation with Spatial Contextual Information

no code implementations • 7 Jan 2019 • Hong Zhang, Hao Ouyang, Shu Liu, Xiaojuan Qi, Xiaoyong Shen, Ruigang Yang, Jiaya Jia

With this principle, we present two conceptually simple and yet computationally efficient modules, namely Cascade Prediction Fusion (CPF) and Pose Graph Neural Network (PGNN), to exploit underlying contextual information.

Ranked #10 on Pose Estimation on MPII Human Pose

Pose Estimation

Image Inpainting via Generative Multi-column Convolutional Neural Networks

2 code implementations • NeurIPS 2018 • Yi Wang, Xin Tao, Xiaojuan Qi, Xiaoyong Shen, Jiaya Jia

In this paper, we propose a generative multi-column network for image inpainting.

Image Inpainting

417

GAL: Geometric Adversarial Loss for Single-View 3D-Object Reconstruction

no code implementations • ECCV 2018 • Li Jiang, Shaoshuai Shi, Xiaojuan Qi, Jiaya Jia

We propose to add geometric adversarial loss (GAL).

3D Object Reconstruction

Referring Image Segmentation via Recurrent Refinement Networks

1 code implementation • CVPR 2018 • Ruiyu Li, Kaican Li, Yi-Chun Kuo, Michelle Shu, Xiaojuan Qi, Xiaoyong Shen, Jiaya Jia

We address the problem of image segmentation from natural language descriptions.

Image Segmentation Referring Expression Segmentation +2

GeoNet: Geometric Neural Network for Joint Depth and Surface Normal Estimation

1 code implementation • CVPR 2018 • Xiaojuan Qi, Renjie Liao, Zhengzhe Liu, Raquel Urtasun, Jiaya Jia

In this paper, we propose Geometric Neural Network (GeoNet) to jointly predict depth and surface normal maps from a single image.

Depth Estimation Surface Normal Estimation

119

Semi-parametric Image Synthesis

1 code implementation • CVPR 2018 • Xiaojuan Qi, Qifeng Chen, Jiaya Jia, Vladlen Koltun

We present a semi-parametric approach to photographic image synthesis from semantic layouts.

Ranked #6 on Image-to-Image Translation on ADE20K-Outdoor Labels-to-Photos

Image-to-Image Translation Semantic Segmentation

270

Semantically Consistent Image Completion with Fine-grained Details

no code implementations • 26 Nov 2017 • Pengpeng Liu, Xiaojuan Qi, Pinjia He, Yikang Li, Michael R. Lyu, Irwin King

Image completion has achieved significant progress due to advances in generative adversarial networks (GANs).

Image Inpainting

3D Graph Neural Networks for RGBD Semantic Segmentation

2 code implementations • ICCV 2017 • Xiaojuan Qi, Renjie Liao, Jiaya Jia, Sanja Fidler, Raquel Urtasun

Each node in the graph corresponds to a set of points and is associated with a hidden representation vector initialized with an appearance feature extracted by a unary CNN from 2D images.

Ranked #30 on Semantic Segmentation on SUN-RGBD (using extra training data)

RGBD Semantic Segmentation Semantic Segmentation

227

H-DenseUNet: Hybrid Densely Connected UNet for Liver and Tumor Segmentation from CT Volumes

2 code implementations • 21 Sep 2017 • Xiaomeng Li, Hao Chen, Xiaojuan Qi, Qi Dou, Chi-Wing Fu, Pheng Ann Heng

Our method outperformed other state-of-the-art methods on tumor segmentation and achieved very competitive performance for liver segmentation even with a single model.

Ranked #1 on Liver Segmentation on LiTS2017 (Dice metric)

Automatic Liver And Tumor Segmentation Image Segmentation +4

522

ICNet for Real-Time Semantic Segmentation on High-Resolution Images

17 code implementations • ECCV 2018 • Hengshuang Zhao, Xiaojuan Qi, Xiaoyong Shen, Jianping Shi, Jiaya Jia

We focus on the challenging task of real-time semantic segmentation in this paper.

Ranked #11 on Dichotomous Image Segmentation on DIS-TE4

Dichotomous Image Segmentation Real-Time Semantic Segmentation +3

2,917

Pyramid Scene Parsing Network

67 code implementations • CVPR 2017 • Hengshuang Zhao, Jianping Shi, Xiaojuan Qi, Xiaogang Wang, Jiaya Jia

Scene parsing is challenging for unrestricted open vocabulary and diverse scenes.

Ranked #4 on Video Semantic Segmentation on Cityscapes val

Dichotomous Image Segmentation Image Classification +5

76,589

Multi-Scale Patch Aggregation (MPA) for Simultaneous Detection and Segmentation

no code implementations • CVPR 2016 • Shu Liu, Xiaojuan Qi, Jianping Shi, Hong Zhang, Jiaya Jia

Aiming at simultaneous detection and segmentation (SDS), we propose a proposal-free framework, which detects and segments object instances via mid-level patches.

Object Object Proposal Generation +1