1 code implementation • 3 Apr 2023 • Zhuoling Li, Chuanrui Zhang, Wei-Chiu Ma, Yipin Zhou, Linyan Huang, Haoqian Wang, SerNam Lim, Hengshuang Zhao
In recent years, transformer-based detectors have demonstrated remarkable performance in 2D visual perception tasks.
1 code implementation • CVPR 2023 • Xiaoyang Wu, Xin Wen, Xihui Liu, Hengshuang Zhao
As a pioneering work, PointContrast conducts unsupervised 3D representation learning by applying contrastive learning to raw RGB-D frames and proves its effectiveness on various downstream tasks.
Ranked #2 on
Semantic Segmentation
on ScanNet
(using extra training data)
1 code implementation • CVPR 2023 • Zhenyu Wang, YaLi Li, Xi Chen, Ser-Nam Lim, Antonio Torralba, Hengshuang Zhao, Shengjin Wang
In this paper, we formally address universal object detection, which aims to detect every scene and predict every category.
no code implementations • 21 Mar 2023 • Haoheng Lan, Jindong Gu, Philip Torr, Hengshuang Zhao
In this work, we explore backdoor attacks on segmentation models that misclassify all pixels of a victim class by injecting a specific trigger on non-victim pixels during inference, which we dub the Influencer Backdoor Attack (IBA).
no code implementations • 20 Mar 2023 • Xi Chen, Shuang Li, Ser-Nam Lim, Antonio Torralba, Hengshuang Zhao
Open-vocabulary image segmentation is attracting increasing attention due to its critical applications in the real world.
no code implementations • 20 Mar 2023 • Xi Chen, Yau Shing Jonathan Cheung, Ser-Nam Lim, Hengshuang Zhao
We hope this could serve as a more powerful and general solution for interactive segmentation.
no code implementations • 14 Mar 2023 • Zhening Huang, Xiaoyang Wu, Hengshuang Zhao, Lei Zhu, Shujun Wang, Georgios Hadjidemetriou, Ioannis Brilakis
For feature aggregation, it improves feature modeling by allowing the network to learn from both local points and neighboring geometry partitions, resulting in an enlarged data-tailored receptive field.
no code implementations • 11 Mar 2023 • Zhao Yang, Jiaqi Wang, Yansong Tang, Kai Chen, Hengshuang Zhao, Philip H. S. Torr
Referring image segmentation segments an image from a language expression.
no code implementations • 7 Feb 2023 • Zitong Yu, Yuming Shen, Jingang Shi, Hengshuang Zhao, Yawen Cui, Jiehua Zhang, Philip Torr, Guoying Zhao
As key modules in PhysFormer, the temporal difference transformers first enhance the quasi-periodic rPPG features with temporal difference guided global attention, and then refine the local spatio-temporal representation against interference.
no code implementations • CVPR 2023 • Zitian Chen, Yikang Shen, Mingyu Ding, Zhenfang Chen, Hengshuang Zhao, Erik G. Learned-Miller, Chuang Gan
To address the MTL challenge, we propose Mod-Squad, a new model that is Modularized into groups of experts (a 'Squad').
no code implementations • 15 Dec 2022 • Zitian Chen, Yikang Shen, Mingyu Ding, Zhenfang Chen, Hengshuang Zhao, Erik Learned-Miller, Chuang Gan
To address the MTL challenge, we propose Mod-Squad, a new model that is Modularized into groups of experts (a 'Squad').
no code implementations • 11 Dec 2022 • Xiaogang Xu, Hengshuang Zhao, Philip Torr, Jiaya Jia
In this paper, we use Deep Generative Networks (DGNs) with a novel training mechanism to eliminate the distribution gap.
1 code implementation • 8 Nov 2022 • Yifei Zhou, Zilu Li, Abhinav Shrivastava, Hengshuang Zhao, Antonio Torralba, Taipeng Tian, Ser-Nam Lim
In this way, the new representation can be directly compared with the old representation, in principle avoiding the need for any backfilling.
2 code implementations • 11 Oct 2022 • Xiaoyang Wu, Yixing Lao, Li Jiang, Xihui Liu, Hengshuang Zhao
In this work, we analyze the limitations of the Point Transformer and propose our powerful and efficient Point Transformer V2 model with novel designs that overcome the limitations of previous work.
Ranked #2 on
Semantic Segmentation
on S3DIS Area5
no code implementations • 25 Jul 2022 • Jindong Gu, Hengshuang Zhao, Volker Tresp, Philip Torr
Since SegPGD can create more effective adversarial examples, the adversarial training with our SegPGD can boost the robustness of segmentation models.
1 code implementation • 20 Jul 2022 • Xin Lai, Zhuotao Tian, Xiaogang Xu, Yingcong Chen, Shu Liu, Hengshuang Zhao, LiWei Wang, Jiaya Jia
Unsupervised domain adaptation in semantic segmentation has been raised to alleviate the reliance on expensive pixel-wise annotations.
no code implementations • 14 Jul 2022 • Xiaogang Xu, Hengshuang Zhao
Different from existing methods, UADA would adaptively update DA's parameters according to the target model's gradient information during training: given a pre-defined set of DA operations, we randomly decide types and magnitudes of DA operations for every data batch during training, and adaptively update DA's parameters along the gradient direction of the loss concerning DA's parameters.
1 code implementation • CVPR 2022 • Xi Chen, Zhiyan Zhao, Yilei Zhang, Manni Duan, Donglian Qi, Hengshuang Zhao
To make the model work with preexisting masks, we formulate a sub-task termed Interactive Mask Correction, and propose Progressive Merge as the solution.
Ranked #1 on
Interactive Segmentation
on GrabCut
(using extra training data)
4 code implementations • CVPR 2022 • Xin Lai, Jianhui Liu, Li Jiang, LiWei Wang, Hengshuang Zhao, Shu Liu, Xiaojuan Qi, Jiaya Jia
In this paper, we propose Stratified Transformer that is able to capture long-range contexts and demonstrates strong generalization ability and high performance.
Ranked #5 on
Semantic Segmentation
on ScanNet
1 code implementation • CVPR 2022 • Zhao Yang, Jiaqi Wang, Yansong Tang, Kai Chen, Hengshuang Zhao, Philip H. S. Torr
Referring image segmentation is a fundamental vision-language task that aims to segment out an object referred to by a natural language expression from an image.
Ranked #3 on
Referring Expression Segmentation
on RefCOCOg-test
1 code implementation • CVPR 2022 • Zitong Yu, Yuming Shen, Jingang Shi, Hengshuang Zhao, Philip Torr, Guoying Zhao
Remote photoplethysmography (rPPG), which aims at measuring heart activities and physiological signals from facial video without any contact, has great potential in many applications (e.g., remote healthcare and affective computing).
no code implementations • 22 Nov 2021 • Jindong Gu, Hengshuang Zhao, Volker Tresp, Philip Torr
The high transferability achieved by our method shows that, in contrast to the observations in previous work, adversarial examples on a segmentation model can be easy to transfer to other segmentation models.
no code implementations • British Machine Vision Conference 2021 • Zhao Yang, Yansong Tang, Luca Bertinetto, Hengshuang Zhao, Philip Torr
In this paper, we investigate the problem of video object segmentation from referring expressions (VOSRE).
Ranked #1 on
Referring Expression Segmentation
on J-HMDB
(Precision@0.9 metric)
Optical Flow Estimation
Referring Expression Segmentation
1 code implementation • 17 Aug 2021 • Yanwei Li, Hengshuang Zhao, Xiaojuan Qi, Yukang Chen, Lu Qi, LiWei Wang, Zeming Li, Jian Sun, Jiaya Jia
In particular, Panoptic FCN encodes each object instance or stuff category with the proposed kernel generator and produces the prediction by convolving the high-resolution feature directly.
Panoptic Segmentation
Weakly-supervised panoptic segmentation
2 code implementations • 29 Jul 2021 • Lu Qi, Jason Kuen, Yi Wang, Jiuxiang Gu, Hengshuang Zhao, Zhe Lin, Philip Torr, Jiaya Jia
By removing the need for class label prediction, the models trained for such a task can focus more on improving segmentation quality.
1 code implementation • NeurIPS 2021 • Zhongdao Wang, Hengshuang Zhao, Ya-Li Li, Shengjin Wang, Philip H. S. Torr, Luca Bertinetto
We show how most tracking tasks can be solved within this framework, and that the same appearance model can be successfully used to obtain results that are competitive against specialised methods for most of the tasks considered.
Ranked #2 on
Video Object Segmentation
on DAVIS 2017
(mIoU metric)
Multi-Object Tracking
Multi-Object Tracking and Segmentation
2 code implementations • CVPR 2021 • Xin Lai, Zhuotao Tian, Li Jiang, Shu Liu, Hengshuang Zhao, LiWei Wang, Jiaya Jia
Semantic segmentation has made tremendous progress in recent years.
1 code implementation • 4 May 2021 • Zitong Yu, Yunxiao Qin, Hengshuang Zhao, Xiaobai Li, Guoying Zhao
In this paper, we propose two Cross Central Difference Convolutions (C-CDC), which exploit the difference of the center and surround sparse local features from the horizontal/vertical and diagonal directions, respectively.
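As a rough illustration of the idea in the abstract above, the sketch below computes a central-difference response on a single 3x3 patch, restricted either to the horizontal/vertical neighbours or to the diagonal neighbours. The function name, the offset sets, and the pure-difference form (with no vanilla convolution term) are assumptions made for illustration, not the authors' implementation.

```python
# Hypothetical sketch: central difference response on one 3x3 patch,
# using only a sparse "cross" subset of the eight neighbours.
HV_OFFSETS = [(-1, 0), (1, 0), (0, -1), (0, 1)]    # horizontal/vertical
DG_OFFSETS = [(-1, -1), (-1, 1), (1, -1), (1, 1)]  # diagonal


def cross_central_difference(patch, weights, offsets):
    """Sum of weights[n] * (patch[n] - centre) over the chosen neighbours.

    `patch` and `weights` are 3x3 nested lists; `offsets` selects which
    neighbours of the centre element contribute.
    """
    center = patch[1][1]
    total = 0.0
    for d_row, d_col in offsets:
        row, col = 1 + d_row, 1 + d_col
        total += weights[row][col] * (patch[row][col] - center)
    return total


ones = [[1.0] * 3 for _ in range(3)]

# A linear ramp has symmetric differences around the centre.
ramp = [[1.0, 2.0, 3.0],
        [4.0, 5.0, 6.0],
        [7.0, 8.0, 9.0]]
hv_ramp = cross_central_difference(ramp, ones, HV_OFFSETS)
dg_ramp = cross_central_difference(ramp, ones, DG_OFFSETS)

# An isolated spike at the centre produces a strong negative response.
spike = [[0.0, 0.0, 0.0],
         [0.0, 1.0, 0.0],
         [0.0, 0.0, 0.0]]
hv_spike = cross_central_difference(spike, ones, HV_OFFSETS)
```

With uniform weights, the ramp gives a zero response in both directions while the spike does not, which is the local-gradient sensitivity that difference-based operators are meant to capture.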
5 code implementations • CVPR 2021 • Pengguang Chen, Shu Liu, Hengshuang Zhao, Jiaya Jia
Knowledge distillation transfers knowledge from the teacher network to the student one, with the goal of greatly improving the performance of the student network.
Ranked #9 on
Knowledge Distillation
on CIFAR-100
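For readers unfamiliar with the general idea of teacher-to-student transfer, the following minimal sketch shows the classic soft-target distillation loss: a temperature-softened KL divergence between teacher and student outputs. This is the generic textbook formulation, not the method proposed in the paper listed above; the function names and the default temperature are assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on temperature-softened distributions.

    The T^2 factor is the usual rescaling that keeps gradient magnitudes
    comparable across temperatures.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(
        pt * math.log(pt / ps)
        for pt, ps in zip(p_teacher, p_student)
        if pt > 0.0
    )
    return (temperature ** 2) * kl


same = distillation_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
different = distillation_loss([3.0, 2.0, 1.0], [1.0, 2.0, 3.0])
```

The loss is zero when the student reproduces the teacher's distribution and positive otherwise; in practice it is usually combined with the ordinary cross-entropy loss on ground-truth labels.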
1 code implementation • CVPR 2021 • WenBo Hu, Hengshuang Zhao, Li Jiang, Jiaya Jia, Tien-Tsin Wong
Via the BPM, complementary 2D and 3D information can interact with each other in multiple architectural levels, such that advantages in these two visual domains can be combined for better scene recognition.
Ranked #7 on
Semantic Segmentation
on ScanNet
2 code implementations • CVPR 2021 • Mutian Xu, Runyu Ding, Hengshuang Zhao, Xiaojuan Qi
The key of PAConv is to construct the convolution kernel by dynamically assembling basic weight matrices stored in Weight Bank, where the coefficients of these weight matrices are self-adaptively learned from point positions through ScoreNet.
Ranked #2 on
Point Cloud Segmentation
on PointCloud-C
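The kernel assembly described in the PAConv abstract above can be sketched roughly as follows: a small scoring function maps a relative point position to one coefficient per bank matrix, and the kernel is the coefficient-weighted sum of those matrices. The single linear layer with softmax used here as a stand-in for ScoreNet, the matrix shapes, and all names are assumptions for illustration only.

```python
import math


def score_net(rel_pos, score_weights):
    """Hypothetical stand-in for ScoreNet: one linear layer plus softmax.

    `rel_pos` is a relative position (x, y, z); `score_weights` holds one
    row of three weights per matrix in the weight bank.
    """
    logits = [sum(w * x for w, x in zip(row, rel_pos)) for row in score_weights]
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def assemble_kernel(weight_bank, scores):
    """Weighted sum of the bank matrices, one coefficient per matrix."""
    rows = len(weight_bank[0])
    cols = len(weight_bank[0][0])
    return [
        [sum(s * mat[r][c] for s, mat in zip(scores, weight_bank))
         for c in range(cols)]
        for r in range(rows)
    ]


# Toy weight bank with two 2x2 matrices (sizes chosen for readability).
weight_bank = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.0, 1.0], [1.0, 0.0]],
]
score_weights = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

# At zero offset both logits are equal, so the coefficients are uniform.
scores = score_net([0.0, 0.0, 0.0], score_weights)
kernel = assemble_kernel(weight_bank, scores)
```

Because the coefficients depend on the relative position, each neighbouring point effectively gets its own kernel while the number of learned matrices stays fixed.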
no code implementations • 1 Jan 2021 • Xiaogang Xu, Hengshuang Zhao, Philip Torr, Jiaya Jia
Specifically, compared with previous methods, we propose a more efficient pixel-level training constraint to weaken the hardness of aligning adversarial samples to clean samples, which can thus obviously enhance the robustness on adversarial samples.
5 code implementations • CVPR 2021 • Sixiao Zheng, Jiachen Lu, Hengshuang Zhao, Xiatian Zhu, Zekun Luo, Yabiao Wang, Yanwei Fu, Jianfeng Feng, Tao Xiang, Philip H. S. Torr, Li Zhang
In this paper, we aim to provide an alternative perspective by treating semantic segmentation as a sequence-to-sequence prediction task.
Ranked #1 on
Semantic Segmentation
on FoodSeg103
(using extra training data)
17 code implementations • ICCV 2021 • Hengshuang Zhao, Li Jiang, Jiaya Jia, Philip Torr, Vladlen Koltun
For example, on the challenging S3DIS dataset for large-scale semantic scene segmentation, the Point Transformer attains an mIoU of 70.4% on Area 5, outperforming the strongest prior model by 3.3 absolute percentage points and crossing the 70% mIoU threshold for the first time.
Ranked #3 on
3D Semantic Segmentation
on STPLS3D
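The mIoU figure quoted in the abstract above is a mean of per-class intersection-over-union scores. The sketch below shows one common way to compute it from flat label lists; the function name and the choice to skip classes that appear in neither prediction nor ground truth are assumptions, and benchmark protocols may differ in such details.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union across classes.

    `pred` and `target` are equal-length lists of integer class labels.
    Classes absent from both lists are skipped rather than counted as zero.
    """
    ious = []
    for cls in range(num_classes):
        intersection = sum(
            1 for p, t in zip(pred, target) if p == cls and t == cls
        )
        union = sum(
            1 for p, t in zip(pred, target) if p == cls or t == cls
        )
        if union > 0:
            ious.append(intersection / union)
    return sum(ious) / len(ious) if ious else 0.0


pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
```

In this toy case the per-class IoUs are 1/3, 2/3 and 1/2, so the mean is 0.5.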
6 code implementations • CVPR 2021 • Yanwei Li, Hengshuang Zhao, Xiaojuan Qi, LiWei Wang, Zeming Li, Jian Sun, Jiaya Jia
In this paper, we present a conceptually simple, strong, and efficient framework for panoptic segmentation, called Panoptic FCN.
Ranked #1 on
Panoptic Segmentation
on Mapillary val
(PQth metric)
1 code implementation • CVPR 2022 • Zhuotao Tian, Xin Lai, Li Jiang, Shu Liu, Michelle Shu, Hengshuang Zhao, Jiaya Jia
Then, since context is essential for semantic segmentation, we propose the Context-Aware Prototype Learning (CAPL) that significantly improves performance by 1) leveraging the co-occurrence prior knowledge from support samples, and 2) dynamically enriching contextual information to the classifier, conditioned on the content of each query image.
Generalized Few-Shot Semantic Segmentation
Semantic Segmentation
3 code implementations • 4 Aug 2020 • Zhuotao Tian, Hengshuang Zhao, Michelle Shu, Zhicheng Yang, Ruiyu Li, Jiaya Jia
It consists of novel designs of (1) a training-free prior mask generation method that not only retains generalization power but also improves model performance and (2) Feature Enrichment Module (FEM) that overcomes spatial inconsistency by adaptively enriching query features with support features and prior masks.
Ranked #54 on
Few-Shot Semantic Segmentation
on COCO-20i (1-shot)
1 code implementation • CVPR 2020 • Hengshuang Zhao, Jiaya Jia, Vladlen Koltun
Recent work has shown that self-attention can serve as a basic building block for image recognition models.
2 code implementations • CVPR 2020 • Li Jiang, Hengshuang Zhao, Shaoshuai Shi, Shu Liu, Chi-Wing Fu, Jiaya Jia
Instance segmentation is an important task for scene understanding.
Ranked #5 on
3D Instance Segmentation
on STPLS3D
1 code implementation • ICCV 2021 • Xiaogang Xu, Hengshuang Zhao, Jiaya Jia
Adversarial training is promising for improving robustness of deep neural networks towards adversarial perturbations, especially on the classification task.
7 code implementations • 13 Jan 2020 • Pengguang Chen, Shu Liu, Hengshuang Zhao, Jiaya Jia
Then we show the limitations of existing information dropping algorithms and propose our structured method, which is simple yet very effective.
no code implementations • ICCV 2019 • Li Jiang, Hengshuang Zhao, Shu Liu, Xiaoyong Shen, Chi-Wing Fu, Jiaya Jia
To incorporate point features in the edge branch, we establish a hierarchical graph framework, where the graph is initialized from a coarse layer and gradually enriched along the point decoding process.
Ranked #25 on
Semantic Segmentation
on S3DIS Area5
no code implementations • 27 Jun 2019 • Zhuotao Tian, Hengshuang Zhao, Michelle Shu, Jiaze Wang, Ruiyu Li, Xiaoyong Shen, Jiaya Jia
Albeit intensively studied, false prediction and unclear boundaries are still major issues of salient object detection.
1 code implementation • CVPR 2019 • Hengshuang Zhao, Li Jiang, Chi-Wing Fu, Jiaya Jia
Unlike previous work, we densely connect each point with every other point in a local neighborhood, aiming to specify the feature of each point based on the local region characteristics so as to better represent the region.
Ranked #17 on
Semantic Segmentation
on S3DIS Area5
(oAcc metric)
1 code implementation • CVPR 2019 • Yuwen Xiong, Renjie Liao, Hengshuang Zhao, Rui Hu, Min Bai, Ersin Yumer, Raquel Urtasun
More importantly, we introduce a parameter-free panoptic head which solves the panoptic segmentation via pixel-wise classification.
Ranked #3 on
Panoptic Segmentation
on KITTI Panoptic Segmentation
4 code implementations • ECCV 2018 • Hengshuang Zhao, Yi Zhang, Shu Liu, Jianping Shi, Chen Change Loy, Dahua Lin, Jiaya Jia
We notice information flow in convolutional neural networks is restricted inside local neighborhood regions due to the physical design of convolutional filters, which limits the overall understanding of complex scenes.
Ranked #45 on
Semantic Segmentation
on Cityscapes test
no code implementations • ECCV 2018 • Hengshuang Zhao, Xiaohui Shen, Zhe Lin, Kalyan Sunkavalli, Brian Price, Jiaya Jia
We present a new image search technique that, given a background image, returns compatible foreground objects for image compositing tasks.
no code implementations • ECCV 2018 • Guorun Yang, Hengshuang Zhao, Jianping Shi, Zhidong Deng, Jiaya Jia
Disparity estimation for binocular stereo images finds a wide range of applications.
Ranked #5 on
Semantic Segmentation
on KITTI Semantic Segmentation
no code implementations • 28 Apr 2017 • Xiaoyong Shen, RuiXing Wang, Hengshuang Zhao, Jiaya Jia
A spatial-temporal refinement network is developed to further refine the segmentation errors in each frame and ensure temporal coherence in the segmentation map.
15 code implementations • ECCV 2018 • Hengshuang Zhao, Xiaojuan Qi, Xiaoyong Shen, Jianping Shi, Jiaya Jia
We focus on the challenging task of real-time semantic segmentation in this paper.
Ranked #10 on
Dichotomous Image Segmentation
on DIS-VD
Dichotomous Image Segmentation
Real-Time Semantic Segmentation
61 code implementations • CVPR 2017 • Hengshuang Zhao, Jianping Shi, Xiaojuan Qi, Xiaogang Wang, Jiaya Jia
Scene parsing is challenging for unrestricted open vocabulary and diverse scenes.