no code implementations • 8 Aug 2022 • Hannan Lu, Zhi Tian, Lirong Yang, Haibing Ren, WangMeng Zuo
The compact instance stream effectively improves the segmentation accuracy of the unseen pixels, while fusing two streams with the adaptive routing map leads to an overall performance boost.
1 code implementation • 19 Jul 2022 • Yusheng Zhao, Jinyu Chen, Chen Gao, Wenguan Wang, Lirong Yang, Haibing Ren, Huaxia Xia, Si Liu
Vision-language navigation is the task of directing an embodied agent to navigate in 3D scenes with natural language instructions.
no code implementations • 29 Apr 2022 • Chang Shu, Ziming Chen, Lei Chen, Kuan Ma, Minghui Wang, Haibing Ren
To the best of our knowledge, this is the first work to show that transformer-based networks can attain state-of-the-art performance in real time in single-image depth estimation.
1 code implementation • CVPR 2022 • Junyu Luo, Jiahui Fu, Xianghao Kong, Chen Gao, Haibing Ren, Hao Shen, Huaxia Xia, Si Liu
3D visual grounding aims to locate the referred target object in 3D point cloud scenes according to a free-form language description.
2 code implementations • 30 Mar 2022 • Chengjian Feng, Yujie Zhong, Zequn Jie, Xiangxiang Chu, Haibing Ren, Xiaolin Wei, Weidi Xie, Lin Ma
The goal of this work is to establish a scalable pipeline for expanding an object detector towards novel/unseen categories, using zero manual annotations.
8 code implementations • NeurIPS 2021 • Xiangxiang Chu, Zhi Tian, Yuqing Wang, Bo Zhang, Haibing Ren, Xiaolin Wei, Huaxia Xia, Chunhua Shen
Very recently, a variety of vision transformer architectures for dense prediction tasks have been proposed and they show that the design of spatial attention is critical to their success in these tasks.
Ranked #46 on Semantic Segmentation on ADE20K val
1 code implementation • CVPR 2021 • Haochen Wang, XiaoLong Jiang, Haibing Ren, Yao Hu, Song Bai
In this work we present SwiftNet for real-time semi-supervised video object segmentation (one-shot VOS), which reports 77.8% J&F and 70 FPS on the DAVIS 2017 validation dataset, leading all present solutions in overall accuracy and speed performance.
Semantic Segmentation
Semi-Supervised Video Object Segmentation