1 code implementation • 8 Jan 2024 • Chenhongyi Yang, Tianwei Lin, Lichao Huang, Elliot J. Crowley
We present WidthFormer, a novel transformer-based module to compute Bird's-Eye-View (BEV) representations from multi-view cameras for real-time autonomous-driving applications.
1 code implementation • 15 Dec 2023 • Longzhong Lin, Xuewu Lin, Tianwei Lin, Lichao Huang, Rong Xiong, Yue Wang
Motion prediction is a crucial task in autonomous driving, and one of its major challenges lies in the multimodality of future behaviors.
1 code implementation • 20 Nov 2023 • Xuewu Lin, Zixiang Pei, Tianwei Lin, Lichao Huang, Zhizhong Su
We introduce two auxiliary training tasks (Temporal Instance Denoising and Quality Estimation) and propose decoupled attention to make structural improvements, leading to significant enhancements in detection performance.
1 code implementation • 23 May 2023 • Xuewu Lin, Tianwei Lin, Zixiang Pei, Lichao Huang, Zhizhong Su
Firstly, it reduces the computational complexity of temporal fusion from $O(T)$ to $O(1)$, resulting in significant improvements in inference speed and memory usage.
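The O(T) to O(1) reduction comes from propagating a compact temporal state from frame to frame instead of re-aggregating features from the last T frames at every step. The PyTorch sketch below illustrates that recurrent-fusion idea under simplifying assumptions; the module and argument names are hypothetical and do not reflect the authors' implementation.

```python
import torch
import torch.nn as nn

class RecurrentTemporalFusion(nn.Module):
    """Minimal sketch: fuse the current frame's query features with a memory
    propagated from the previous frame, so per-frame cost is O(1) in the
    sequence length rather than O(T)."""
    def __init__(self, dim=256, num_heads=8):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, curr_queries, memory=None):
        # curr_queries: (B, N, C) instance/query features of the current frame.
        # memory: (B, M, C) features carried over from the previous frame, or None.
        if memory is None:
            fused = curr_queries
        else:
            attn_out, _ = self.cross_attn(curr_queries, memory, memory)
            fused = self.norm(curr_queries + attn_out)
        # The fused features become the memory for the next frame; detaching
        # treats the memory as a constant input at the next step (a common
        # simplification in this sketch).
        return fused, fused.detach()
```

Called once per frame, the returned memory is passed back in at the next time step, so inference time and memory stay constant as the sequence grows.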
1 code implementation • CVPR 2024 • Chenhongyi Yang, Lichao Huang, Elliot J. Crowley
To overcome this challenge, we introduce Plug and Play Active Learning (PPAL), a simple and effective AL strategy for object detection.
1 code implementation • 19 Nov 2022 • Xuewu Lin, Tianwei Lin, Zixiang Pei, Lichao Huang, Zhizhong Su
Bird's-eye-view (BEV) based methods have recently made great progress on the multi-view 3D detection task.
Ranked #10 on Robust Camera Only 3D Object Detection on nuScenes-C
1 code implementation • 26 Nov 2021 • Chenhongyi Yang, Lichao Huang, Elliot J. Crowley
The goal of contrastive learning based pre-training is to leverage large quantities of unlabeled data to produce a model that can be readily adapted downstream.
1 code implementation • 25 Mar 2021 • Xinggang Wang, Zhaojin Huang, Bencheng Liao, Lichao Huang, Yongchao Gong, Chang Huang
Based on deep networks, video object detection is actively studied for pushing the limits of detection speed and accuracy.
1 code implementation • ECCV 2020 • Tianheng Cheng, Xinggang Wang, Lichao Huang, Wenyu Liu
Moreover, it is not surprising that BMask R-CNN yields a more pronounced improvement when the evaluation criterion requires more precise localization (e.g., AP$_{75}$), as shown in Fig. 1.
3 code implementations • CVPR 2020 • Yiqun Mei, Yuchen Fan, Yuqian Zhou, Lichao Huang, Thomas S. Huang, Humphrey Shi
By combining the new CS-NL prior with local and in-scale non-local priors in a powerful recurrent fusion cell, we can find more cross-scale feature correlations within a single low-resolution (LR) image.
Ranked #10 on Image Super-Resolution on Manga109 - 3x upscaling
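The cross-scale non-local (CS-NL) prior matches features of the low-resolution image against a downscaled copy of itself, so that recurring structures can be borrowed at a larger scale. The following PyTorch sketch shows one simplified form of cross-scale attention under that assumption (queries from the LR map, keys from a bilinearly downscaled copy, values taken as full-resolution patches); it is an illustration only, not the paper's CS-NL module, and the names are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossScaleAttention(nn.Module):
    """Minimal sketch of a cross-scale non-local prior: each position of the
    LR feature map attends over a downscaled copy of itself and copies back
    the matching full-resolution s x s patch, producing an s-times larger
    output. Assumes H and W are divisible by the scale factor."""
    def __init__(self, channels, scale=2, reduction=2):
        super().__init__()
        self.scale = scale
        self.query = nn.Conv2d(channels, channels // reduction, 1)
        self.key = nn.Conv2d(channels, channels // reduction, 1)

    def forward(self, x):
        b, c, h, w = x.shape
        s = self.scale
        # One key per s x s region, obtained from a bilinearly downscaled copy.
        x_down = F.interpolate(x, scale_factor=1.0 / s, mode="bilinear",
                               align_corners=False)            # (b, c, h/s, w/s)
        q = self.query(x).flatten(2).transpose(1, 2)           # (b, h*w, c')
        k = self.key(x_down).flatten(2)                        # (b, c', (h/s)*(w/s))
        attn = F.softmax(q @ k / k.shape[1] ** 0.5, dim=-1)    # (b, h*w, (h/s)*(w/s))
        # Values are the full-resolution s x s patches each key summarizes.
        v = F.unfold(x, kernel_size=s, stride=s)               # (b, c*s*s, (h/s)*(w/s))
        out = attn @ v.transpose(1, 2)                         # (b, h*w, c*s*s)
        # Paste every position's aggregated patch into an s-times larger canvas.
        return F.fold(out.transpose(1, 2), output_size=(h * s, w * s),
                      kernel_size=s, stride=s)                 # (b, c, h*s, w*s)
```

In the paper this cross-scale branch is combined with local and in-scale non-local priors inside a recurrent fusion cell; the sketch covers only the cross-scale matching step.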
no code implementations • 4 Mar 2020 • Tao Hu, Lichao Huang, Han Shen
Recent works in multiple object tracking use sequence models to compute the similarity score between detections and previous tracklets.
no code implementations • 13 Dec 2019 • Haojie Liu, Han Shen, Lichao Huang, Ming Lu, Tong Chen, Zhan Ma
Traditional video compression technologies have been developed over decades in pursuit of higher coding efficiency.
1 code implementation • 11 Dec 2019 • Shaoru Wang, Yongchao Gong, Junliang Xing, Lichao Huang, Chang Huang, Weiming Hu
To make these two tasks benefit each other, we design a two-stream structure that jointly learns features at the object level (i.e., bounding boxes) and the pixel level (i.e., instance masks).
Ranked #94 on Instance Segmentation on COCO test-dev
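A common way to realize such a two-stream design is to correlate per-object embeddings with a per-pixel embedding map, so that box-level and mask-level representations inform each other. The sketch below illustrates that pattern in PyTorch; it is a simplified assumption with hypothetical names, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class TwoStreamHead(nn.Module):
    """Minimal sketch of a two-stream design: an object stream yields one
    embedding per detected box, a pixel stream yields a per-pixel embedding
    map, and correlating the two produces instance masks, so box-level and
    pixel-level features can reinforce each other."""
    def __init__(self, in_channels=256, embed_dim=32):
        super().__init__()
        self.pixel_stream = nn.Sequential(
            nn.Conv2d(in_channels, in_channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(in_channels, embed_dim, 1),
        )
        self.object_stream = nn.Linear(in_channels, embed_dim)

    def forward(self, feat_map, obj_feats):
        # feat_map: (1, C, H, W) shared feature map for one image;
        # obj_feats: (N, C) pooled object-level features, one per detected box.
        pixel_embed = self.pixel_stream(feat_map)              # (1, D, H, W)
        obj_embed = self.object_stream(obj_feats)              # (N, D)
        # Correlate each object embedding with every pixel embedding.
        mask_logits = torch.einsum("nd,bdhw->nbhw", obj_embed, pixel_embed)
        return mask_logits.squeeze(1)                          # (N, H, W)
```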
1 code implementation • 10 Dec 2019 • Yonglin Tian, Lichao Huang, Xuesong Li, Kunfeng Wang, Zilei Wang, Fei-Yue Wang
The varying density of point clouds increases the difficulty of 3D detection.
1 code implementation • 2 Aug 2019 • Tao Hu, Lichao Huang, Xian-Ming Liu, Han Shen
Our tracker achieves leading performance on OTB2013, OTB2015, VOT2015, VOT2016, and LaSOT, and runs in real time at 26 FPS, which shows that our method is both effective and practical.
no code implementations • 11 Jul 2019 • Hao Luo, Lichao Huang, Han Shen, Yuan Li, Chang Huang, Xinggang Wang
Without any bells and whistles, our method obtains 80.3% mAP on the ImageNet VID dataset, surpassing the previous state-of-the-art methods.
1 code implementation • 2 Jul 2019 • Qiang Zhou, Zilong Huang, Lichao Huang, Yongchao Gong, Han Shen, Chang Huang, Wenyu Liu, Xinggang Wang
Video object segmentation (VOS) aims at pixel-level object tracking given only the annotations in the first frame.
Ranked #1 on Visual Object Tracking on YouTube-VOS 2018 (Jaccard (Seen) metric)
3 code implementations • CVPR 2019 • Zhaojin Huang, Lichao Huang, Yongchao Gong, Chang Huang, Xinggang Wang
In this paper, we study this problem and propose Mask Scoring R-CNN which contains a network block to learn the quality of the predicted instance masks.
Ranked #75 on Instance Segmentation on COCO minival
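In essence, the quality block is a small head that regresses the IoU between each predicted mask and its ground truth, and the final mask score is the classification score multiplied by this predicted MaskIoU. Below is a simplified PyTorch sketch of such a head; the layer sizes and names are illustrative rather than the released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskIoUHead(nn.Module):
    """Minimal sketch of a mask-quality block: predict the IoU between each
    predicted mask and its ground truth from the RoI features concatenated
    with the (downsampled) predicted mask."""
    def __init__(self, in_channels=256, num_classes=80):
        super().__init__()
        self.convs = nn.Sequential(
            nn.Conv2d(in_channels + 1, 256, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(256, 256, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(256, 256, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        self.fc = nn.Sequential(
            nn.Flatten(),
            nn.Linear(256 * 7 * 7, 1024), nn.ReLU(inplace=True),
            nn.Linear(1024, num_classes),   # one predicted MaskIoU per class
        )

    def forward(self, roi_feats, mask_logits):
        # roi_feats: (N, C, 14, 14) RoI features; mask_logits: (N, 1, 28, 28).
        mask_small = F.max_pool2d(mask_logits.sigmoid(), kernel_size=2)  # (N, 1, 14, 14)
        return self.fc(self.convs(torch.cat([roi_feats, mask_small], dim=1)))

def rescore(cls_scores, pred_ious, labels):
    """Final mask score = classification score * predicted MaskIoU of the
    predicted class (the head is trained by regressing the true mask IoU)."""
    return cls_scores * pred_ious.gather(1, labels[:, None]).squeeze(1).clamp(0, 1)
```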
4 code implementations • ICCV 2019 • Zilong Huang, Xinggang Wang, Yunchao Wei, Lichao Huang, Humphrey Shi, Wenyu Liu, Thomas S. Huang
Compared with the non-local block, the proposed recurrent criss-cross attention module requires 11x less GPU memory.
Ranked #7 on Semantic Segmentation on FoodSeg103 (using extra training data)
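The memory saving comes from letting each position attend only to the H + W - 1 positions in its own row and column, with full-image context recovered by applying the module recurrently (typically twice). The PyTorch sketch below shows a simplified criss-cross attention step; it omits details of the released CCNet code (e.g., masking the duplicated center position) and uses hypothetical names.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CrissCrossAttention(nn.Module):
    """Minimal sketch: each position attends only to positions in its own row
    and column instead of all H * W positions, which is where the memory
    saving over a full non-local block comes from."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // reduction, 1)
        self.key = nn.Conv2d(channels, channels // reduction, 1)
        self.value = nn.Conv2d(channels, channels, 1)
        self.gamma = nn.Parameter(torch.zeros(1))

    def forward(self, x):
        b, c, h, w = x.shape
        q, k, v = self.query(x), self.key(x), self.value(x)

        # Row affinities: for each pixel, similarity to all pixels in its row.
        energy_row = torch.matmul(q.permute(0, 2, 3, 1),      # (b, h, w, c')
                                  k.permute(0, 2, 1, 3))      # (b, h, c', w) -> (b, h, w, w)
        # Column affinities: for each pixel, similarity to all pixels in its column.
        energy_col = torch.matmul(q.permute(0, 3, 2, 1),      # (b, w, h, c')
                                  k.permute(0, 3, 1, 2))      # (b, w, c', h) -> (b, w, h, h)

        # Normalize jointly over each pixel's row + column candidates.
        attn = F.softmax(torch.cat([energy_row,
                                    energy_col.permute(0, 2, 1, 3)], dim=-1), dim=-1)
        attn_row, attn_col = attn.split([w, h], dim=-1)       # (b,h,w,w), (b,h,w,h)

        out_row = torch.matmul(attn_row, v.permute(0, 2, 3, 1))            # (b, h, w, c)
        out_col = torch.matmul(attn_col.permute(0, 2, 1, 3),               # (b, w, h, h)
                               v.permute(0, 3, 2, 1))                      # (b, w, h, c)
        out = out_row.permute(0, 3, 1, 2) + out_col.permute(0, 3, 2, 1)    # (b, c, h, w)
        # Applying the module twice (recurrently) lets every pixel gather
        # context from the whole image while keeping the attention maps small.
        return self.gamma * out + x
```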
no code implementations • 5 Aug 2018 • Han Shen, Lichao Huang, Chang Huang, Wei Xu
Separating the task in this way requires a hand-crafted training objective for the affinity learning stage and a hand-crafted cost function for the data association stage, which prevents the tracking objective from being learned directly from the features.
4 code implementations • 17 Oct 2016 • Yiyi Liao, Lichao Huang, Yue Wang, Sarath Kodagoda, Yinan Yu, Yong Liu
Many standard robotic platforms are equipped with at least a fixed 2D laser range finder and a monocular camera.
2 code implementations • 16 Sep 2015 • Lichao Huang, Yi Yang, Yafeng Deng, Yinan Yu
How can a single fully convolutional neural network (FCN) perform on object detection?
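One way a single FCN can perform detection, in the spirit of this line of work, is to predict at every output location an objectness score together with the distances to the four box edges, and then decode and filter the resulting boxes with NMS. The sketch below illustrates that dense per-pixel formulation in PyTorch; it is a hedged illustration with hypothetical names, not the paper's exact network.

```python
import torch
import torch.nn as nn

class DensePerPixelDetector(nn.Module):
    """Minimal sketch of fully convolutional detection: every output location
    predicts an objectness score and four distances to the box edges
    (left, top, right, bottom)."""
    def __init__(self, in_channels=3, feat_channels=64):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(in_channels, feat_channels, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat_channels, feat_channels, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )  # overall stride of 4
        self.score_head = nn.Conv2d(feat_channels, 1, 1)   # objectness per location
        self.box_head = nn.Conv2d(feat_channels, 4, 1)     # distances to box edges

    def forward(self, images):
        feats = self.backbone(images)
        scores = self.score_head(feats).sigmoid()          # (B, 1, H/4, W/4)
        dists = self.box_head(feats).relu()                # (B, 4, H/4, W/4)
        return scores, dists

def decode_boxes(scores, dists, stride=4, thresh=0.5):
    """Turn per-pixel predictions for the first image into (x1, y1, x2, y2)
    candidates plus scores; in practice these would then be filtered with NMS."""
    _, _, h, w = scores.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    cx, cy = (xs + 0.5) * stride, (ys + 0.5) * stride      # centers in image coords
    l, t, r, btm = dists[0].unbind(0)
    boxes = torch.stack([cx - l, cy - t, cx + r, cy + btm], dim=-1)
    keep = scores[0, 0] > thresh
    return boxes[keep], scores[0, 0][keep]
```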
no code implementations • CVPR 2013 • Xiaolong Wang, Liang Lin, Lichao Huang, Shuicheng Yan
This paper proposes a reconfigurable model to recognize and detect multiclass (or multiview) objects with large variation in appearance.