no code implementations • 19 May 2023 • Yongsheng Yu, Hao Wang, Tiejian Luo, Heng Fan, Libo Zhang
In this paper, we propose a novel, simple yet effective method for Multi-modal Guided Image Completion, dubbed MaGIC, which not only supports a wide range of single modalities as guidance (e.g., text, canny edge, sketch, segmentation, reference image, depth, and pose), but also adapts to arbitrarily customized combinations of these modalities (i.e., arbitrary multi-modality) for image completion.
no code implementations • 22 Apr 2023 • Bohai Gu, Heng Fan, Libo Zhang
Current arbitrary style transfer models are limited to either image or video domains.
no code implementations • 20 Mar 2023 • Zhenyu Li, Zhipeng Zhang, Heng Fan, Yuan He, Ke Wang, Xianming Liu, Junjun Jiang
In this paper, we tackle the challenging monocular 3D object detection problem with a general semi-supervised framework.
1 code implementation • 19 Mar 2023 • Srikar Yellapragada, Zhenghong Li, Kevin Bhadresh Doshi, Purva Makarand Mhasakar, Heng Fan, Jie Wei, Erik Blasch, Haibin Ling
In this paper, we present a meticulously crafted and annotated benchmark, called CCTV-Gun, which addresses the challenges of detecting handguns in real-world CCTV images.
no code implementations • 14 Mar 2023 • Xinran Liu, Xiaoqiong Liu, Ziruo Yi, Xin Zhou, Thanh Le, Libo Zhang, Yan Huang, Qing Yang, Heng Fan
In addition, we further derive a variant named PlanarTrack$_{\mathbf{BB}}$ for generic object tracking from PlanarTrack.
no code implementations • 1 Jan 2023 • Libo Zhang, Wenzhang Zhou, Heng Fan, Tiejian Luo, Haibin Ling
To reduce the discrepancy in feature distributions between two domains, recent approaches achieve domain adaptation through feature alignment at different granularities via adversarial learning.
3 code implementations • 19 Nov 2022 • Libo Zhang, Lutao Jiang, Ruyi Ji, Heng Fan
Automatic security inspection relying on computer vision technology is a challenging task in real-world scenarios due to many factors, such as intra-class variance, class imbalance, and occlusion.
no code implementations • 25 Aug 2022 • Yongsheng Yu, Libo Zhang, Heng Fan, Tiejian Luo
Addressing this problem, in this paper, we devise a novel GAN inversion model for image inpainting, dubbed InvertFill, mainly consisting of an encoder with a pre-modulation module and a GAN generator with F&W+ latent space.
1 code implementation • 3 Jul 2022 • Mingzhe Guo, Zhipeng Zhang, Heng Fan, Liping Jing
By revealing the potential of VL representation, we expect the community to divert more attention to VL tracking and hope to open more possibilities for future tracking beyond Transformer.
no code implementations • 30 Apr 2022 • Libo Zhang, Junyuan Gao, Zhen Xiao, Heng Fan
Multi-animal tracking (MAT), a multi-object tracking (MOT) problem, is crucial for animal motion and behavior analysis and has many important applications in areas such as biology, ecology, and animal conservation.
no code implementations • 7 Jan 2022 • Mingzhe Guo, Zhipeng Zhang, Heng Fan, Liping Jing, Yilin Lyu, Bing Li, Weiming Hu
The proposed GIM module and InBN mechanism are general and applicable to different backbone types including CNN and Transformer for improvements, as evidenced by our extensive experiments on multiple benchmarks.
1 code implementation • 2 Dec 2021 • Liting Lin, Heng Fan, Zhipeng Zhang, Yong Xu, Haibin Ling
The potential of Transformer in representation learning remains under-explored.
Ranked #6 on Visual Object Tracking on TrackingNet
no code implementations • 19 Oct 2021 • Heng Fan, Jiaxiang Ren, Jie Yang, Yi-Xian Qin, Haibin Ling
The aim of this study was to investigate whether a deep convolutional neural network (CNN) with an attention module can detect osteoporosis on panoramic radiographs.
1 code implementation • 19 Jul 2021 • Dawei Du, Longyin Wen, Pengfei Zhu, Heng Fan, Qinghua Hu, Haibin Ling, Mubarak Shah, Junwen Pan, Ali Al-Ali, Amr Mohamed, Bakour Imene, Bin Dong, Binyu Zhang, Bouchali Hadia Nesma, Chenfeng Xu, Chenzhen Duan, Ciro Castiello, Corrado Mencar, Dingkang Liang, Florian Krüger, Gennaro Vessio, Giovanna Castellano, Jieru Wang, Junyu Gao, Khalid Abualsaud, Laihui Ding, Lei Zhao, Marco Cianciotta, Muhammad Saqib, Noor Almaadeed, Omar Elharrouss, Pei Lyu, Qi Wang, Shidong Liu, Shuang Qiu, Siyang Pan, Somaya Al-Maadeed, Sultan Daud Khan, Tamer Khattab, Tao Han, Thomas Golda, Wei Xu, Xiang Bai, Xiaoqing Xu, Xuelong Li, Yanyun Zhao, Ye Tian, Yingnan Lin, Yongchao Xu, Yuehan Yao, Zhenyu Xu, Zhijian Zhao, Zhipeng Luo, Zhiwei Wei, Zhiyuan Zhao
Crowd counting on the drone platform is an interesting topic in computer vision, which brings new challenges such as small object inference, background clutter and wide viewpoint.
no code implementations • 25 Nov 2020 • Heng Fan, Haibin Ling
The key is to bridge box regression and classification via an alignment step, which leads to more accurate features for proposal classification with improved robustness.
no code implementations • ICCV 2021 • Heng Fan, Halady Akhilesha Miththanthaya, Harshit, Siranjiv Ramana Rajan, Xiaoqiong Liu, Zhilin Zou, Yuewei Lin, Haibin Ling
To the best of our knowledge, TOTB is the first benchmark dedicated to transparent object tracking.
1 code implementation • 8 Sep 2020 • Heng Fan, Hexin Bai, Liting Lin, Fan Yang, Peng Chu, Ge Deng, Sijia Yu, Harshit, Mingzhen Huang, Juehuan Liu, Yong Xu, Chunyuan Liao, Lin Yuan, Haibin Ling
The average video length of LaSOT is around 2,500 frames, where each video contains various challenge factors that exist in real-world video footage, such as the targets disappearing and re-appearing.
2 code implementations • 16 Jan 2020 • Pengfei Zhu, Longyin Wen, Dawei Du, Xiao Bian, Heng Fan, Qinghua Hu, Haibin Ling
We provide a large-scale drone-captured dataset, VisDrone, which includes four tracks, i.e., (1) image object detection, (2) video object detection, (3) single object tracking, and (4) multi-object tracking.
no code implementations • 15 Dec 2019 • Jianqing Jia, Semir Elezovikj, Heng Fan, Shuojin Yang, Jing Liu, Wei Guo, Chiu C. Tan, Haibin Ling
Our solution encodes the constraints for placing labels in an optimization problem to obtain the final label layout, and the labels will be placed in appropriate positions to reduce the chances of overlaying important real-world objects in street view AR scenarios.
no code implementations • 18 Nov 2019 • Heng Fan, Fan Yang, Peng Chu, Lin Yuan, Haibin Ling
For the analysis component, given the tracking results on all sequences, it investigates the behavior of the tracker under each individual factor and generates the report automatically.
1 code implementation • 25 Oct 2019 • Liu Ying, Heng Fan, Fuchuan Ni, Jinhai Xiang
In addition, to further improve the transfer accuracy of generated images, an attribute adversarial classifier (referred to as Atta-cls) is introduced to guide the generator from the perspective of attribute through learning the defects of attribute transfer images.
no code implementations • 7 Sep 2019 • Zhen-Biao Yang, Pei-Rong Han, Xin-Jie Huang, Wen Ning, Hekang Li, Kai Xu, Dongning Zheng, Heng Fan, Shi-Biao Zheng
The no-cloning theorem forbids perfect cloning of an unknown quantum state.
Quantum Physics
1 code implementation • ICCV 2019 • Fan Yang, Heng Fan, Peng Chu, Erik Blasch, Haibin Ling
The key components in ClusDet include a cluster proposal sub-network (CPNet), a scale estimation sub-network (ScaleNet), and a dedicated detection network (DetecNet).
no code implementations • 21 Feb 2019 • Peng Chu, Heng Fan, Chiu C. Tan, Haibin Ling
To address this issue, in this paper we propose an instance-aware tracker to integrate SOT techniques for MOT by encoding awareness both within and between target models.
no code implementations • CVPR 2019 • Heng Fan, Haibin Ling
C-RPN is trained end-to-end with the multi-task loss function.
no code implementations • 9 Nov 2018 • Heng Fan, Peng Chu, Longin Jan Latecki, Haibin Ling
Recurrent neural networks (RNNs) have shown the ability to improve scene parsing through capturing long-range dependencies among image units.
1 code implementation • CVPR 2019 • Heng Fan, Liting Lin, Fan Yang, Peng Chu, Ge Deng, Sijia Yu, Hexin Bai, Yong Xu, Chunyuan Liao, Haibin Ling
In this paper, we present LaSOT, a high-quality benchmark for Large-scale Single Object Tracking.
no code implementations • 15 May 2018 • Qin Zhou, Heng Fan, Hua Yang, Hang Su, Shibao Zheng, Shuang Wu, Haibin Ling
To address this problem, in this paper, we present a robust and efficient graph correspondence transfer (REGCT) approach for explicit spatial alignment in Re-ID.
no code implementations • 1 Apr 2018 • Qin Zhou, Heng Fan, Shibao Zheng, Hang Su, Xinzhe Li, Shuang Wu, Haibin Ling
In this paper, we propose a graph correspondence transfer (GCT) approach for person re-identification.
no code implementations • 22 Mar 2018 • Zhigang Chang, Qin Zhou, Heng Fan, Hang Su, Hua Yang, Shibao Zheng, Haibin Ling
Meanwhile, a weighting scheme is applied on the bilinear coding to adaptively adjust the weights of local features at different locations based on their importance in recognition, further improving the discriminability of feature aggregation.
no code implementations • 30 Jan 2018 • Heng Fan, Haibin Ling
Being intensively studied, visual object tracking has witnessed great advances in either speed (e.g., with correlation filters) or accuracy (e.g., with deep features).
no code implementations • 21 Jan 2018 • Heng Fan, Haibin Ling
Recently, recurrent neural networks (RNNs) have demonstrated the ability to improve scene labeling through capturing long-range dependencies among image units.
1 code implementation • 8 Dec 2017 • Jun Wang, Zhao-Yu Han, Song-Bo Wang, Zeyang Li, Liang-Zhu Mu, Heng Fan, Lei Wang
We propose a quantum tomography scheme for pure qudit systems which adopts random base measurements and generative learning methods, along with a built-in fidelity estimation approach to assess the reliability of the tomographic states.
Quantum Physics
1 code implementation • 6 Sep 2017 • Zhao-Yu Han, Jun Wang, Heng Fan, Lei Wang, Pan Zhang
Generative modeling, which learns a joint probability distribution from data and generates samples according to it, is an important task in machine learning and artificial intelligence.
no code implementations • ICCV 2017 • Heng Fan, Haibin Ling
In this paper we study the problem from a new perspective and present a novel parallel tracking and verifying (PTAV) framework, by taking advantage of the ubiquity of multi-thread techniques and borrowing from the success of parallel tracking and mapping in visual SLAM.
no code implementations • 21 Nov 2016 • Heng Fan, Haibin Ling
Convolutional neural networks (CNNs) have drawn increasing interest in visual tracking owing to their power in feature extraction.
no code implementations • 8 Jul 2016 • Heng Fan, Xue Mei, Danil Prokhorov, Haibin Ling
Context in an image is crucial for scene labeling, yet existing methods exploit only local context generated from a small surrounding area of an image patch or a pixel; long-range and global contextual information is ignored.