no code implementations • 24 Sep 2024 • Mingxing Peng, Kehua Chen, Xusen Guo, Qiming Zhang, Hongliang Lu, Hui Zhong, Di Chen, Meixin Zhu, Hai Yang
Through this structured overview, we aim to provide researchers with a comprehensive understanding of diffusion models for ITS, thereby advancing their future applications in the transportation domain.
1 code implementation • 7 Sep 2024 • Mingjin Zhang, Chi Zhang, Qiming Zhang, Yunsong Li, Xinbo Gao, Jing Zhang
Recent advances in deep learning have greatly improved infrared small target detection (IRSTD).
1 code implementation • 16 May 2024 • Wentao Jiang, Jing Zhang, Di Wang, Qiming Zhang, Zengmao Wang, Bo Du
Experimental results in classification and dense prediction tasks show that LeMeViT has a significant $1.7\times$ speedup, fewer parameters, and competitive performance compared to the baseline models, and achieves a better trade-off between efficiency and performance.
1 code implementation • 3 Apr 2024 • Xusen Guo, Qiming Zhang, Junyue Jiang, Mingxing Peng, Meixin Zhu, Hao Yang
Achieving both accuracy and explainability in traffic prediction models remains a challenge due to the complexity of traffic data and the inherent opacity of deep learning models.
1 code implementation • 23 Jan 2024 • Lincan Li, Wei Shao, Wei Dong, Yijun Tian, Qiming Zhang, Kaixiang Yang, Wenjie Zhang
There has been a huge bottleneck regarding the upper bound of autonomous driving algorithm performance; a consensus in academia and industry holds that the key to surmounting the bottleneck lies in data-centric autonomous driving technology.
1 code implementation • ICCV 2023 • Mingjin Zhang, Chi Zhang, Qiming Zhang, Jie Guo, Xinbo Gao, Jing Zhang
Single hyperspectral image super-resolution (single-HSI-SR) aims to restore a high-resolution hyperspectral image from a low-resolution observation.
1 code implementation • 3 May 2023 • Tao Chen, Liang Lv, Di Wang, Jing Zhang, Yue Yang, Zeyang Zhao, Chen Wang, Xiaowei Guo, Hao Chen, Qingye Wang, Yufei Xu, Qiming Zhang, Bo Du, Liangpei Zhang, DaCheng Tao
With the world population rapidly increasing, transforming our agrifood systems to be more productive, efficient, safe, and sustainable is crucial to mitigate potential food shortages.
2 code implementations • 29 Mar 2023 • Haimei Zhao, Qiming Zhang, Shanshan Zhao, Zhe Chen, Jing Zhang, DaCheng Tao
Multi-view camera-based 3D object detection has become popular due to its low cost, but accurately inferring 3D geometry solely from camera data remains challenging and may lead to inferior performance.
1 code implementation • 27 Mar 2023 • Qiming Zhang, Jing Zhang, Yufei Xu, DaCheng Tao
Window-based attention has become a popular choice in vision transformers due to its superior performance, lower computational complexity, and smaller memory footprint.
2 code implementations • 7 Dec 2022 • Yufei Xu, Jing Zhang, Qiming Zhang, DaCheng Tao
In this paper, we show the surprisingly good properties of plain vision transformers for body pose estimation from various aspects, namely simplicity in model structure, scalability in model size, flexibility in training paradigm, and transferability of knowledge between models, through a simple baseline model dubbed ViTPose.
Ranked #1 on Animal Pose Estimation on AP-10K (using extra training data)
no code implementations • 24 Nov 2022 • Benjamin Kiefer, Matej Kristan, Janez Perš, Lojze Žust, Fabio Poiesi, Fabio Augusto de Alcantara Andrade, Alexandre Bernardino, Matthew Dawkins, Jenni Raitoharju, Yitong Quan, Adem Atmaca, Timon Höfer, Qiming Zhang, Yufei Xu, Jing Zhang, DaCheng Tao, Lars Sommer, Raphael Spraul, Hangyue Zhao, Hongpu Zhang, Yanyun Zhao, Jan Lukas Augustin, Eui-ik Jeon, Impyeong Lee, Luca Zedda, Andrea Loddo, Cecilia Di Ruberto, Sagar Verma, Siddharth Gupta, Shishir Muralidhara, Niharika Hegde, Daitao Xing, Nikolaos Evangeliou, Anthony Tzes, Vojtěch Bartl, Jakub Špaňhel, Adam Herout, Neelanjan Bhowmik, Toby P. Breckon, Shivanand Kundargi, Tejas Anvekar, Chaitra Desai, Ramesh Ashok Tabib, Uma Mudengudi, Arpita Vats, Yang song, Delong Liu, Yonglin Li, Shuman Li, Chenhao Tan, Long Lan, Vladimir Somers, Christophe De Vleeschouwer, Alexandre Alahi, Hsiang-Wei Huang, Cheng-Yen Yang, Jenq-Neng Hwang, Pyong-Kun Kim, Kwangju Kim, Kyoungoh Lee, Shuai Jiang, Haiwen Li, Zheng Ziqiang, Tuan-Anh Vu, Hai Nguyen-Truong, Sai-Kit Yeung, Zhuang Jia, Sophia Yang, Chih-Chung Hsu, Xiu-Yu Hou, Yu-An Jhang, Simon Yang, Mau-Tsuen Yang
The 1$^{\text{st}}$ Workshop on Maritime Computer Vision (MaCVi) 2023 focused on maritime computer vision for Unmanned Aerial Vehicles (UAVs) and Unmanned Surface Vehicles (USVs), and organized several subchallenges in this domain: (i) UAV-based Maritime Object Detection, (ii) UAV-based Maritime Object Tracking, (iii) USV-based Maritime Obstacle Segmentation, and (iv) USV-based Maritime Obstacle Detection.
no code implementations • 3 Nov 2022 • Yufei Xu, Jing Zhang, Qiming Zhang, DaCheng Tao
Self-supervised pre-training of vision transformers (ViTs) via masked image modeling (MIM) has proven very effective.
2 code implementations • 8 Aug 2022 • Di Wang, Qiming Zhang, Yufei Xu, Jing Zhang, Bo Du, DaCheng Tao, Liangpei Zhang
Large-scale vision foundation models have made significant progress in visual tasks on natural images, with vision transformers being the primary choice due to their good scalability and representation ability.
Ranked #1 on Aerial Scene Classification on AID (50% as trainset)
1 code implementation • 11 Jun 2022 • Wei Li, Qiming Zhang, Jing Zhang, Zhen Huang, Xinmei Tian, DaCheng Tao
To address these issues, we establish a new high-quality dataset named RealRain-1k, consisting of $1{,}120$ high-resolution paired clean and rainy images with low- and high-density rain streaks, respectively.
6 code implementations • 26 Apr 2022 • Yufei Xu, Jing Zhang, Qiming Zhang, DaCheng Tao
In this paper, we show the surprisingly good capabilities of plain vision transformers for pose estimation from various aspects, namely simplicity in model structure, scalability in model size, flexibility in training paradigm, and transferability of knowledge between models, through a simple baseline model called ViTPose.
Ranked #1 on Pose Estimation on COCO test-dev
2 code implementations • 18 Apr 2022 • Qiming Zhang, Yufei Xu, Jing Zhang, DaCheng Tao
Attention within windows has been widely explored in vision transformers to balance performance, computational complexity, and memory footprint.
8 code implementations • 21 Feb 2022 • Qiming Zhang, Yufei Xu, Jing Zhang, DaCheng Tao
Vision transformers have shown great potential in various computer vision tasks owing to their strong capability to model long-range dependency using the self-attention mechanism.
Ranked #2 on Image Classification on ImageNet ReaL
2 code implementations • 24 Nov 2021 • Yufei Xu, Qiming Zhang, Jing Zhang, DaCheng Tao
In this paper, we make the first attempt to demonstrate the importance of both regions in cropping from a complete perspective and propose a simple yet effective pretext task called Region Contrastive Learning (RegionCL).
2 code implementations • NeurIPS 2021 • Yufei Xu, Qiming Zhang, Jing Zhang, DaCheng Tao
Nevertheless, vision transformers treat an image as a 1D sequence of visual tokens, lacking an intrinsic inductive bias (IB) in modeling local visual structures and dealing with scale variance.
Ranked #2 on Video Object Segmentation on DAVIS 2017
no code implementations • 31 Dec 2020 • Agrim Gupta, Cedric Girerd, Manideep Dunna, Qiming Zhang, Raghav Subbaraman, Tania Morimoto, Dinesh Bharadia
Contact force is a natural way for humans to interact with the physical world around us.
1 code implementation • 27 Nov 2019 • Haoyu He, Jing Zhang, Qiming Zhang, DaCheng Tao
In this paper, we propose a novel GRAph PYramid Mutual Learning (Grapy-ML) method to address the cross-dataset human parsing problem, where the annotations are at different granularities.
1 code implementation • NeurIPS 2019 • Qiming Zhang, Jing Zhang, Wei Liu, DaCheng Tao
Although there has been progress in matching the marginal distributions between two domains, the classifier favors the source domain features and makes incorrect predictions on the target domain due to category-agnostic feature alignment.
Ranked #24 on Image-to-Image Translation on SYNTHIA-to-Cityscapes
no code implementations • NIPS Workshop CDNNRIA 2018 • Huan Wang, Qiming Zhang, Yuehai Wang, Haoji Hu
Parameter pruning is a promising approach for CNN compression and acceleration by eliminating redundant model parameters with tolerable performance loss.
1 code implementation • 25 Apr 2018 • Huan Wang, Qiming Zhang, Yuehai Wang, Yu Lu, Haoji Hu
Parameter pruning is a promising approach for CNN compression and acceleration by eliminating redundant model parameters with tolerable performance degradation.
2 code implementations • 20 Sep 2017 • Huan Wang, Qiming Zhang, Yuehai Wang, Haoji Hu
Unlike existing deterministic pruning approaches, where unimportant weights are permanently eliminated, SPP introduces a pruning probability for each weight, and pruning is guided by sampling from the pruning probabilities.
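The idea of sampling a pruning mask from per-weight probabilities can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the magnitude-based update rule, the step size, and the target ratio are all assumptions chosen for illustration, and the weights are held fixed here instead of being trained between steps.

```python
import random

def sample_keep_mask(prune_prob, rng):
    """Sample a 0/1 mask: weight i is pruned (0) with probability prune_prob[i]."""
    return [0 if rng.random() < p else 1 for p in prune_prob]

def update_prune_prob(weights, prune_prob, target_ratio, step=0.05):
    """Hypothetical update rule (an assumption, not taken from the paper):
    raise the pruning probability of the smallest-magnitude weights and
    lower it for the rest, clamping every probability to [0, 1]."""
    k = int(target_ratio * len(weights))
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    smallest = set(order[:k])
    return [
        min(1.0, max(0.0, p + (step if i in smallest else -step)))
        for i, p in enumerate(prune_prob)
    ]

rng = random.Random(0)
weights = [rng.gauss(0.0, 1.0) for _ in range(16)]
prune_prob = [0.0] * len(weights)

for _ in range(30):
    prune_prob = update_prune_prob(weights, prune_prob, target_ratio=0.5)
    mask = sample_keep_mask(prune_prob, rng)
    # A masked weight is only zeroed for this step; the stored value is kept,
    # so it can return in a later step while its probability is below 1.
    masked_weights = [w * m for w, m in zip(weights, mask)]
```

In this sketch a weight is removed for good only once its probability reaches 1, which is the contrast with deterministic pruning that the description draws.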