no code implementations • 6 Oct 2023 • Weibin Liao, Xuhong LI, Qingzhong Wang, Yanwu Xu, Zhaozheng Yin, Haoyi Xiong
While pre-training on object detection tasks, such as Common Objects in Context (COCO) [1], could significantly boost the performance of cell segmentation, it still requires a massive number of finely annotated cell images [2], with bounding boxes, masks, and cell types for every cell in every image, to fine-tune the pre-trained model.
1 code implementation • 6 Oct 2023 • Song Zhang, Qingzhong Wang, Jiang Bian, Haoyi Xiong
While models derived from Vision Transformers (ViTs) have been surging phenomenally, pre-trained models cannot seamlessly adapt to arbitrary-resolution images without altering the architecture and configuration, such as resampling the positional encoding, which limits their flexibility for various vision tasks.
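The workaround the abstract mentions, resampling a learned positional encoding to a new sequence length, can be illustrated with a minimal sketch. This is not the paper's method: the function name `resample_positions`, the 1-D layout, and the use of linear interpolation are assumptions made for illustration (ViT implementations typically interpolate a 2-D grid).

```python
from typing import List


def resample_positions(table: List[List[float]], new_len: int) -> List[List[float]]:
    """Linearly interpolate a (length x dim) positional table to new_len rows.

    Hypothetical helper: endpoints of the old and new tables are aligned.
    """
    old_len = len(table)
    if old_len < 1 or new_len < 1:
        raise ValueError("both lengths must be at least 1")
    if old_len == 1 or new_len == 1:
        # Nothing to interpolate between; repeat the first row.
        return [list(table[0]) for _ in range(new_len)]
    out = []
    for i in range(new_len):
        pos = i * (old_len - 1) / (new_len - 1)  # fractional index into the old table
        lo = int(pos)
        hi = min(lo + 1, old_len - 1)
        w = pos - lo  # weight of the upper neighbour
        out.append([(1 - w) * a + w * b for a, b in zip(table[lo], table[hi])])
    return out


# Toy table: 3 positions, 2 dimensions, stretched to 5 positions.
table = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
resized = resample_positions(table, 5)
```

The resized table keeps the original first and last rows and fills the new positions with blends of their neighbours, which is why a model fine-tuned this way still needs its configuration changed for each new resolution.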
no code implementations • 3 Oct 2023 • Weibin Liao, Haoyi Xiong, Qingzhong Wang, Yan Mo, Xuhong LI, Yi Liu, Zeyu Chen, Siyu Huang, Dejing Dou
In this work, we study a novel self-supervised pre-training pipeline, namely Multi-task Self-supervised Continual Learning (MUSCLE), for multiple medical imaging tasks, such as classification and segmentation, using X-ray images collected from multiple body parts, including heads, lungs, and bones.
no code implementations • 24 Feb 2023 • Yuxuan Zhang, Qingzhong Wang, Jiang Bian, Yi Liu, Yanwu Xu, Dejing Dou, Haoyi Xiong
Due to the high similarity between MRI data and videos, we conduct extensive empirical studies on video recognition techniques for MRI classification to answer the questions: (1) can we directly use video recognition models for MRI classification, (2) which model is more appropriate for MRI, (3) are the common tricks like data augmentation in video recognition still useful for MRI classification?
1 code implementation • 5 Aug 2022 • Yongsong Huang, Qingzhong Wang, Shinichiro Omachi
To the best of our knowledge, this is the first composite degradation model proposed for radiographic images.
no code implementations • 26 Jul 2022 • Jiang Bian, Xuhong LI, Tao Wang, Qingzhong Wang, Jun Huang, Chen Liu, Jun Zhao, Feixiang Lu, Dejing Dou, Haoyi Xiong
While deep learning has been widely used for video analytics, such as video classification and action detection, dense action detection with fast-moving subjects from sports videos is still challenging.
no code implementations • 4 Jul 2022 • Xueying Zhan, Zeyu Dai, Qingzhong Wang, Qing Li, Haoyi Xiong, Dejing Dou, Antoni B. Chan
In this paper, we propose a sampling scheme, Monte-Carlo Pareto Optimization for Active Learning (POAL), which selects optimal subsets of unlabeled samples with fixed batch size from the unlabeled data pool.
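As a rough illustration of the Pareto idea behind such a sampling scheme, the sketch below extracts the non-dominated candidates from a list of score tuples. It is not the POAL algorithm: the two objectives, the candidate scores, and the function names are hypothetical, and the Monte-Carlo search over subsets is omitted.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b on every objective and strictly better on one.

    All objectives are assumed to be maximized.
    """
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: List[Tuple[float, ...]]) -> List[int]:
    """Return the indices of candidates that no other candidate dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical (objective_a, objective_b) scores for four candidate subsets.
candidates = [(0.9, 0.2), (0.6, 0.6), (0.5, 0.5), (0.2, 0.9)]
front = pareto_front(candidates)
```

Here the third candidate is dropped because the second beats it on both objectives; the remaining three represent different trade-offs, and a selection rule would still have to pick one of them.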
1 code implementation • 2 Jun 2022 • Fei Wu, Qingzhong Wang, Jian Bian, Haoyi Xiong, Ning Ding, Feixiang Lu, Jun Cheng, Dejing Dou
Finally, we discuss the challenges and unsolved problems in this area. To facilitate sports analytics, we develop a toolbox using PaddlePaddle, which supports football, basketball, table tennis, and figure skating action recognition.
no code implementations • 20 May 2022 • Qingzhong Wang, Haifang Li, Haoyi Xiong, Wen Wang, Jiang Bian, Yu Lu, Shuaiqiang Wang, Zhicong Cheng, Dejing Dou, Dawei Yin
To handle the diverse query requests from users at web scale, Baidu has made tremendous efforts in understanding users' queries, retrieving relevant content from a pool of trillions of webpages, and ranking the most relevant webpages at the top of the results.
no code implementations • 8 Apr 2022 • Jiuniu Wang, Wenjia Xu, Qingzhong Wang, Antoni B. Chan
First, we propose a distinctiveness metric, between-set CIDEr (CIDErBtw), to evaluate the distinctiveness of a caption with respect to those of similar images.
1 code implementation • 25 Mar 2022 • Xueying Zhan, Qingzhong Wang, Kuan-Hao Huang, Haoyi Xiong, Dejing Dou, Antoni B. Chan
In this work, we construct a DAL toolkit, DeepAL+, by re-implementing 19 highly-cited DAL methods.
1 code implementation • 21 Sep 2021 • Yihang Yin, Qingzhong Wang, Siyu Huang, Haoyi Xiong, Xiang Zhang
Most of the existing contrastive learning methods employ pre-defined view generation methods, e.g., node drop or edge perturbation, which usually cannot adapt to input data or preserve the original semantic structures well.
no code implementations • 2 Sep 2021 • Yongsong Huang, Zetao Jiang, Qingzhong Wang, Qi Jiang, Guoming Pang
Recently, deep learning methods have dominated image super-resolution and achieved remarkable performance on visible images; however, IR images have received less attention.
no code implementations • 20 Aug 2021 • Jiuniu Wang, Wenjia Xu, Qingzhong Wang, Antoni B. Chan
In particular, we propose a group-based memory attention (GMA) module, which stores object features that are unique among the image group (i.e., with low similarity to objects in other images).
1 code implementation • 19 Jul 2021 • Qingzhong Wang, Pengfei Zhang, Haoyi Xiong, Jian Zhao
In this paper, we develop face.evoLVe, a comprehensive library that collects and implements a wide range of popular deep learning-based methods for face recognition.
1 code implementation • 17 Jul 2020 • Siyu Huang, Haoyi Xiong, Zhi-Qi Cheng, Qingzhong Wang, Xingran Zhou, Bihan Wen, Jun Huan, Dejing Dou
Generation of high-quality person images is challenging due to the sophisticated entanglements among image factors, e.g., appearance, pose, foreground, background, local details, and global structures.
no code implementations • ECCV 2020 • Jiuniu Wang, Wenjia Xu, Qingzhong Wang, Antoni B. Chan
A wide range of image captioning models has been developed, achieving significant improvement based on popular metrics, such as BLEU, CIDEr, and SPICE.
1 code implementation • 14 May 2020 • Di Hu, Lichao Mou, Qingzhong Wang, Junyu Gao, Yuansheng Hua, Dejing Dou, Xiao Xiang Zhu
Visual crowd counting has been recently studied as a way to enable people counting in crowd scenes from images.
1 code implementation • 17 Mar 2020 • Siyu Huang, Haoyi Xiong, Tianyang Wang, Bihan Wen, Qingzhong Wang, Zeyu Chen, Jun Huan, Dejing Dou
This paper further presents a real-time feed-forward model to leverage Style Projection for arbitrary image style transfer, which includes a regularization term for matching the semantics between input contents and stylized outputs.
1 code implementation • 14 Aug 2019 • Qingzhong Wang, Antoni B. Chan
Although significant progress has been made in the field of automatic image captioning, it is still a challenging task.
1 code implementation • CVPR 2019 • Qingzhong Wang, Antoni B. Chan
We find that there is still a large gap between model and human performance in terms of both accuracy and diversity, and that the models optimized for accuracy (CIDEr) have low diversity.
1 code implementation • 30 Oct 2018 • Qingzhong Wang, Antoni B. Chan
Attention modules connecting encoders and decoders have been widely applied in the fields of object recognition, image captioning, visual question answering, and neural machine translation, and significantly improve performance.
1 code implementation • 23 May 2018 • Qingzhong Wang, Antoni B. Chan
We also test our model on the paragraph annotation dataset and obtain a higher CIDEr score than hierarchical LSTMs.