no code implementations • 21 Dec 2024 • Zijiang Yang, Zhongwei Qiu, Tiancheng Lin, Hanqing Chao, Wanxing Chang, Yelin Yang, Yunshuo Zhang, Wenpei Jiao, Yixuan Shen, wenbin liu, Dongmei Fu, Dakai Jin, Ke Yan, Le Lu, Hui Jiang, Yun Bian
Thus, it remains an open question whether deep learning models can directly and effectively analyze WSIs from the semantic aspect of cell distributions.
no code implementations • 12 Sep 2024 • Guangxuan Song, Dongmei Fu, Zhongwei Qiu, Jintao Meng, Dawei Zhang
Uncertainty estimation is crucial in scientific data for machine learning.
no code implementations • 12 Aug 2024 • Ke Zhou, Zhongwei Qiu, Dongmei Fu
In this paper, we introduce a novel Multi-scale Contrastive Adaptor learning method named MCA-SAM, which enhances adaptor performance through a meticulously designed contrastive learning framework at both token and sample levels.
no code implementations • 15 Dec 2023 • Guangxuan Song, Dongmei Fu, Zhongwei Qiu, Zijiang Yang, Jiaxin Dai, Lingwei Ma, Dawei Zhang
In this paper, we propose a numerical reasoning method for material KGs (NR-KG), which constructs a cross-modal KG using semantic nodes and numerical proxy nodes.
1 code implementation • CVPR 2024 • Zhongwei Ren, Zhicheng Huang, Yunchao Wei, Yao Zhao, Dongmei Fu, Jiashi Feng, Xiaojie Jin
PixelLM excels across various pixel-level image reasoning and understanding tasks, outperforming well-established methods in multiple benchmarks, including MUSE, single- and multi-referring segmentation.
no code implementations • 24 Sep 2023 • Zijiang Yang, Zhongwei Qiu, Chang Xu, Dongmei Fu
3D style transfer aims to generate stylized views of 3D scenes with specified styles, which requires high-quality generating and keeping multi-view consistency.
no code implementations • 29 Jun 2023 • Zhongwei Qiu, Qiansheng Yang, Jian Wang, Xiyu Wang, Chang Xu, Dongmei Fu, Kun Yao, Junyu Han, Errui Ding, Jingdong Wang
One of the mainstream schemes for 2D human pose estimation (HPE) is learning keypoints heatmaps by a neural network.
no code implementations • 22 May 2023 • Xingjian He, Sihan Chen, Fan Ma, Zhicheng Huang, Xiaojie Jin, Zikang Liu, Dongmei Fu, Yi Yang, Jing Liu, Jiashi Feng
Towards this goal, we propose a novel video-text pre-training method dubbed VLAB: Video Language pre-training by feature Adapting and Blending, which transfers CLIP representations to video pre-training tasks and develops unified video multimodal models for a wide range of video-text tasks.
Ranked #1 on
Visual Question Answering (VQA)
on MSVD-QA
(using extra training data)
no code implementations • CVPR 2023 • Zhongwei Qiu, Yang Qiansheng, Jian Wang, Haocheng Feng, Junyu Han, Errui Ding, Chang Xu, Dongmei Fu, Jingdong Wang
To handle the variances of objects as time proceeds, a novel scheme of progressive decoding is used to update pose and shape queries at each frame.
Ranked #32 on
3D Human Pose Estimation
on 3DPW
1 code implementation • 27 Dec 2022 • Zhongwei Qiu, Huan Yang, Jianlong Fu, Daochang Liu, Chang Xu, Dongmei Fu
Video Super-Resolution (VSR) aims to restore high-resolution (HR) videos from low-resolution (LR) videos.
Ranked #4 on
Video Super-Resolution
on REDS4- 4x upscaling
no code implementations • 22 Nov 2022 • Zhongwei Qiu, Kai Qiu, Jianlong Fu, Dongmei Fu
Based on MCPC, we propose a weakly-supervised pre-training (WSP) strategy to distinguish the depth relationship between two points in an image.
no code implementations • 6 Aug 2022 • Zhongwei Qiu, Qiansheng Yang, Jian Wang, Dongmei Fu
In particular, we firstly formulate video frames as a series of instance-guided tokens and each token is in charge of predicting the 3D pose of a human instance.
Ranked #11 on
3D Multi-Person Pose Estimation
on Panoptic
(using extra training data)
1 code implementation • 5 Aug 2022 • Zhongwei Qiu, Huan Yang, Jianlong Fu, Dongmei Fu
First, we divide a video frame into patches, and transform each patch into DCT spectral maps in which each channel represents a frequency band.
Ranked #5 on
Video Super-Resolution
on REDS4- 4x upscaling
1 code implementation • 27 Jul 2022 • Zhicheng Huang, Xiaojie Jin, Chengze Lu, Qibin Hou, Ming-Ming Cheng, Dongmei Fu, Xiaohui Shen, Jiashi Feng
The momentum encoder, fed with the full images, enhances the feature discriminability via contrastive learning with its online counterpart.
no code implementations • 22 Jul 2022 • Zhongwei Qiu, Qiansheng Yang, Jian Wang, Dongmei Fu
Finally, the 3D poses are decoded according to dynamic decoding graphs for each detected person.
3D Multi-Person Pose Estimation (absolute)
3D Multi-Person Pose Estimation (root-relative)
+1
3 code implementations • CVPR 2021 • Zhicheng Huang, Zhaoyang Zeng, Yupan Huang, Bei Liu, Dongmei Fu, Jianlong Fu
As region-based visual features usually represent parts of an image, it is challenging for existing vision-language models to fully understand the semantics from paired natural languages.
Ranked #5 on
Visual Entailment
on SNLI-VE val
1 code implementation • 2 Apr 2020 • Zhicheng Huang, Zhaoyang Zeng, Bei Liu, Dongmei Fu, Jianlong Fu
We aim to build a more accurate and thorough connection between image pixels and language semantics directly from image and sentence pairs instead of using region-based image features as the most recent vision and language tasks.
no code implementations • 13 Sep 2018 • Tao Yang, Georgios Arvanitidis, Dongmei Fu, Xiaogang Li, Søren Hauberg
Deep generative models are tremendously successful in learning low-dimensional latent representations that well-describe the data.