1 code implementation • 2 May 2023 • Yuanzheng Ma, Wangting Zhou, Rui Ma, Sihua Yang, Yansong Tang, Xun Guan
To address this challenge, we propose a novel approach that employs a super-resolution PAA method trained with forged PAA images.
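As a rough illustration of training a super-resolution model on synthetically generated ("forged") image pairs, the sketch below uses a toy CNN and random tensors in place of the actual forged PAA data and network; every name and hyper-parameter here is an assumption for illustration only.

```python
# Illustrative only: a toy SR network and random tensors stand in for the
# actual forged PAA images and the paper's model.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinySRNet(nn.Module):
    """A toy x2 super-resolution CNN (placeholder, not the paper's model)."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1),
        )

    def forward(self, lr):
        up = F.interpolate(lr, scale_factor=2, mode="bilinear", align_corners=False)
        return up + self.body(up)  # residual refinement of the upsampled input

def forged_training_pair(batch_size=4, size=64):
    """Stand-in for a generator that synthesizes high-resolution vascular
    images; the low-resolution input comes from a simple degradation."""
    hr = torch.rand(batch_size, 1, size, size)
    lr = F.avg_pool2d(hr, kernel_size=2)
    return lr, hr

model = TinySRNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
for _ in range(2):  # a couple of toy optimization steps
    lr_img, hr_img = forged_training_pair()
    loss = F.l1_loss(model(lr_img), hr_img)
    opt.zero_grad()
    loss.backward()
    opt.step()
```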
1 code implementation • 23 Mar 2023 • Xiaoke Huang, Yiji Cheng, Yansong Tang, Xiu Li, Jie Zhou, Jiwen Lu
Moreover, only a few minutes of optimization are enough to obtain plausible reconstruction results.
no code implementations • 16 Mar 2023 • Kunyang Han, Yong Liu, Jun Hao Liew, Henghui Ding, Yunchao Wei, Jiajun Liu, Yitong Wang, Yansong Tang, Yujiu Yang, Jiashi Feng, Yao Zhao
Recent advancements in pre-trained vision-language models, such as CLIP, have enabled the segmentation of arbitrary concepts solely from textual inputs, a process commonly referred to as open-vocabulary semantic segmentation (OVS); a minimal pixel-text matching sketch follows the task tags below.
Knowledge Distillation
Open Vocabulary Semantic Segmentation
+3
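The sketch below illustrates only the generic pixel-text matching idea behind CLIP-style open-vocabulary segmentation (classify each pixel embedding against text embeddings of arbitrary class names); it is not the method proposed in this paper, and all shapes and names are placeholder assumptions.

```python
# Generic pixel-text matching, not the paper's method; encoders are mocked.
import torch
import torch.nn.functional as F

def open_vocab_segment(pixel_feats, text_feats, temperature=0.07):
    """pixel_feats: (H, W, D) visual embeddings; text_feats: (K, D) embeddings
    of arbitrary class names from a vision-language model such as CLIP."""
    p = F.normalize(pixel_feats, dim=-1)
    t = F.normalize(text_feats, dim=-1)
    logits = p @ t.t() / temperature  # (H, W, K) cosine-similarity scores
    return logits.argmax(dim=-1)      # per-pixel index into the class list

H, W, D, K = 32, 32, 512, 3           # e.g. classes ["cat", "grass", "sky"]
mask = open_vocab_segment(torch.randn(H, W, D), torch.randn(K, D))
print(mask.shape)                     # torch.Size([32, 32])
```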
no code implementations • 11 Mar 2023 • Zhao Yang, Jiaqi Wang, Yansong Tang, Kai Chen, Hengshuang Zhao, Philip H. S. Torr
Referring image segmentation aims to segment the region of an image described by a natural language expression.
no code implementations • CVPR 2023 • Shiyi Zhang, Wenxun Dai, Sujia Wang, Xiangwei Shen, Jiwen Lu, Jie Zhou, Yansong Tang
Action quality assessment (AQA) has become an emerging topic since it can be extensively applied in numerous scenarios.
no code implementations • CVPR 2023 • Yansong Tang, Jinpeng Liu, Aoyang Liu, Bin Yang, Wenxun Dai, Yongming Rao, Jiwen Lu, Jie Zhou, Xiu Li
With its continuously growing popularity around the world, fitness activity analytics has become an emerging research topic in computer vision.
1 code implementation • 11 Oct 2022 • Yong Liu, Ran Yu, Jiahao Wang, Xinyuan Zhao, Yitong Wang, Yansong Tang, Yujiu Yang
Besides, we empirically find that low-frequency features should be enhanced in the encoder (backbone), while high-frequency features should be enhanced in the decoder (segmentation head); see the frequency-filtering sketch after the task tags below.
Semantic Segmentation
Semi-Supervised Video Object Segmentation
+1
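A minimal sketch of the frequency-selective enhancement idea, assuming a simple FFT low-/high-pass split; the actual modules used in the paper may be entirely different, and all parameters below are illustrative.

```python
# Hypothetical FFT-based band boosting; the paper's modules may differ.
import torch

def frequency_boost(feat, keep="low", radius=0.25, gain=2.0):
    """feat: (B, C, H, W). Amplify either the low- or high-frequency band."""
    _, _, H, W = feat.shape
    spec = torch.fft.fftshift(torch.fft.fft2(feat), dim=(-2, -1))
    yy, xx = torch.meshgrid(torch.linspace(-0.5, 0.5, H),
                            torch.linspace(-0.5, 0.5, W), indexing="ij")
    low_mask = ((xx ** 2 + yy ** 2).sqrt() <= radius).float()
    mask = low_mask if keep == "low" else 1.0 - low_mask
    spec = spec * (1.0 + (gain - 1.0) * mask)  # amplify the selected band
    return torch.fft.ifft2(torch.fft.ifftshift(spec, dim=(-2, -1))).real

encoder_feat = torch.randn(1, 64, 56, 56)
decoder_feat = torch.randn(1, 64, 56, 56)
enc_out = frequency_boost(encoder_feat, keep="low")   # backbone: low frequencies
dec_out = frequency_boost(decoder_feat, keep="high")  # head: high frequencies
```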
4 code implementations • 28 Jul 2022 • Yongming Rao, Wenliang Zhao, Yansong Tang, Jie Zhou, Ser-Nam Lim, Jiwen Lu
In this paper, we show that the key ingredients behind the vision Transformers, namely input-adaptive, long-range and high-order spatial interactions, can also be efficiently implemented with a convolution-based framework; a simplified gated-convolution sketch follows the benchmark entry below.
Ranked #19 on Semantic Segmentation on ADE20K
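The following is a simplified, hypothetical gated-convolution block illustrating how input-adaptive, long-range spatial interactions can be built from convolutions alone; it is a second-order example, not the exact layer proposed in the paper.

```python
# A second-order gated-convolution block; illustrative, not the paper's layer.
import torch
import torch.nn as nn

class GatedConvBlock(nn.Module):
    def __init__(self, dim, kernel_size=7):
        super().__init__()
        self.proj_in = nn.Conv2d(dim, 2 * dim, 1)
        self.dwconv = nn.Conv2d(dim, dim, kernel_size,
                                padding=kernel_size // 2, groups=dim)
        self.proj_out = nn.Conv2d(dim, dim, 1)

    def forward(self, x):
        gate, value = self.proj_in(x).chunk(2, dim=1)  # input-adaptive split
        context = self.dwconv(value)                   # long-range spatial mixing
        return self.proj_out(gate * context)           # multiplicative (high-order) interaction

x = torch.randn(1, 64, 56, 56)
print(GatedConvBlock(64)(x).shape)  # torch.Size([1, 64, 56, 56])
```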
1 code implementation • 17 Jul 2022 • Yansong Tang, Xingyu Liu, Xumin Yu, Danyang Zhang, Jiwen Lu, Jie Zhou
Different from the conventional adversarial learning-based approaches for UDA, we utilize a self-supervision scheme to reduce the domain shift between two skeleton-based action datasets.
1 code implementation • 6 Jun 2022 • Wanhua Li, Xiaoke Huang, Zheng Zhu, Yansong Tang, Xiu Li, Jie Zhou, Jiwen Lu
In this paper, we propose to learn the rank concepts from the rich semantic CLIP latent space.
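A minimal sketch of one way to read rank concepts out of a vision-language latent space: each candidate rank is represented by a text-style embedding and the prediction is a similarity-weighted average over rank values. Random tensors stand in for real CLIP features, and this is not necessarily the paper's formulation.

```python
# Random tensors stand in for CLIP text/image features; not the paper's model.
import torch
import torch.nn.functional as F

ranks = torch.arange(0, 101, 10, dtype=torch.float32)           # candidate rank values
rank_embed = F.normalize(torch.randn(len(ranks), 512), dim=-1)  # "rank concept" embeddings
image_embed = F.normalize(torch.randn(512), dim=-1)             # image embedding

probs = (image_embed @ rank_embed.t() / 0.07).softmax(dim=-1)   # distribution over ranks
predicted_rank = (probs * ranks).sum()                          # similarity-weighted average
print(float(predicted_rank))
```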
1 code implementation • CVPR 2022 • Kejie Li, Yansong Tang, Victor Adrian Prisacariu, Philip H. S. Torr
Dense 3D reconstruction from a stream of depth images is the key to many mixed reality and robotic applications.
2 code implementations • 21 Mar 2022 • Rui Yang, Hailong Ma, Jie Wu, Yansong Tang, Xuefeng Xiao, Min Zheng, Xiu Li
The vanilla self-attention mechanism inherently relies on pre-defined and fixed computational dimensions.
no code implementations • CVPR 2022 • Donglai Wei, Siddhant Kharbanda, Sarthak Arora, Roshan Roy, Nishant Jain, Akash Palrecha, Tanav Shah, Shray Mathur, Ritik Mathur, Abhijay Kemkar, Anirudh Chakravarthy, Zudi Lin, Won-Dong Jang, Yansong Tang, Song Bai, James Tompkin, Philip H.S. Torr, Hanspeter Pfister
Many video understanding tasks require analyzing multi-shot videos, but existing datasets for video object segmentation (VOS) only consider single-shot videos.
1 code implementation • CVPR 2022 • Guangrun Wang, Yansong Tang, Liang Lin, Philip H.S. Torr
Inspired by perceptual learning, which uses cross-view learning to perceive concepts and semantics, we propose a novel autoencoder (AE) that learns semantic-aware representations via cross-view image reconstruction.
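A minimal sketch of cross-view reconstruction, assuming a toy encoder/decoder and standard torchvision augmentations: the latent of one augmented view must reconstruct a different view of the same image, which encourages view-invariant, semantic-aware features. The architecture and augmentations are illustrative assumptions, not the paper's design.

```python
# Toy encoder/decoder and augmentations; illustrative assumptions only.
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.transforms as T

augment = T.Compose([T.RandomResizedCrop(64, scale=(0.5, 1.0)),
                     T.RandomHorizontalFlip()])

encoder = nn.Sequential(nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
                        nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU())
decoder = nn.Sequential(nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
                        nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1))

image = torch.rand(1, 3, 64, 64)
view_a, view_b = augment(image), augment(image)  # two random views of one image
recon_b = decoder(encoder(view_a))               # reconstruct view B from view A's latent
loss = F.mse_loss(recon_b, view_b)
loss.backward()
```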
1 code implementation • CVPR 2022 • Zhao Yang, Jiaqi Wang, Yansong Tang, Kai Chen, Hengshuang Zhao, Philip H. S. Torr
Referring image segmentation is a fundamental vision-language task that aims to segment out an object referred to by a natural language expression from an image.
Ranked #3 on Referring Expression Segmentation on RefCOCOg-test
1 code implementation • CVPR 2022 • Yongming Rao, Wenliang Zhao, Guangyi Chen, Yansong Tang, Zheng Zhu, Guan Huang, Jie Zhou, Jiwen Lu
In this work, we present a new framework for dense prediction by implicitly and explicitly leveraging the pre-trained knowledge from CLIP.
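One generic way to reuse vision-language pre-training for dense prediction is sketched below: a pixel-text similarity map is computed from CLIP-style features and fed to the segmentation head as extra channels. This is a sketch under our own assumptions (mocked features, arbitrary shapes), not necessarily the framework proposed in the paper.

```python
# A generic pixel-text score map as extra input to a segmentation head;
# not necessarily the paper's design.
import torch
import torch.nn as nn
import torch.nn.functional as F

B, C, H, W, K = 2, 256, 32, 32, 21               # K text-described categories
visual = torch.randn(B, C, H, W)                 # dense features from an image encoder
text = F.normalize(torch.randn(K, C), dim=-1)    # embeddings of class descriptions

pix = F.normalize(visual, dim=1)
score_map = torch.einsum("bchw,kc->bkhw", pix, text)  # pixel-text similarity (B, K, H, W)

seg_head = nn.Conv2d(C + K, K, kernel_size=1)         # the head sees both signals
logits = seg_head(torch.cat([visual, score_map], dim=1))
print(logits.shape)                                   # torch.Size([2, 21, 32, 32])
```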
no code implementations • British Machine Vision Conference 2021 • Zhao Yang, Yansong Tang, Luca Bertinetto, Hengshuang Zhao, Philip Torr
In this paper, we investigate the problem of video object segmentation from referring expressions (VOSRE).
Ranked #1 on Referring Expression Segmentation on J-HMDB (Precision@0.9 metric)
Optical Flow Estimation
Referring Expression Segmentation
+3
no code implementations • 19 Jul 2021 • Jiahuan Zhou, Yansong Tang, Bing Su, Ying Wu
We show that this performance limitation is caused by vanishing gradients on these sample outliers.
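A generic numeric illustration of the gradient-vanishing effect on extreme samples, using a stand-in sigmoid-saturated objective rather than the specific loss analyzed in the paper: the gradient with respect to the score shrinks toward zero as a sample moves far from the decision boundary, so such outliers stop driving learning.

```python
# Stand-in saturating loss, not the paper's objective.
import torch

for score in [0.0, 2.0, 5.0, 10.0]:       # increasingly "extreme" samples
    s = torch.tensor(score, requires_grad=True)
    loss = torch.sigmoid(-s)              # saturates for large |s|
    loss.backward()
    print(f"score={score:5.1f}  grad={s.grad.item():+.6f}")
# gradient magnitude: 0.25 -> ~0.10 -> ~0.0066 -> ~0.00005, i.e. it vanishes
# for the most extreme samples, which then contribute almost nothing to training.
```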
1 code implementation • 12 May 2021 • Yansong Tang, Zhenyu Jiang, Zhenda Xie, Yue Cao, Zheng Zhang, Philip H. S. Torr, Han Hu
Previous cycle-consistency correspondence learning methods usually leverage image patches for training.
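A minimal sketch of cycle-consistency correspondence learning with placeholder features: propagate locations from frame A to frame B via a softmax affinity, propagate them back, and penalize round trips that do not return to their starting location. The feature extractor, sampling, and temperature are assumptions.

```python
# Placeholder features; the paper's feature extractor and sampling differ.
import torch
import torch.nn.functional as F

def affinity(src, dst, temperature=0.07):
    """src, dst: (N, D) L2-normalized features of N locations in a frame."""
    return ((src @ dst.t()) / temperature).softmax(dim=-1)

N, D = 64, 128
feat_a = F.normalize(torch.randn(N, D, requires_grad=True), dim=-1)  # frame A
feat_b = F.normalize(torch.randn(N, D), dim=-1)                      # frame B

round_trip = affinity(feat_a, feat_b) @ affinity(feat_b, feat_a)  # (N, N) A->B->A
cycle_loss = F.nll_loss(round_trip.clamp_min(1e-8).log(),
                        torch.arange(N))  # each location should return to itself
cycle_loss.backward()
```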
1 code implementation • CVPR 2020 • Yansong Tang, Zanlin Ni, Jiahuan Zhou, Danyang Zhang, Jiwen Lu, Ying Wu, Jie Zhou
Assessing action quality from videos has attracted growing attention in recent years.
Ranked #4 on Action Quality Assessment on AQA-7
no code implementations • 20 Mar 2020 • Yansong Tang, Jiwen Lu, Jie Zhou
We believe the introduction of the COIN dataset will promote future in-depth research on instructional video analysis within the community.
no code implementations • CVPR 2019 • Yansong Tang, Dajun Ding, Yongming Rao, Yu Zheng, Danyang Zhang, Lili Zhao, Jiwen Lu, Jie Zhou
A substantial number of instructional videos are available on the Internet, enabling us to acquire knowledge for completing various tasks.
no code implementations • CVPR 2018 • Yansong Tang, Yi Tian, Jiwen Lu, Peiyang Li, Jie Zhou
In this paper, we propose a deep progressive reinforcement learning (DPRL) method for action recognition in skeleton-based videos, which aims to distil the most informative frames and discard ambiguous frames in sequences for recognizing actions; a simplified frame-selection sketch follows the benchmark entry below.
Ranked #3 on Skeleton Based Action Recognition on UT-Kinect
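A simplified sketch of learned frame selection for skeleton sequences, illustrating the idea of distilling informative frames with a small scoring network; the progressive reinforcement learning procedure of the paper is not reproduced, and all shapes and the scorer are assumptions.

```python
# Toy scorer and shapes; the paper's DPRL training procedure is not reproduced.
import torch
import torch.nn as nn

T_frames, J, C = 100, 25, 3              # frames, joints, (x, y, z) per joint
sequence = torch.randn(T_frames, J * C)  # flattened skeleton per frame

frame_scorer = nn.Sequential(nn.Linear(J * C, 64), nn.ReLU(), nn.Linear(64, 1))
scores = frame_scorer(sequence).squeeze(-1)              # (T,) informativeness scores
keep = torch.topk(scores, k=16).indices.sort().values    # indices of informative frames
distilled = sequence[keep]                               # frames passed to the recognizer
print(distilled.shape)                                   # torch.Size([16, 75])
```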