no code implementations • CVPR 2025 • Feiyu Pan, Hao Fang, Fangkai Li, Yanyu Xu, Yawei Li, Luca Benini, Xiankai Lu
Referring video object segmentation (RVOS) seeks to segment the objects within a video that are referred to by linguistic expressions.
Instance Segmentation
Referring Video Object Segmentation
no code implementations • 9 Sep 2024 • Henghui Ding, Lingyi Hong, Chang Liu, Ning Xu, Linjie Yang, Yuchen Fan, Deshui Miao, Yameng Gu, Xin Li, Zhenyu He, YaoWei Wang, Ming-Hsuan Yang, Jinming Chai, Qin Ma, Junpei Zhang, Licheng Jiao, Fang Liu, Xinyu Liu, Jing Zhang, Kexin Zhang, Xu Liu, Lingling Li, Hao Fang, Feiyu Pan, Xiankai Lu, Wei zhang, Runmin Cong, Tuyen Tran, Bin Cao, Yisi Zhang, Hanyi Wang, Xingjian He, Jing Liu
Despite the promising performance of current video segmentation models on existing benchmarks, these models still struggle with complex scenes.
no code implementations • 19 Aug 2024 • Hao Fang, Feiyu Pan, Xiankai Lu, Wei zhang, Runmin Cong
Referring video object segmentation (RVOS) relies on natural language expressions to segment target objects in video.
Referring Video Object Segmentation
Semantic Segmentation
no code implementations • 19 Aug 2024 • Feiyu Pan, Hao Fang, Runmin Cong, Wei zhang, Xiankai Lu
The Video Object Segmentation (VOS) task aims to segment a particular object instance throughout the entire video sequence given only the object mask of the first frame.
2 code implementations • 24 Jun 2024 • Henghui Ding, Chang Liu, Yunchao Wei, Nikhila Ravi, Shuting He, Song Bai, Philip Torr, Deshui Miao, Xin Li, Zhenyu He, YaoWei Wang, Ming-Hsuan Yang, Zhensong Xu, Jiangtao Yao, Chengjing Wu, Ting Liu, Luoqi Liu, Xinyu Liu, Jing Zhang, Kexin Zhang, Yuting Yang, Licheng Jiao, Shuyuan Yang, Mingqi Gao, Jingnan Luo, Jinyu Yang, Jungong Han, Feng Zheng, Bin Cao, Yisi Zhang, Xuanxu Lin, Xingjian He, Bo Zhao, Jing Liu, Feiyu Pan, Hao Fang, Xiankai Lu
Moreover, we provide a new motion expression guided video segmentation dataset, MeViS, to study natural language-guided video understanding in complex environments.
no code implementations • 7 Jun 2024 • Feiyu Pan, Hao Fang, Xiankai Lu
Current RVOS methods typically use independently pre-trained vision and language models as backbones, resulting in a significant domain gap between video and text.
Referring Video Object Segmentation
Semantic Segmentation