Search Results for author: Yaru Chen

Found 2 papers, 0 papers with code

Audio-Visual Instance Segmentation

no code implementations • 28 Oct 2023 • Ruohao Guo, Yaru Chen, Yanyu Qi, Wenzhen Yue, Dantong Niu, Xianghua Ying

In this paper, we propose a new multi-modal task, namely audio-visual instance segmentation (AVIS), in which the goal is to simultaneously identify, segment, and track individual sounding object instances in audible videos.

Instance Segmentation, Segmentation, +1

CM-PIE: Cross-modal perception for interactive-enhanced audio-visual video parsing

no code implementations • 11 Oct 2023 • Yaru Chen, Ruohao Guo, Xubo Liu, Peipei Wu, Guangyao Li, Zhenbo Li, Wenwu Wang

Audio-visual video parsing is the task of categorizing a video at the segment level with weak labels, and predicting them as audible or visible events.
