no code implementations • 8 Jul 2024 • Fei Guo, Yikang Wang, Han Qi, Li Zhu, Jing Sun
In the first branch, a Domain Temporal Encoder is employed to capture temporal features for both the source and target domains.
no code implementations • 16 Jan 2024 • Fei Guo, Yikang Wang, Han Qi, Wenping Jin, Li Zhu
In each view, we fuse the prompt embedding, which serves as consistent information, with the visual features and the global or local temporal context to overcome the overlapping distribution of classes and outliers.
no code implementations • 2 Dec 2023 • Fei Guo, Li Zhu, Yikang Wang, Han Qi
Although some multi-modal works use labels as supplementary information to construct prototypes of support videos, they cannot use this information for query videos.
no code implementations • 20 Aug 2023 • Zexin Cai, Weiqing Wang, Yikang Wang, Ming Li
This paper introduces our system for Track 2 of the second Audio Deepfake Detection Challenge (ADD 2023), which focuses on locating manipulated regions.
no code implementations • 15 Jul 2022 • Xingming Wang, Xiaoyi Qin, Yikang Wang, Yunfei Xu, Ming Li
For CM systems, we propose two methods on top of the challenge baseline to further improve performance: Embedding Random Sampling Augmentation (ERSA) and One-Class Confusion Loss (OCCL).