2 code implementations • 8 Feb 2024 • Mohammed Muqeeth, Haokun Liu, Yufan Liu, Colin Raffel
Unlike past methods that learn to route among specialized models, PHATGOOSE explores the possibility that zero-shot generalization will be improved if different experts can be adaptively chosen for each token and at each layer in the model.
no code implementations • 27 Dec 2023 • Xun Guo, Mingwu Zheng, Liang Hou, Yuan Gao, Yufan Deng, Pengfei Wan, Di Zhang, Yufan Liu, Weiming Hu, ZhengJun Zha, Haibin Huang, Chongyang Ma
I2V-Adapter adeptly propagates the unnoised input image to subsequent noised frames through a cross-frame attention mechanism, maintaining the identity of the input image without any changes to the pretrained T2V model.
1 code implementation • 28 Jun 2023 • Yufan Liu, Boxue Tian
We trained the CLAPE-DB model on the protein-DNA binding sites dataset and evaluated the model performance and generalization ability through various experiments.
no code implementations • CVPR 2023 • Weiming Bai, Yufan Liu, Zhipeng Zhang, Bing Li, Weiming Hu
Observing that face manipulation may alter the relation between different facial action units (AU), we propose the Action Units Relation Learning framework to improve the generality of forgery detection.
no code implementations • 12 Jul 2022 • Yufan Liu, Jiajiong Cao, Bing Li, Weiming Hu, Jingting Ding, Liang Li
However, most existing knowledge distillation methods only consider homologous-architecture distillation, such as distilling knowledge from CNN to CNN.
1 code implementation • 5 Nov 2021 • Minglang Qiao, Yufan Liu, Mai Xu, Xin Deng, Bing Li, Weiming Hu, Ali Borji
In this paper, we propose a multitask learning method for visual-audio saliency prediction and sound source localization on multi-face video by leveraging visual, audio and face information.
no code implementations • 18 Sep 2021 • Zekun Li, Yufan Liu, Bing Li, Weiming Hu, Kebin Wu, Pei Wang
CDI builds global attention and interaction among different levels in a decoupled space, which also solves the problem of heavy computation.
1 code implementation • ECCV 2020 • Yufan Liu, Minglang Qiao, Mai Xu, Bing Li, Weiming Hu, Ali Borji
Inspired by the findings of our investigation, we propose a novel multi-modal video saliency model consisting of three branches: visual, audio and face.
no code implementations • 16 Nov 2020 • Zekun Li, Yufan Liu, Bing Li, Weiming Hu
Furthermore, these two components are both plug-and-play and can be embedded in any backbone.
1 code implementation • CVPR 2019 • Yufan Liu, Jiajiong Cao, Bing Li, Chunfeng Yuan, Weiming Hu, Yangxi Li, Yunqiang Duan
The key challenge of knowledge distillation is to extract general, moderate and sufficient knowledge from a teacher network to guide a student network.
1 code implementation • CVPR 2017 • Yufan Liu, Songyang Zhang, Mai Xu, Xuming He
On the other hand, we find that the attention of different subjects consistently focuses on a single face in each frame of videos involving multiple faces.