no code implementations • 24 Mar 2024 • Linzhi Wu, Xingyu Zhang, Yakun Zhang, Changyan Zheng, Tiejun Liu, Liang Xie, Ye Yan, Erwei Yin
Lip reading, the process of interpreting silent speech from visual lip movements, has attracted growing attention for its wide range of practical applications.
1 code implementation • ICCV 2023 • Yibo Cui, Liang Xie, Yakun Zhang, Meishan Zhang, Ye Yan, Erwei Yin
To address this problem, we propose a novel Grounded Entity-Landmark Adaptive (GELA) pre-training paradigm for VLN tasks.