20 Mar 2024 • Leanne Nortje, Dan Oneaţă, Yevgen Matusevych, Herman Kamper
To simulate prior acoustic and visual knowledge, we experiment with several initialisation strategies using pretrained speech and vision networks.