no code implementations • 23 Jul 2023 • Ivan Vallés-Pérez, Grzegorz Beringer, Piotr Bilinski, Gary Cook, Roberto Barra-Chicote
We train a CLIP-based model with the aim of learning shared representations of phonetic and acoustic spaces.
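As a rough illustration of what a CLIP-style objective over paired phonetic and acoustic embeddings could look like, the sketch below computes a symmetric contrastive loss on a small batch. The listing does not specify the loss, the encoders, or any hyperparameters, so everything here is an assumption: the function names (`l2_normalize`, `log_softmax`, `clip_contrastive_loss`) are hypothetical, the temperature of 0.07 is an illustrative value, and the embeddings are taken as plain lists of floats already produced by some phonetic and acoustic encoder.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def log_softmax(row):
    """Numerically stable log-softmax over a list of logits."""
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]


def clip_contrastive_loss(phonetic_emb, acoustic_emb, temperature=0.07):
    """Symmetric contrastive loss (assumed CLIP-style objective, not confirmed by the source).

    Row i of phonetic_emb and row i of acoustic_emb are treated as a matched
    pair; every other combination in the batch is treated as a negative.
    """
    p = [l2_normalize(v) for v in phonetic_emb]
    a = [l2_normalize(v) for v in acoustic_emb]
    n = len(p)

    # Pairwise cosine similarities, scaled by the temperature.
    logits = [
        [sum(x * y for x, y in zip(p[i], a[j])) / temperature for j in range(n)]
        for i in range(n)
    ]

    # Phonetic -> acoustic direction: classify the matching column in each row.
    loss_p2a = -sum(log_softmax(logits[i])[i] for i in range(n)) / n

    # Acoustic -> phonetic direction: classify the matching row in each column.
    columns = [[logits[i][j] for i in range(n)] for j in range(n)]
    loss_a2p = -sum(log_softmax(columns[j])[j] for j in range(n)) / n

    return 0.5 * (loss_p2a + loss_a2p)
```

With this formulation, a batch whose phonetic and acoustic embeddings are aligned pair by pair yields a loss close to zero, while a batch in which the pairs are shuffled yields a large loss. That gradient signal is what would pull the two spaces toward a shared representation.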