Search Results for author: Pin Lv

Found 2 papers, 0 papers with code

VILAS: Exploring the Effects of Vision and Language Context in Automatic Speech Recognition

no code implementations31 May 2023 Ziyi Ni, Minglun Han, Feilong Chen, Linghui Meng, Jing Shi, Pin Lv, Bo Xu

In this paper, we first propose ViLaS (Vision and Language into Automatic Speech Recognition), a novel multimodal ASR model based on the continuous integrate-and-fire (CIF) mechanism, which can integrate visual and textual context simultaneously or separately, to facilitate speech recognition.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Cannot find the paper you are looking for? You can Submit a new open access paper.