SSNCSE_NLP@LT-EDI-ACL2022: Speech Recognition for Vulnerable Individuals in Tamil using pre-trained XLSR models

LTEDI (ACL) 2022 · Dhanya Srinivasan, Bharathi B, Thenmozhi Durairaj, Senthil Kumar B ·

Automatic speech recognition is a tool used to transform human speech into a written form. It is used in a variety of avenues, such as in voice commands, customer, service and more. It has emerged as an essential tool in the digitisation of daily life. It has been known to be of vital importance in making the lives of elderly and disabled people much easier. In this paper we describe an automatic speech recognition model, determined by using three pre-trained models, fine-tuned from the Facebook XLSR Wav2Vec2 model, which was trained using the Common Voice Dataset. The best model for speech recognition in Tamil is determined by finding the word error rate of the data. This work explains the submission made by SSNCSE_NLP in the shared task organized by LT-EDI at ACL 2022. A word error rate of 39.4512 is achieved.

PDF Abstract