However, little work has been done on game image captioning, which has its own unique characteristics and requirements.
The mechanism proposed here performs real-time speaker change detection in conversations; it first trains a text-independent neural network speaker classifier on in-domain speaker data.
This work presents a novel framework based on a feed-forward neural network for text-independent speaker classification and verification, two related systems in speaker recognition.
Speech recognition, especially name recognition, is widely used in phone services such as company directory dialers, stock quote providers or location finders.
This paper presents a method for detecting mispronunciations with the aim of improving Computer Assisted Language Learning (CALL) tools used by foreign language learners.
Various algorithms for text-independent speaker recognition have been developed over the decades, aiming to improve both accuracy and efficiency.
New adaptive features have been developed; they are obtained by adaptively warping the frequency scale prior to computing the cepstral coefficients.
Previous accent classification research focused mainly on detecting accents using purely acoustic information, without recognizing the accented speech.
In this paper, we present a novel setup of a Neural Network Language Model (NNLM) and apply it to a database of text samples from different authors.
Research has shown that accent classification can be improved by integrating semantic information into a purely acoustic approach.