no code implementations • 28 Jul 2022 • Zvi Kons, Hagai Aronowitz, Edmilson Morais, Matheus Damasceno, Hong-Kwang Kuo, Samuel Thomas, George Saon
We propose using a recurrent neural network transducer (RNN-T)-based speech-to-text (STT) system as a common component that can be used for emotion recognition and language identification as well as for speech recognition.
no code implementations • 1 Mar 2022 • Hagai Aronowitz, Itai Gat, Edmilson Morais, Weizhong Zhu, Ron Hoory
Beyond that, a common engine should be capable of supporting distributed training on clients' private in-house data.
no code implementations • ICASSP 2022 • Edmilson Morais, Ron Hoory, Weizhong Zhu, Itai Gat, Matheus Damasceno, Hagai Aronowitz
Self-supervised pre-trained features have consistently delivered state-of-the-art results in the field of natural language processing (NLP); however, their merits in the field of speech emotion recognition (SER) still need further investigation.
no code implementations • 2 Feb 2022 • Itai Gat, Hagai Aronowitz, Weizhong Zhu, Edmilson Morais, Ron Hoory
Large speech emotion recognition datasets are hard to obtain, and small datasets may contain biases.
Ranked #1 on Speech Emotion Recognition on IEMOCAP (AUC metric)
no code implementations • 28 Jul 2020 • Shai Rozenberg, Hagai Aronowitz, Ron Hoory
With the rise of voice-activated applications, the need for speaker recognition is rapidly increasing.