no code implementations • 14 Jun 2024 • Vishwanath Pratap Singh, Federico Malato, Ville Hautamaki, Md. Sahidullah, Tomi Kinnunen
While automatic speech recognition (ASR) greatly benefits from data augmentation, the augmentation recipes themselves tend to be heuristic.
Automatic Speech Recognition (ASR)
1 code implementation • 7 Jun 2024 • Federico Malato, Ville Hautamaki
Imitation learning enables autonomous agents to learn from human examples, without the need for a reward signal.
no code implementations • 29 Jan 2024 • Federico Malato, Florian Leopold, Andrew Melnik, Ville Hautamaki
Behavioral cloning uses a dataset of demonstrations to learn a policy.
no code implementations • 15 Jun 2023 • Federico Malato, Florian Leopold, Ville Hautamaki, Andrew Melnik
Actions from a selected similar situation can be performed by the agent until representations of the agent's current situation and the selected experience diverge in the latent space.
1 code implementation • 3 Oct 2021 • Yi Ma, Kong Aik Lee, Ville Hautamaki, Haizhou Li
Speech enhancement aims to improve the perceptual quality of the speech signal by suppression of the background noise.
1 code implementation • 28 Sep 2021 • Khaled Hechmi, Trung Ngo Trong, Ville Hautamaki, Tomi Kinnunen
VoxCeleb datasets are widely used in speaker recognition studies.
no code implementations • 13 Jun 2020 • Abrhalei Tela, Abraham Woubie, Ville Hautamaki
Thus, using the XLNet language model, we demonstrate competitive performance with mBERT and a pre-trained target language model on the cross-lingual sentiment (CLS) dataset and on a new sentiment analysis dataset for the low-resource language Tigrinya.
no code implementations • 10 May 2019 • Abraham Woubie, Anssi Kanervisto, Janne Karttunen, Ville Hautamaki
In this work, we propose using audio as information complementary to the visual-only input in the state representation.
no code implementations • 16 Apr 2019 • Kong Aik Lee, Ville Hautamaki, Tomi Kinnunen, Hitoshi Yamamoto, Koji Okabe, Ville Vestman, Jing Huang, Guohong Ding, Hanwu Sun, Anthony Larcher, Rohan Kumar Das, Haizhou Li, Mickael Rouvier, Pierre-Michel Bousquet, Wei Rao, Qing Wang, Chunlei Zhang, Fahimeh Bahmaninezhad, Hector Delgado, Jose Patino, Qiongqiong Wang, Ling Guo, Takafumi Koshinaka, Jiacen Zhang, Koichi Shinoda, Trung Ngo Trong, Md Sahidullah, Fan Lu, Yun Tang, Ming Tu, Kah Kuan Teh, Huy Dat Tran, Kuruvachan K. George, Ivan Kukanov, Florent Desnous, Jichen Yang, Emre Yilmaz, Longting Xu, Jean-Francois Bonastre, Cheng-Lin Xu, Zhi Hao Lim, Eng Siong Chng, Shivesh Ranjan, John H. L. Hansen, Massimiliano Todisco, Nicholas Evans
The I4U consortium was established to facilitate a joint entry to NIST speaker recognition evaluations (SRE).