no code implementations • 27 Sep 2024 • Soroosh Tayebi Arasteh, Mahshad Lotfinia, Paula Andrea Perez-Toro, Tomas Arias-Vergara, Mahtab Ranji, Juan Rafael Orozco-Arroyave, Maria Schuster, Andreas Maier, Seung Hee Yang
To generalize our findings across languages and disorders, we validated our approach on a dataset of Spanish-speaking Parkinson's disease patients, leveraging pretrained models from healthy English-speaking datasets, and demonstrated that careful pretraining on large-scale task-specific datasets can maintain favorable accuracy under DP constraints.
1 code implementation • 3 Jul 2024 • Tobias Weise, Philipp Klumpp, Kubilay Can Demir, Paula Andrea Pérez-Toro, Maria Schuster, Elmar Noeth, Bjoern Heismann, Andreas Maier, Seung Hee Yang
This paper introduces a novel combination of two tasks, previously treated separately: acoustic-to-articulatory speech inversion (AAI) and phoneme-to-articulatory (PTA) motion estimation.
no code implementations • 17 Jun 2024 • Kubilay Can Demir, Belen Lojo Rodriguez, Tobias Weise, Andreas Maier, Seung Hee Yang
To develop intelligent speech assistants and integrate them seamlessly with intra-operative decision-support frameworks, accurate and efficient surgical phase recognition is a prerequisite.
1 code implementation • 11 Apr 2024 • Soroosh Tayebi Arasteh, Tomas Arias-Vergara, Paula Andrea Perez-Toro, Tobias Weise, Kai Packhaeuser, Maria Schuster, Elmar Noeth, Andreas Maier, Seung Hee Yang
This study investigates anonymization's impact on pathological speech across over 2,700 speakers from multiple German institutions, focusing on privacy, pathological utility, and demographic fairness.
no code implementations • 18 May 2023 • Soroosh Tayebi Arasteh, Cristian David Rios-Urrego, Elmar Noeth, Andreas Maier, Seung Hee Yang, Jan Rusz, Juan Rafael Orozco-Arroyave
Parkinson's disease (PD) is a neurological disorder impacting a person's speech.
no code implementations • 24 Jun 2022 • Kubilay Can Demir, Matthias May, Axel Schmid, Michael Uder, Katharina Breininger, Tobias Weise, Andreas Maier, Seung Hee Yang
This paper presents a new multimodal interventional radiology dataset, called PoCaP (Port Catheter Placement) Corpus.
1 code implementation • 13 Apr 2022 • Soroosh Tayebi Arasteh, Tobias Weise, Maria Schuster, Elmar Noeth, Andreas Maier, Seung Hee Yang
Navigating the challenges of data-driven speech processing, one of the primary hurdles is accessing reliable pathological speech data.
1 code implementation • 8 Apr 2022 • Tobias Weise, Philipp Klumpp, Kubilay Can Demir, Andreas Maier, Elmar Noeth, Bjoern Heismann, Maria Schuster, Seung Hee Yang
Our results are among the first to show that disentangled speech representations can be used for automatic pathological speech intelligibility assessment, resulting in a reference speaker pair invariant method, applicable in scenarios with only few utterances available.
no code implementations • 4 Apr 2022 • Abner Hernandez, Paula Andrea Pérez-Toro, Juan Camilo Vásquez-Correa, Juan Rafael Orozco-Arroyave, Andreas Maier, Seung Hee Yang
Collecting speech data is an important step in training speech recognition systems and other speech-based machine learning models.
no code implementations • 4 Apr 2022 • Abner Hernandez, Paula Andrea Pérez-Toro, Elmar Nöth, Juan Rafael Orozco-Arroyave, Andreas Maier, Seung Hee Yang
Compared to using Fbank features, XLSR-based features reduced WERs by 6.8%, 22.0%, and 7.0% for the UASpeech, PC-GITA, and EasyCall corpus, respectively.
Automatic Speech Recognition (ASR)
1 code implementation • 7 Feb 2022 • Aline Sindel, Abner Hernandez, Seung Hee Yang, Vincent Christlein, Andreas Maier
With the increasing number of online learning material in the web, search for specific content in lecture videos can be time consuming.
no code implementations • 6 Dec 2021 • Andreas Maier, Seung Hee Yang, Farhad Maleki, Nikesh Muthukrishnan, Reza Forghani
In the domain of medical image processing, medical device manufacturers protect their intellectual property in many cases by shipping only compiled software, i.e. binary code which can be executed but is difficult to be understood by a potential attacker.
no code implementations • 10 Aug 2021 • Andreas Maier, Harald Köstler, Marco Heisig, Patrick Krauss, Seung Hee Yang
In this article, we perform a review of the state-of-the-art of hybrid machine learning in medical imaging.
no code implementations • 10 Jan 2020 • Sihyeon Jo, Sangwon Im, SangWook Han, Seung Hee Yang, Hee-Eun Kim, Seong-Woo Kim
The development of natural language processing algorithms and the explosive growth of conversational data are encouraging researches on the human-computer conversation.
no code implementations • 10 Jan 2020 • Seung Hee Yang, Minhwa Chung
Dysarthria is a motor speech impairment affecting millions of people.
Automatic Speech Recognition (ASR)
no code implementations • 20 Apr 2019 • Seung Hee Yang, Minhwa Chung
Trained on 97,200 spectrogram images of short utterances produced by native and non-native speakers of Korean, the generator is able to successfully transform the non-native spectrogram input to a spectrogram with properties of self-imitating feedback.