Search Results for author: Alkis Koudounas

Found 15 papers, 11 papers with code

DeepDialogue: A Multi-Turn Emotionally-Rich Spoken Dialogue Dataset

no code implementations • 26 May 2025 • Alkis Koudounas, Moreno La Quatra, Elena Baralis

Recent advances in conversational AI have demonstrated impressive capabilities in single-turn responses, yet multi-turn dialogues remain challenging for even the most sophisticated language models.

Philosophy

Exploring Generative Error Correction for Dysarthric Speech Recognition

1 code implementation • 26 May 2025 • Moreno La Quatra, Alkis Koudounas, Valerio Mario Salerno, Sabato Marco Siniscalchi

Despite the remarkable progress in end-to-end Automatic Speech Recognition (ASR) engines, accurately transcribing dysarthric speech remains a major challenge.

Automatic Speech Recognition, Automatic Speech Recognition (ASR), +1

"KAN you hear me?" Exploring Kolmogorov-Arnold Networks for Spoken Language Understanding

1 code implementation • 26 May 2025 • Alkis Koudounas, Moreno La Quatra, Eliana Pastor, Sabato Marco Siniscalchi, Elena Baralis

Kolmogorov-Arnold Networks (KANs) have recently emerged as a promising alternative to traditional neural architectures, yet their application to speech processing remains underexplored.

Kolmogorov-Arnold Networks, Spoken Language Understanding

MVP: Multi-source Voice Pathology detection

1 code implementation • 26 May 2025 • Alkis Koudounas, Moreno La Quatra, Gabriele Ciravegna, Marco Fantini, Erika Crosetti, Giovanni Succo, Tania Cerquitelli, Sabato Marco Siniscalchi, Elena Baralis

Voice disorders significantly impact patient quality of life, yet non-invasive automated diagnosis remains underexplored due to both the scarcity of pathological voice data and the variability in recording sources.

Sentence, Voice pathology detection

"Alexa, can you forget me?" Machine Unlearning Benchmark in Spoken Language Understanding

1 code implementation • 21 May 2025 • Alkis Koudounas, Claudio Savelli, Flavio Giobergia, Elena Baralis

Machine unlearning, the process of efficiently removing specific information from machine learning models, is a growing area of interest for responsible AI.

Machine Unlearning, Spoken Language Understanding

voc2vec: A Foundation Model for Non-Verbal Vocalization

1 code implementation • 22 Feb 2025 • Alkis Koudounas, Moreno La Quatra, Sabato Marco Siniscalchi, Elena Baralis

In this work, we aim to overcome the above shortcoming and propose a novel foundation model, termed voc2vec, specifically designed for non-verbal human data leveraging exclusively open-source non-verbal audio datasets.

model

Speech Analysis of Language Varieties in Italy

1 code implementation • 22 Jun 2024 • Moreno La Quatra, Alkis Koudounas, Elena Baralis, Sabato Marco Siniscalchi

We leverage self-supervised learning models to tackle this task and analyze differences and similarities between Italy's regional languages.

Contrastive Learning, Self-Supervised Learning

A Contrastive Learning Approach to Mitigate Bias in Speech Models

1 code implementation • 20 Jun 2024 • Alkis Koudounas, Flavio Giobergia, Eliana Pastor, Elena Baralis

Speech models may be affected by performance imbalance in different population subgroups, raising concerns about fair treatment across these groups.

Contrastive Learning, Spoken Language Understanding

Benchmarking Representations for Speech, Music, and Acoustic Events

1 code implementation • 2 May 2024 • Moreno La Quatra, Alkis Koudounas, Lorenzo Vaiani, Elena Baralis, Luca Cagliero, Paolo Garza, Sabato Marco Siniscalchi

Limited diversity in standardized benchmarks for evaluating audio representation learning (ARL) methods may hinder systematic comparison of current methods' capabilities.

Audio Classification, Benchmarking, +2

Houston we have a Divergence: A Subgroup Performance Analysis of ASR Models

no code implementations • 31 Mar 2024 • Alkis Koudounas, Flavio Giobergia

We identify subgroups of audio recordings based on combinations of these metadata and compute each subgroup's performance (e.g., Word Error Rate) and the difference in performance ("divergence") w.r.t. the overall population.

Automatic Speech Recognition, Automatic Speech Recognition (ASR), +2

MALTO at SemEval-2024 Task 6: Leveraging Synthetic Data for LLM Hallucination Detection

no code implementations • 1 Mar 2024 • Federico Borra, Claudio Savelli, Giacomo Rosso, Alkis Koudounas, Flavio Giobergia

In Natural Language Generation (NLG), contemporary Large Language Models (LLMs) face several challenges, such as generating fluent yet inaccurate outputs and reliance on fluency-centric metrics.

Data Augmentation, Hallucination, +3

Reconstructing Atmospheric Parameters of Exoplanets Using Deep Learning

no code implementations • 2 Oct 2023 • Flavio Giobergia, Alkis Koudounas, Elena Baralis

Exploring exoplanets has transformed our understanding of the universe by revealing many planetary systems that defy our current understanding.

Deep Learning

ITALIC: An Italian Intent Classification Dataset

1 code implementation • 14 Jun 2023 • Alkis Koudounas, Moreno La Quatra, Lorenzo Vaiani, Luca Colomba, Giuseppe Attanasio, Eliana Pastor, Luca Cagliero, Elena Baralis

Recent large-scale Spoken Language Understanding datasets focus predominantly on English and do not account for language-specific phenomena such as particular phonemes or words in different lects.

Classification, intent-classification, +4
