1 code implementation • 15 Oct 2019 • Alejandro Cartas, Jordi Luque, Petia Radeva, Carlos Segura, Mariella Dimiccoli
Our interaction with the world is an inherently multimodal experience.
1 code implementation • 16 Apr 2021 • Biel Tura, Santiago Escuder, Ferran Diego, Carlos Segura, Jordi Luque
This work explores the application of Lambda networks, an alternative framework for capturing long-range interactions without attention, for the keyword spotting task.
1 code implementation • 27 Oct 2022 • Fernando López, Jordi Luque
The alignments are computed iteratively upon a corpus of broadcast TV.
Automatic Speech Recognition (ASR) +2
1 code implementation • 17 Oct 2023 • Fernando López, Jordi Luque, Carlos Segura, Pablo Gómez
It employs two models: a lightweight on-device model for real-time processing of the audio stream and a verification model on the server-side, which is an ensemble of heterogeneous architectures that refine detection.
1 code implementation • ICLR 2020 • Joan Serrà, David Álvarez, Vicenç Gómez, Olga Slizovskaia, José F. Núñez, Jordi Luque
Likelihood-based generative models are a promising resource to detect out-of-distribution (OOD) inputs which could compromise the robustness or reliability of a machine learning system.
Ranked #10 on Anomaly Detection on Unlabeled CIFAR-10 vs CIFAR-100
no code implementations • 9 Oct 2016 • Ivan Gonzalez Torre, Bartolo Luque, Lucas Lacasa, Jordi Luque, Antoni Hernandez-Fernandez
This means that inferences of statistical patterns of language in acoustics are biased by the arbitrary, language-dependent segmentation of the signal, which virtually precludes comparative studies between human voice and other animal communication systems.
no code implementations • 5 Aug 2014 • Jordi Luque, Bartolo Luque, Lucas Lacasa
We first show that during speech the energy is unevenly released and power-law distributed, reporting a universal robust Gutenberg-Richter-like law in speech.
no code implementations • 3 Jun 2019 • Alejandro Cartas, Jordi Luque, Petia Radeva, Carlos Segura, Mariella Dimiccoli
Sounds are an important source of information on our daily interactions with objects.
no code implementations • 12 Nov 2019 • Guillermo Cámbara, Jordi Luque, Mireia Farrús
The use of the photoplethysmogram (PPG) signal for heart and sleep monitoring is now common in smartphones and wrist wearables.
no code implementations • 1 Jun 2020 • Benet Oriol, Jordi Luque, Ferran Diego, Xavier Giro-i-Nieto
In this work, we propose an effective approach for training unique embedding representations by combining three simultaneous modalities: images, spoken narratives, and textual narratives.
no code implementations • 2 Sep 2020 • Guillermo Cámbara, Jordi Luque, Mireia Farrús
To our knowledge, this is the first work combining such pitch and voice quality features with modern convolutional architectures, showing relative WER improvements of up to 7% and 3% on the publicly available Spanish Common Voice and LibriSpeech 100h datasets, respectively.
Automatic Speech Recognition (ASR) +2
no code implementations • 29 Jan 2021 • David Bonet, Guillermo Cámbara, Fernando López, Pablo Gómez, Carlos Segura, Jordi Luque
Keyword spotting, and in particular Wake-Up-Word (WUW) detection, is a very important task for voice assistants.
no code implementations • 29 Jan 2021 • Martin Kocour, Guillermo Cámbara, Jordi Luque, David Bonet, Mireia Farrús, Martin Karafiát, Karel Veselý, Jan "Honza" Černocký
This paper describes the joint effort of BUT and Telefónica Research on the development of Automatic Speech Recognition systems for the Albayzin 2020 Challenge.
Automatic Speech Recognition (ASR) +2
no code implementations • 9 May 2021 • Guillermo Cámbara, Alex Peiró-Lilja, Mireia Farrús, Jordi Luque
Research in speech technologies has recently benefited greatly from newly created public-domain corpora that contain thousands of hours of recordings.
Automatic Speech Recognition (ASR) +1
no code implementations • 20 Sep 2021 • Joan Codina-Filbà, Guillermo Cámbara, Jordi Luque, Mireia Farrús
Furthermore, we compare decoding the ASR hypothesis with a language model, which tends to correct non-standard sequences of words, against decoding without one.
no code implementations • SIGDIAL (ACL) 2020 • Ramiro H. Gálvez, Lara Gauder, Jordi Luque, Agustín Gravano
Acoustic/prosodic (a/p) entrainment has been associated with multiple positive social aspects of human-human conversations.
no code implementations • 21 Dec 2021 • Guillermo Cámbara, Jordi Luque, Mireia Farrús
Jitter and shimmer measurements have been shown to carry voice quality and prosodic information that enhances the performance of tasks such as speaker recognition, diarization, and automatic speech recognition (ASR).
Automatic Speech Recognition (ASR) +2
no code implementations • 14 Jul 2022 • Rodolfo Zevallos, Nuria Bel, Guillermo Cámbara, Mireia Farrús, Jordi Luque
In this paper we describe our data augmentation approach to improve the results of ASR models for low-resource and agglutinative languages.
Automatic Speech Recognition (ASR) +2
no code implementations • COLING 2022 • Guillermo Cámbara, Jordi Luque, Mireia Farrús
We do so by transferring the codebook as weights for the latent bottleneck of a Keyword Spotting Perceiver, thus initializing the model with phonetic embeddings from the start.
no code implementations • 31 Jan 2023 • Gabriele Castellano, Juan-José Nieto, Jordi Luque, Ferrán Diego, Carlos Segura, Diego Perino, Flavio Esposito, Fulvio Risso, Aravindh Raman
Many real-time applications (e.g., Augmented/Virtual Reality, cognitive assistance) rely on Deep Neural Networks (DNNs) to process inference tasks.