Search Results for author: Carlos Segura

Found 8 papers, 4 papers with code

Robust Wake-Up Word Detection by Two-stage Multi-resolution Ensembles

1 code implementation • 17 Oct 2023 • Fernando López, Jordi Luque, Carlos Segura, Pablo Gómez

It employs two models: a lightweight on-device model for real-time processing of the audio stream and a verification model on the server-side, which is an ensemble of heterogeneous architectures that refine detection.

Scheduling Inference Workloads on Distributed Edge Clusters with Reinforcement Learning

no code implementations • 31 Jan 2023 • Gabriele Castellano, Juan-José Nieto, Jordi Luque, Ferrán Diego, Carlos Segura, Diego Perino, Flavio Esposito, Fulvio Risso, Aravindh Raman

Many real-time applications (e.g., Augmented/Virtual Reality, cognitive assistance) rely on Deep Neural Networks (DNNs) to process inference tasks.

Edge-computing, Management

Efficient Keyword Spotting by capturing long-range interactions with Temporal Lambda Networks

1 code implementation • 16 Apr 2021 • Biel Tura, Santiago Escuder, Ferran Diego, Carlos Segura, Jordi Luque

This work explores the application of Lambda networks, an alternative framework for capturing long-range interactions without attention, for the keyword spotting task.

Keyword Spotting, speech-recognition

Speech Enhancement for Wake-Up-Word detection in Voice Assistants

no code implementations • 29 Jan 2021 • David Bonet, Guillermo Cámbara, Fernando López, Pablo Gómez, Carlos Segura, Jordi Luque

Keyword spotting, and in particular Wake-Up-Word (WUW) detection, is a very important task for voice assistants.

Data Augmentation, Denoising

Enabling Zero-shot Multilingual Spoken Language Translation with Language-Specific Encoders and Decoders

no code implementations • 2 Nov 2020 • Carlos Escolano, Marta R. Costa-jussà, José A. R. Fonollosa, Carlos Segura

On the other hand, Multilingual Neural Machine Translation (MultiNMT) approaches rely on higher-quality and more massive data sets.

Machine Translation, Translation

Seeing and Hearing Egocentric Actions: How Much Can We Learn?

1 code implementation • 15 Oct 2019 • Alejandro Cartas, Jordi Luque, Petia Radeva, Carlos Segura, Mariella Dimiccoli

Our interaction with the world is an inherently multimodal experience.

Action Recognition

Blow: a single-scale hyperconditioned flow for non-parallel raw-audio voice conversion

3 code implementations • NeurIPS 2019 • Joan Serrà, Santiago Pascual, Carlos Segura

End-to-end models for raw audio generation are a challenge, especially if they have to work with non-parallel data, which is a desirable setup in many situations.

Audio Generation, Voice Conversion

How Much Does Audio Matter to Recognize Egocentric Object Interactions?

no code implementations • 3 Jun 2019 • Alejandro Cartas, Jordi Luque, Petia Radeva, Carlos Segura, Mariella Dimiccoli

Sounds are an important source of information on our daily interactions with objects.

Action Classification, Action Recognition
