Search Results for author: Joan Serrà

Found 37 papers, 13 papers with code

An Empirical Evaluation of Similarity Measures for Time Series Classification

no code implementations • 16 Jan 2014 • Joan Serrà, Josep Lluis Arcos

In particular, the similarity measure is the most essential ingredient of time series clustering and classification systems.

Classification Clustering +4

Particle swarm optimization for time series motif discovery

no code implementations • 29 Jan 2015 • Joan Serrà, Josep Lluis Arcos

In this article, we propose an innovative standpoint and present a solution coming from it: an anytime multimodal optimization algorithm for time series motif discovery based on particle swarms.

Time Series Time Series Streams

Ranking and significance of variable-length similarity-based time series motifs

no code implementations • 6 Mar 2015 • Joan Serrà, Isabel Serra, Álvaro Corral, Josep Lluis Arcos

Specifically, we find that length-normalized motif dissimilarities still have intrinsic dependencies on the motif length, and that lowest dissimilarities are particularly affected by this dependency.

Time Series Time Series Analysis

A genetic algorithm to discover flexible motifs with support

1 code implementation • 16 Nov 2015 • Joan Serrà, Aleksandar Matic, Josep Lluis Arcos, Alexandros Karatzoglou

Finding repeated patterns or motifs in a time series is an important unsupervised task that still has a number of open issues, starting with the definition of motif.

Time Series Time Series Analysis

SEGAN: Speech Enhancement Generative Adversarial Network

20 code implementations • 28 Mar 2017 • Santiago Pascual, Antonio Bonafonte, Joan Serrà

In contrast to current techniques, we operate at the waveform level, training the model end-to-end, and incorporate 28 speakers and 40 different noise conditions into the same model, such that model parameters are shared across them.

Generative Adversarial Network Speech Enhancement

1,561

Hot or not? Forecasting cellular network hot spots using sector performance indicators

no code implementations • 18 Apr 2017 • Joan Serrà, Ilias Leontiadis, Alexandros Karatzoglou, Konstantina Papagiannaki

Our results indicate that, compared to the best baseline, tree-based models can deliver up to 14% better forecasts for regular hot spots and 153% better forecasts for non-regular hot spots.

Practical Processing of Mobile Sensor Data for Continual Deep Learning Predictions

no code implementations • 17 May 2017 • Kleomenis Katevas, Ilias Leontiadis, Martin Pielot, Joan Serrà

We present a practical approach for processing mobile sensor time series data for continual deep learning predictions.

Feature Engineering General Classification +2

Getting deep recommenders fit: Bloom embeddings for sparse binary input/output networks

no code implementations • 13 Jun 2017 • Joan Serrà, Alexandros Karatzoglou

Due to the structure of the data coming from recommendation domains (i.e., one-hot-encoded vectors of item preferences), these algorithms tend to have large input and output dimensionalities that dominate their overall size.

Language and Noise Transfer in Speech Enhancement Generative Adversarial Network

3 code implementations • 18 Dec 2017 • Santiago Pascual, Maruchan Park, Joan Serrà, Antonio Bonafonte, Kang-Hun Ahn

In this work, we present the results of adapting a speech enhancement generative adversarial network by finetuning the generator with small amounts of data.

Generative Adversarial Network Speech Enhancement

373

Continual Prediction of Notification Attendance with Classical and Deep Network Approaches

no code implementations • 19 Dec 2017 • Kleomenis Katevas, Ilias Leontiadis, Martin Pielot, Joan Serrà

Besides using classical gradient-boosted trees, we demonstrate how to make continual predictions using a recurrent neural network (RNN).

Human-Computer Interaction

Overcoming catastrophic forgetting with hard attention to the task

2 code implementations • ICML 2018 • Joan Serrà, Dídac Surís, Marius Miron, Alexandros Karatzoglou

In this paper, we propose a task-based hard attention mechanism that preserves previous tasks' information without affecting the current task's learning.

Ranked #2 on Continual Learning on 20Newsgroup (10 tasks)

Continual Learning Hard Attention

192

Towards a universal neural network encoder for time series

no code implementations • 10 May 2018 • Joan Serrà, Santiago Pascual, Alexandros Karatzoglou

We evaluate the performance of the proposed approach on a well-known time series classification benchmark, considering full adaptation, partial adaptation, and no adaptation of the encoder to the new data type.

Time Series Time Series Analysis +1

Assessing the impact of machine intelligence on human behaviour: an interdisciplinary endeavour

no code implementations • 7 Jun 2018 • Emilia Gómez, Carlos Castillo, Vicky Charisi, Verónica Dahl, Gustavo Deco, Blagoj Delipetrev, Nicole Dewandre, Miguel Ángel González-Ballester, Fabien Gouyon, José Hernández-Orallo, Perfecto Herrera, Anders Jonsson, Ansgar Koene, Martha Larson, Ramón López de Mántaras, Bertin Martens, Marius Miron, Rubén Moreno-Bote, Nuria Oliver, Antonio Puertas Gallardo, Heike Schweitzer, Nuria Sebastian, Xavier Serra, Joan Serrà, Songül Tolan, Karina Vold

The workshop gathered an interdisciplinary group of experts to establish the state-of-the-art research in the field and a list of future research challenges to be addressed on the topic of human and machine intelligence, algorithms' potential impact on human cognitive capabilities and decision making, and evaluation and regulation needs.

Decision Making

Whispered-to-voiced Alaryngeal Speech Conversion with Generative Adversarial Networks

3 code implementations • 31 Aug 2018 • Santiago Pascual, Antonio Bonafonte, Joan Serrà, Jose A. Gonzalez

Most methods of voice restoration for patients suffering from aphonia either produce whispered or monotone speech.

Speech Enhancement

373

Self-Attention Linguistic-Acoustic Decoder

no code implementations • 31 Aug 2018 • Santiago Pascual, Antonio Bonafonte, Joan Serrà

The conversion from text to speech relies on the accurate mapping from linguistic to acoustic symbol sequences, for which current practice employs recurrent statistical models like recurrent neural networks.

Speech Synthesis

Training neural audio classifiers with few data

2 code implementations • 24 Oct 2018 • Jordi Pons, Joan Serrà, Xavier Serra

We investigate supervised learning strategies that improve the training of neural network audio classifiers on small annotated collections.

Acoustic Scene Classification General Classification +2

Learning Problem-agnostic Speech Representations from Multiple Self-supervised Tasks

1 code implementation • 6 Apr 2019 • Santiago Pascual, Mirco Ravanelli, Joan Serrà, Antonio Bonafonte, Yoshua Bengio

Learning good representations without supervision is still an open issue in machine learning, and is particularly challenging for speech signals, which are often characterized by long sequences with a complex hierarchical structure.

Ranked #2 on Distant Speech Recognition on DIRHA English WSJ

Distant Speech Recognition

435

Towards Generalized Speech Enhancement with Generative Adversarial Networks

no code implementations • 6 Apr 2019 • Santiago Pascual, Joan Serrà, Antonio Bonafonte

The speech enhancement task usually consists of removing additive noise or reverberation that partially mask spoken utterances, affecting their intelligibility.

Generative Adversarial Network Speech Enhancement

Blow: a single-scale hyperconditioned flow for non-parallel raw-audio voice conversion

3 code implementations • NeurIPS 2019 • Joan Serrà, Santiago Pascual, Carlos Segura

End-to-end models for raw audio generation are a challenge, especially if they have to work with non-parallel data, which is a desirable setup in many situations.

Audio Generation Voice Conversion

497

Input complexity and out-of-distribution detection with likelihood-based generative models

1 code implementation • ICLR 2020 • Joan Serrà, David Álvarez, Vicenç Gómez, Olga Slizovskaia, José F. Núñez, Jordi Luque

Likelihood-based generative models are a promising resource to detect out-of-distribution (OOD) inputs which could compromise the robustness or reliability of a machine learning system.

Ranked #10 on Anomaly Detection on Unlabeled CIFAR-10 vs CIFAR-100

Anomaly Detection Out-of-Distribution Detection +1

Accurate and Scalable Version Identification Using Musically-Motivated Embeddings

1 code implementation • 28 Oct 2019 • Furkan Yesiler, Joan Serrà, Emilia Gómez

The version identification (VI) task deals with the automatic detection of recordings that correspond to the same underlying musical piece.

Ranked #4 on Cover song identification on YouTube350

Cover song identification

SESQA: semi-supervised learning for speech quality assessment

no code implementations • 1 Oct 2020 • Joan Serrà, Jordi Pons, Santiago Pascual

Automatic speech quality assessment is an important, transversal task whose progress is hampered by the scarcity of human annotations, poor generalization to unseen recording conditions, and a lack of flexibility of existing approaches.

Less is more: Faster and better music version identification with embedding distillation

1 code implementation • 7 Oct 2020 • Furkan Yesiler, Joan Serrà, Emilia Gómez

Version identification systems aim to detect different renditions of the same underlying musical composition (loosely called cover songs).

Dimensionality Reduction Retrieval

Automatic multitrack mixing with a differentiable mixing console of neural audio effects

1 code implementation • 20 Oct 2020 • Christian J. Steinmetz, Jordi Pons, Santiago Pascual, Joan Serrà

Applications of deep learning to automatic multitrack mixing are largely unexplored.

Audio and Speech Processing Sound

109

Upsampling artifacts in neural audio synthesis

1 code implementation • 27 Oct 2020 • Jordi Pons, Santiago Pascual, Giulio Cengarle, Joan Serrà

We then compare different upsampling layers, showing that nearest neighbor upsamplers can be an alternative to the problematic (but state-of-the-art) transposed and subpixel convolutions which are prone to introduce tonal artifacts.

Audio Signal Processing Audio Synthesis

Investigating the efficacy of music version retrieval systems for setlist identification

no code implementations • 6 Jan 2021 • Furkan Yesiler, Emilio Molina, Joan Serrà, Emilia Gómez

The setlist identification (SLI) task addresses a music recognition use case where the goal is to retrieve the metadata and timestamps for all the tracks played in live music events.

Retrieval

On tuning consistent annealed sampling for denoising score matching

no code implementations • 8 Apr 2021 • Joan Serrà, Santiago Pascual, Jordi Pons

Score-based generative models provide state-of-the-art quality for image and audio synthesis.

Audio Synthesis Denoising

Assessing Algorithmic Biases for Musical Version Identification

no code implementations • 30 Sep 2021 • Furkan Yesiler, Marius Miron, Joan Serrà, Emilia Gómez

Version identification (VI) systems now offer accurate and scalable solutions for detecting different renditions of a musical composition, allowing the use of these systems in industrial applications and throughout the wider music ecosystem.

Attribute Information Retrieval +2

Upsampling layers for music source separation

no code implementations • 23 Nov 2021 • Jordi Pons, Joan Serrà, Santiago Pascual, Giulio Cengarle, Daniel Arteaga, Davide Scaini

Upsampling artifacts are caused by problematic upsampling layers and by spectral replicas that emerge while upsampling.

Music Source Separation

On loss functions and evaluation metrics for music source separation

no code implementations • 16 Feb 2022 • Enric Gusó, Jordi Pons, Santiago Pascual, Joan Serrà

We investigate which loss functions provide better separations via benchmarking an extensive set of those for music source separation.

Audio Source Separation Benchmarking +1

Universal Speech Enhancement with Score-based Diffusion

no code implementations • 7 Jun 2022 • Joan Serrà, Santiago Pascual, Jordi Pons, R. Oguz Araz, Davide Scaini

We hope that both our methodology and technical contributions encourage researchers and practitioners to adopt a universal approach to speech enhancement, possibly framing it as a generative task.

Speech Enhancement

Adversarial Permutation Invariant Training for Universal Sound Separation

no code implementations • 21 Oct 2022 • Emilian Postolache, Jordi Pons, Santiago Pascual, Joan Serrà

Universal sound separation consists of separating mixes with arbitrary sounds of different types, and permutation invariant training (PIT) is used to train source agnostic models that do so.

Quantitative Evidence on Overlooked Aspects of Enrollment Speaker Embeddings for Target Speaker Separation

no code implementations • 23 Oct 2022 • Xiaoyu Liu, Xu Li, Joan Serrà

Single channel target speaker separation (TSS) aims at extracting a speaker's voice from a mixture of multiple talkers given an enrollment utterance of that speaker.

Speaker Identification Speaker Separation

Full-band General Audio Synthesis with Score-based Diffusion

no code implementations • 26 Oct 2022 • Santiago Pascual, Gautam Bhattacharya, Chunghsin Yeh, Jordi Pons, Joan Serrà

Recent works have shown the capability of deep generative models to tackle general audio synthesis from a single label, producing a variety of impulsive, tonal, and environmental sounds.

Audio Synthesis

CLIPSonic: Text-to-Audio Synthesis with Unlabeled Videos and Pretrained Language-Vision Models

no code implementations • 16 Jun 2023 • Hao-Wen Dong, Xiaoyu Liu, Jordi Pons, Gautam Bhattacharya, Santiago Pascual, Joan Serrà, Taylor Berg-Kirkpatrick, Julian McAuley

Our results show the effectiveness of the proposed method, and that the pretrained diffusion prior can reduce the modality transfer gap.

Audio Synthesis

Mono-to-stereo through parametric stereo generation

no code implementations • 26 Jun 2023 • Joan Serrà, Davide Scaini, Santiago Pascual, Daniel Arteaga, Jordi Pons, Jeroen Breebaart, Giulio Cengarle

Generating a stereophonic presentation from a monophonic audio signal is a challenging open task, especially if the goal is to obtain a realistic spatial imaging with a specific panning of sound elements.

GASS: Generalizing Audio Source Separation with Large-scale Data

no code implementations • 29 Sep 2023 • Jordi Pons, Xiaoyu Liu, Santiago Pascual, Joan Serrà

Here, we study a single general audio source separation (GASS) model trained to separate speech, music, and sound events in a supervised fashion with a large-scale dataset.

Audio Source Separation Speech Separation
