no code implementations • 17 Apr 2024 • Keren Shao, Ke Chen, Shlomo Dubnov
In this challenge, we disentangle the deep filters from the original DeepFilterNet and incorporate them into our Spec-UNet-based network to further improve a hybrid Demucs (hdemucs)-based remixing pipeline.
no code implementations • 9 Feb 2024 • Vignesh Gokul, Chris Francis, Shlomo Dubnov
We propose a method to compute the information flow using pre-trained generative models as entropy estimators.
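A minimal sketch of the entropy-estimation idea: a generative model's average negative log-likelihood over samples is a Monte-Carlo estimate of entropy, and information flow can be built from differences of such estimates. A toy categorical table stands in for the pretrained generative model here.

```python
import math

# Toy categorical "generative model": probabilities over 4 symbols.
# A pretrained model's likelihoods would replace this table.
p_x = {0: 0.5, 1: 0.25, 2: 0.125, 3: 0.125}

def entropy_estimate(samples, model):
    """Monte-Carlo entropy estimate: the average negative
    log-likelihood of the samples under the model (in bits)."""
    return -sum(math.log2(model[s]) for s in samples) / len(samples)

# With samples drawn at exactly the model frequencies, the estimate
# matches the true entropy H = 1.75 bits.
samples = [0, 0, 0, 0, 1, 1, 2, 3]
h = entropy_estimate(samples, p_x)
```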
no code implementations • 4 Jan 2024 • Vignesh Gokul, Shlomo Dubnov
Recent works such as CUDA propose solutions to this problem by adding class-wise blurs to make datasets unlearnable, i.e., a model can never use the acquired dataset for learning.
no code implementations • 21 Nov 2023 • Weihan Xu, Julian McAuley, Shlomo Dubnov, Hao-Wen Dong
We then propose a simple technique to equip this pretrained unconditional music transformer model with instrument and genre controls by finetuning the model with additional control tokens.
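The control-token technique can be illustrated in a few lines: reserve extra vocabulary ids for instruments and genres, prepend them to each training sequence, and finetune the pretrained model on the augmented sequences. The vocabulary ids below are hypothetical, not the paper's.

```python
# Hypothetical vocabulary: musical event ids 0..99, control ids 100+.
INSTRUMENT = {"piano": 100, "guitar": 101}
GENRE = {"jazz": 110, "rock": 111}

def add_control_tokens(events, instrument, genre):
    """Prepend instrument and genre control tokens so a pretrained
    unconditional model can be finetuned to condition on them."""
    return [INSTRUMENT[instrument], GENRE[genre]] + list(events)

seq = add_control_tokens([5, 17, 42], "piano", "jazz")
```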
no code implementations • 14 Oct 2023 • Paarth Neekhara, Shehzeen Hussain, Rafael Valle, Boris Ginsburg, Rishabh Ranjan, Shlomo Dubnov, Farinaz Koushanfar, Julian McAuley
In this work, instead of explicitly disentangling attributes with loss terms, we present a framework to train a controllable voice conversion model on entangled speech representations derived from self-supervised learning (SSL) and speaker verification models.
1 code implementation • 4 Aug 2023 • Keren Shao, Ke Chen, Taylor Berg-Kirkpatrick, Shlomo Dubnov
In deep learning research, many melody extraction models rely on redesigning neural network architectures to improve performance.
1 code implementation • 3 Aug 2023 • Ke Chen, Yusong Wu, Haohe Liu, Marianna Nezhurina, Taylor Berg-Kirkpatrick, Shlomo Dubnov
Diffusion models have shown promising results in cross-modal generation tasks, including text-to-image and text-to-audio generation.
2 code implementations • 14 Jul 2022 • Hao-Wen Dong, Ke Chen, Shlomo Dubnov, Julian McAuley, Taylor Berg-Kirkpatrick
Existing approaches for generating multitrack music with transformer models have been limited by the number of instruments they support, the length of the music segments they can generate, and their slow inference.
1 code implementation • 2 Feb 2022 • Ke Chen, Shuai Yu, Cheng-i Wang, Wei Li, Taylor Berg-Kirkpatrick, Shlomo Dubnov
In this paper, we propose TONet, a plug-and-play model that improves both tone and octave perceptions by leveraging a novel input representation and a novel network architecture.
1 code implementation • 2 Feb 2022 • Ke Chen, Xingjian Du, Bilei Zhu, Zejun Ma, Taylor Berg-Kirkpatrick, Shlomo Dubnov
To combat these problems, we introduce HTS-AT: an audio transformer with a hierarchical structure to reduce the model size and training time.
Ranked #4 on Sound Event Detection on DESED
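The core trick of a hierarchical audio transformer is shrinking the token sequence between stages so later attention layers see far fewer tokens. A rough stand-in for that patch merging, averaging adjacent tokens to halve the sequence length:

```python
def merge_tokens(tokens, group=2):
    """Merge each group of adjacent tokens by averaging, halving the
    sequence length between transformer stages (a simplified stand-in
    for the patch merging in hierarchical audio transformers)."""
    merged = []
    for i in range(0, len(tokens) - group + 1, group):
        chunk = tokens[i:i + group]
        merged.append(sum(chunk) / group)
    return merged

seq = [1.0, 3.0, 5.0, 7.0]
out = merge_tokens(seq)  # length 4 -> length 2
```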
1 code implementation • 15 Dec 2021 • Ke Chen, Xingjian Du, Bilei Zhu, Zejun Ma, Taylor Berg-Kirkpatrick, Shlomo Dubnov
Our approach uses a single model for source separation of multiple sound types, and relies solely on weakly-labeled data for training.
Ranked #1 on Audio Source Separation on AudioSet
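The single-model, multiple-source idea can be sketched as query-conditioned masking: one separator keeps whichever mixture content matches a query embedding. The 2-D "embeddings" below are toy stand-ins for learned ones, not the paper's actual architecture.

```python
def cosine(a, b):
    """Cosine similarity between two vectors."""
    num = sum(x * y for x, y in zip(a, b))
    den = (sum(x * x for x in a) ** 0.5) * (sum(y * y for y in b) ** 0.5)
    return num / den

def separate(mixture_frames, query):
    """Soft-mask mixture frames by their similarity to the query, so
    a single model can separate whichever source the query selects."""
    masks = [max(0.0, cosine(f, query)) for f in mixture_frames]
    return [[m * x for x in f] for m, f in zip(masks, mixture_frames)]

frames = [[1.0, 0.0], [0.0, 1.0]]
out = separate(frames, query=[1.0, 0.0])  # keeps only the first frame
```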
no code implementations • AAAI 2021 • Ke Chen, Xingjian Du, Bilei Zhu, Zejun Ma, Taylor Berg-Kirkpatrick, Shlomo Dubnov
no code implementations • 24 Nov 2021 • Shlomo Dubnov, Kevin Huang, Cheng-i Wang
The framework is based on a Music Information Dynamics model, a Variable Markov Oracle (VMO), and is extended with variational representation learning of audio.
no code implementations • 13 Apr 2021 • Eunjeong Koh, Shlomo Dubnov
We implement several multi-class classifiers with deep audio embeddings to predict emotion semantics in music.
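The pipeline of classifying over fixed deep audio embeddings can be sketched with a nearest-centroid classifier; the 2-D points below are hypothetical stand-ins for real embeddings (which would have hundreds of dimensions) and the labels are illustrative.

```python
# Toy 2-D "embeddings" per emotion label; real deep audio embeddings
# from a pretrained network would replace these.
train = {
    "happy": [(0.9, 0.8), (1.0, 0.9)],
    "sad":   [(0.1, 0.2), (0.0, 0.1)],
}

def centroid(points):
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def predict(embedding):
    """Nearest-centroid emotion prediction in embedding space."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    cents = {label: centroid(pts) for label, pts in train.items()}
    return min(cents, key=lambda lbl: dist2(embedding, cents[lbl]))

label = predict((0.8, 0.7))
```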
no code implementations • 17 Mar 2021 • Vaikkunth Mugunthan, Vignesh Gokul, Lalana Kagal, Shlomo Dubnov
Our approach generates metadata at the aggregator using the models received from clients and retrains the federated model to achieve bias-free results for image synthesis.
1 code implementation • 4 Mar 2021 • Shehzeen Hussain, Paarth Neekhara, Shlomo Dubnov, Julian McAuley, Farinaz Koushanfar
There has been a recent surge in adversarial attacks on deep learning based automatic speech recognition (ASR) systems.
Automatic Speech Recognition (ASR) +1
1 code implementation • 15 Feb 2021 • Paarth Neekhara, Shehzeen Hussain, Jinglong Du, Shlomo Dubnov, Farinaz Koushanfar, Julian McAuley
Recent works on adversarial reprogramming have shown that it is possible to repurpose neural networks for alternate tasks without modifying the network architecture or parameters.
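The structure of adversarial reprogramming can be shown with a toy: the pretrained classifier stays frozen, and the attack consists only of an input transformation plus a mapping from the original labels to the new task's labels. The perturbation here is hand-picked for illustration; in practice it is optimized by gradient descent.

```python
# Frozen "pretrained classifier": classifies a 2-vector by comparing
# its coordinates. Its parameters are never modified.
def frozen_classifier(x):
    return 0 if x[0] >= x[1] else 1

# Adversarial program: an additive input perturbation (hand-picked
# here; learned in practice) plus a label map onto the new task.
program = (2.0, 0.0)
label_map = {0: "even", 1: "odd"}

def reprogrammed(x):
    """Repurpose the frozen classifier for a new task by transforming
    its input and remapping its output labels."""
    shifted = (x[0] + program[0], x[1] + program[1])
    return label_map[frozen_classifier(shifted)]
```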
no code implementations • 1 Feb 2021 • Shlomo Dubnov
In this paper we introduce a novel framework, Deep Musical Information Dynamics, which combines two parallel streams: a low-rate latent representation stream assumed to capture the dynamics of a thought process, contrasted with a higher-rate information dynamics stream derived from the musical data itself.
no code implementations • 30 Jan 2021 • Paarth Neekhara, Shehzeen Hussain, Shlomo Dubnov, Farinaz Koushanfar, Julian McAuley
In this work, we propose a controllable voice cloning method that allows fine-grained control over various style aspects of the synthesized speech for an unseen speaker.
no code implementations • 22 Oct 2020 • Vaikkunth Mugunthan, Vignesh Gokul, Lalana Kagal, Shlomo Dubnov
The Information Maximizing GAN (InfoGAN) is a variant of the standard GAN that introduces feature-control variables learned automatically by the framework, providing greater control over the different kinds of images produced.
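The InfoGAN input construction is simple to sketch: the generator receives random noise z concatenated with a structured latent code c, and a mutual-information objective (not shown) pushes the generator to use c as an interpretable control.

```python
import random

def sample_generator_input(noise_dim=4, code_dim=2, seed=None):
    """InfoGAN-style generator input: Gaussian noise z concatenated
    with a structured latent code c that serves as the learned
    feature-control variable."""
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
    c = [rng.uniform(-1.0, 1.0) for _ in range(code_dim)]
    return z + c, c

inp, code = sample_generator_input(seed=0)
```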
1 code implementation • 4 Aug 2020 • Ke Chen, Cheng-i Wang, Taylor Berg-Kirkpatrick, Shlomo Dubnov
Drawing an analogy with automatic image completion systems, we propose Music SketchNet, a neural network framework that allows users to specify partial musical ideas guiding automatic music generation.
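The sketch-completion interface can be illustrated independently of the network: the user pins some measures, `None` marks the gaps, and a model fills them in. The interpolating `fill_measure` below is a trivial placeholder for SketchNet's neural decoder.

```python
def fill_measure(prev, nxt):
    # Placeholder "model": interpolate between neighboring user
    # pitches; a trained neural decoder would replace this.
    if prev is None:
        return nxt
    if nxt is None:
        return prev
    return [(a + b) // 2 for a, b in zip(prev, nxt)]

def complete(sketch):
    """sketch: list of measures; None marks measures to be filled."""
    out = list(sketch)
    for i, m in enumerate(out):
        if m is None:
            prev = next((out[j] for j in range(i - 1, -1, -1) if out[j]), None)
            nxt = next((out[j] for j in range(i + 1, len(out)) if out[j]), None)
            out[i] = fill_measure(prev, nxt)
    return out

result = complete([[60, 62], None, [64, 66]])
```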
1 code implementation • 5 Feb 2020 • Ke Chen, Gus Xia, Shlomo Dubnov
Automatic music generation is an interdisciplinary research topic that combines computational creativity and semantic analysis of music to create automatic machine improvisations.
1 code implementation • 21 Jun 2019 • Shlomo Dubnov
In this paper we explore techniques for generating new music using a Variational Autoencoder (VAE) neural network that was trained on a corpus of a specific style.
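The basic VAE generation loop is: sample a latent vector from the standard-normal prior and decode it. The fixed linear `decode` rule below is a hypothetical stand-in for a trained decoder network.

```python
import random

# Toy decoder: maps a 2-D latent to a 4-note pitch sequence by a
# fixed linear rule; a trained VAE decoder would replace this.
def decode(z):
    base = 60  # MIDI middle C
    return [round(base + z[0] * 2 + z[1] * step) for step in range(4)]

def generate(seed=None):
    """Sample a latent from the standard-normal prior and decode it:
    the basic VAE generation loop."""
    rng = random.Random(seed)
    z = (rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
    return decode(z)

melody = generate(seed=1)
```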
no code implementations • 9 May 2019 • Paarth Neekhara, Shehzeen Hussain, Prakhar Pandey, Shlomo Dubnov, Julian McAuley, Farinaz Koushanfar
In this work, we demonstrate the existence of universal adversarial audio perturbations that cause mis-transcription of audio signals by automatic speech recognition (ASR) systems.
Automatic Speech Recognition (ASR) +1
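What makes such a perturbation "universal" is that a single additive delta changes the output on every input in a set. A greedy toy sketch of that search, with a thresholding "recognizer" standing in for an ASR system (real attacks optimize the delta with gradients under a norm bound):

```python
# Toy "recognizer": transcribes a 1-D signal sample by thresholding.
def transcribe(signal):
    return ["hi" if s > 0.5 else "lo" for s in signal]

def find_universal_perturbation(signals, step=0.2, max_iters=20):
    """Greedily grow a single additive delta until it changes the
    transcription of every signal in the set."""
    delta = 0.0
    originals = [transcribe(s) for s in signals]
    for _ in range(max_iters):
        perturbed = [transcribe([x + delta for x in s]) for s in signals]
        if all(p != o for p, o in zip(perturbed, originals)):
            return delta
        delta += step
    return delta

signals = [[0.1, 0.2], [0.3, 0.4]]
delta = find_universal_perturbation(signals)
```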
1 code implementation • 16 Apr 2019 • Paarth Neekhara, Chris Donahue, Miller Puckette, Shlomo Dubnov, Julian McAuley
Recent approaches in text-to-speech (TTS) synthesis employ neural network strategies to vocode perceptually-informed spectrogram representations directly into listenable waveforms.
1 code implementation • 20 Nov 2018 • Ke Chen, Weilin Zhang, Shlomo Dubnov, Gus Xia, Wei Li
With recent breakthroughs in artificial neural networks, deep generative models have become one of the leading techniques for computational creativity.
no code implementations • 7 Oct 2018 • Eunjeong Stella Koh, Shlomo Dubnov, Dustin Wright
Our results suggest that the proposed model has a better statistical resemblance to the musical structure of the training data, which improves the creation of new sequences of music in the style of the originals.
1 code implementation • IJCNLP 2019 • Paarth Neekhara, Shehzeen Hussain, Shlomo Dubnov, Farinaz Koushanfar
Adversarial Reprogramming has demonstrated success in utilizing pre-trained neural network classifiers for alternative classification tasks without modification to the original network.