Search Results for author: Shlomo Dubnov

Found 28 papers, 14 papers with code

Music Enhancement with Deep Filters: A Technical Report for The ICASSP 2024 Cadenza Challenge

no code implementations • 17 Apr 2024 • Keren Shao, Ke Chen, Shlomo Dubnov

In this challenge, we disentangle the deep filters from the original DeepfilterNet and incorporate them into our Spec-UNet-based network to further improve a hybrid Demucs (hdemucs) based remixing pipeline.

Evaluating Co-Creativity using Total Information Flow

no code implementations • 9 Feb 2024 • Vignesh Gokul, Chris Francis, Shlomo Dubnov

We propose a method to compute the information flow using pre-trained generative models as entropy estimators.

PosCUDA: Position based Convolution for Unlearnable Audio Datasets

no code implementations • 4 Jan 2024 • Vignesh Gokul, Shlomo Dubnov

Recent works such as CUDA propose solutions to this problem by adding class-wise blurs to make datasets unlearnable, i.e., a model can never use the acquired dataset for learning.

Position

Equipping Pretrained Unconditional Music Transformers with Instrument and Genre Controls

no code implementations • 21 Nov 2023 • Weihan Xu, Julian McAuley, Shlomo Dubnov, Hao-Wen Dong

We then propose a simple technique to equip this pretrained unconditional music transformer model with instrument and genre controls by finetuning the model with additional control tokens.

Music Generation

SelfVC: Voice Conversion With Iterative Refinement using Self Transformations

no code implementations • 14 Oct 2023 • Paarth Neekhara, Shehzeen Hussain, Rafael Valle, Boris Ginsburg, Rishabh Ranjan, Shlomo Dubnov, Farinaz Koushanfar, Julian McAuley

In this work, instead of explicitly disentangling attributes with loss terms, we present a framework to train a controllable voice conversion model on entangled speech representations derived from self-supervised learning and speaker verification models.

Self-Supervised Learning • Speaker Verification • +2

Towards Improving Harmonic Sensitivity and Prediction Stability for Singing Melody Extraction

1 code implementation • 4 Aug 2023 • Keren Shao, Ke Chen, Taylor Berg-Kirkpatrick, Shlomo Dubnov

In deep learning research, many melody extraction models rely on redesigning neural network architectures to improve performance.

Melody Extraction

Multitrack Music Transformer

2 code implementations • 14 Jul 2022 • Hao-Wen Dong, Ke Chen, Shlomo Dubnov, Julian McAuley, Taylor Berg-Kirkpatrick

Existing approaches for generating multitrack music with transformer models have been limited in the number of instruments, the length of the music segments, and inference speed.

Music Generation

TONet: Tone-Octave Network for Singing Melody Extraction from Polyphonic Music

1 code implementation • 2 Feb 2022 • Ke Chen, Shuai Yu, Cheng-i Wang, Wei Li, Taylor Berg-Kirkpatrick, Shlomo Dubnov

In this paper, we propose TONet, a plug-and-play model that improves both tone and octave perceptions by leveraging a novel input representation and a novel network architecture.

Information Retrieval • Melody Extraction • +2

HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection

1 code implementation • 2 Feb 2022 • Ke Chen, Xingjian Du, Bilei Zhu, Zejun Ma, Taylor Berg-Kirkpatrick, Shlomo Dubnov

To combat these problems, we introduce HTS-AT: an audio transformer with a hierarchical structure to reduce the model size and training time.

Audio Classification • Event Detection • +3

Towards Cross-Cultural Analysis using Music Information Dynamics

no code implementations • 24 Nov 2021 • Shlomo Dubnov, Kevin Huang, Cheng-i Wang

The framework is based on a Music Information Dynamics model, a Variable Markov Oracle (VMO), and is extended with variational representation learning of audio.

Representation Learning

Comparison and Analysis of Deep Audio Embeddings for Music Emotion Recognition

no code implementations • 13 Apr 2021 • Eunjeong Koh, Shlomo Dubnov

We implement several multi-class classifiers with deep audio embeddings to predict emotion semantics in music.

Emotion Recognition • Feature Engineering • +1

Bias-Free FedGAN: A Federated Approach to Generate Bias-Free Datasets

no code implementations • 17 Mar 2021 • Vaikkunth Mugunthan, Vignesh Gokul, Lalana Kagal, Shlomo Dubnov

Our approach generates metadata at the aggregator using the models received from clients and retrains the federated model to achieve bias-free results for image synthesis.

Generative Adversarial Network • Image Generation

Cross-modal Adversarial Reprogramming

1 code implementation • 15 Feb 2021 • Paarth Neekhara, Shehzeen Hussain, Jinglong Du, Shlomo Dubnov, Farinaz Koushanfar, Julian McAuley

Recent works on adversarial reprogramming have shown that it is possible to repurpose neural networks for alternate tasks without modifying the network architecture or parameters.

Classification • General Classification • +1

Deep Music Information Dynamics

no code implementations • 1 Feb 2021 • Shlomo Dubnov

In this paper we introduce a novel framework that we call Deep Musical Information Dynamics, which combines two parallel streams: a low-rate latent representation stream that is assumed to capture the dynamics of a thought process, contrasted with a higher-rate information dynamics derived from the musical data itself.

Expressive Neural Voice Cloning

no code implementations • 30 Jan 2021 • Paarth Neekhara, Shehzeen Hussain, Shlomo Dubnov, Farinaz Koushanfar, Julian McAuley

In this work, we propose a controllable voice cloning method that allows fine-grained control over various style aspects of the synthesized speech for an unseen speaker.

Speech Synthesis • Style Transfer • +1

DPD-InfoGAN: Differentially Private Distributed InfoGAN

no code implementations • 22 Oct 2020 • Vaikkunth Mugunthan, Vignesh Gokul, Lalana Kagal, Shlomo Dubnov

The Information Maximizing GAN (InfoGAN) is a variant of the default GAN that introduces feature-control variables that are automatically learned by the framework, hence providing greater control over the different kinds of images produced.

Privacy Preserving

Music SketchNet: Controllable Music Generation via Factorized Representations of Pitch and Rhythm

1 code implementation • 4 Aug 2020 • Ke Chen, Cheng-i Wang, Taylor Berg-Kirkpatrick, Shlomo Dubnov

Drawing an analogy with automatic image completion systems, we propose Music SketchNet, a neural network framework that allows users to specify partial musical ideas guiding automatic music generation.

Music Generation

Continuous Melody Generation via Disentangled Short-Term Representations and Structural Conditions

1 code implementation • 5 Feb 2020 • Ke Chen, Gus Xia, Shlomo Dubnov

Automatic music generation is an interdisciplinary research topic that combines computational creativity and semantic analysis of music to create automatic machine improvisations.

Disentanglement • Music Generation

Query-based Deep Improvisation

1 code implementation • 21 Jun 2019 • Shlomo Dubnov

In this paper we explore techniques for generating new music using a Variational Autoencoder (VAE) neural network that was trained on a corpus of a specific style.

Universal Adversarial Perturbations for Speech Recognition Systems

no code implementations • 9 May 2019 • Paarth Neekhara, Shehzeen Hussain, Prakhar Pandey, Shlomo Dubnov, Julian McAuley, Farinaz Koushanfar

In this work, we demonstrate the existence of universal adversarial audio perturbations that cause mis-transcription of audio signals by automatic speech recognition (ASR) systems.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) • +1

Expediting TTS Synthesis with Adversarial Vocoding

1 code implementation • 16 Apr 2019 • Paarth Neekhara, Chris Donahue, Miller Puckette, Shlomo Dubnov, Julian McAuley

Recent approaches in text-to-speech (TTS) synthesis employ neural network strategies to vocode perceptually-informed spectrogram representations directly into listenable waveforms.

The Effect of Explicit Structure Encoding of Deep Neural Networks for Symbolic Music Generation

1 code implementation • 20 Nov 2018 • Ke Chen, Weilin Zhang, Shlomo Dubnov, Gus Xia, Wei Li

With recent breakthroughs in artificial neural networks, deep generative models have become one of the leading techniques for computational creativity.

Music Generation

Rethinking Recurrent Latent Variable Model for Music Composition

no code implementations • 7 Oct 2018 • Eunjeong Stella Koh, Shlomo Dubnov, Dustin Wright

Our results suggest that the proposed model has a better statistical resemblance to the musical structure of the training data, which improves the creation of new sequences of music in the style of the originals.

Adversarial Reprogramming of Text Classification Neural Networks

1 code implementation • IJCNLP 2019 • Paarth Neekhara, Shehzeen Hussain, Shlomo Dubnov, Farinaz Koushanfar

Adversarial Reprogramming has demonstrated success in utilizing pre-trained neural network classifiers for alternative classification tasks without modification to the original network.

General Classification • text-classification • +1
