Search Results for author: Simon Dixon

Found 30 papers, 12 papers with code

High Resolution Guitar Transcription via Domain Adaptation

no code implementations • 23 Feb 2024 • Xavier Riley, Drew Edwards, Simon Dixon

Focusing on the guitar, we refine this approach to training on score data using a dataset of commercially available score-audio pairs.

Domain Adaptation, Music Transcription

MusicMagus: Zero-Shot Text-to-Music Editing via Diffusion Models

no code implementations • 9 Feb 2024 • Yixiao Zhang, Yukara Ikemiya, Gus Xia, Naoki Murata, Marco Martínez, Wei-Hsiang Liao, Yuki Mitsufuji, Simon Dixon

This paper introduces a novel approach to the editing of music generated by such models, enabling the modification of specific attributes, such as genre, mood and instrument, while maintaining other aspects unchanged.

Music Generation, Text-to-Music Generation

A Data-Driven Analysis of Robust Automatic Piano Transcription

no code implementations • 2 Feb 2024 • Drew Edwards, Simon Dixon, Emmanouil Benetos, Akira Maezawa, Yuta Kusaka

Algorithms for automatic piano transcription have improved dramatically in recent years due to new datasets and modeling techniques.

Data Augmentation

Symbolic Music Representations for Classification Tasks: A Systematic Evaluation

1 code implementation • 5 Sep 2023 • Huan Zhang, Emmanouil Karystinaios, Simon Dixon, Gerhard Widmer, Carlos Eduardo Cancino-Chacón

Music Information Retrieval (MIR) has seen a recent surge in deep learning-based approaches, which often involve encoding symbolic music (i.e., music represented in terms of discrete note events) in an image-like or language-like fashion.

Classification, Information Retrieval +3

A Comparative Analysis Of Latent Regressor Losses For Singing Voice Conversion

no code implementations • 27 Feb 2023 • Brendan O'Connor, Simon Dixon

We propose an alternative loss component in a loss function that is otherwise well-established among VC tasks, which has been shown to improve our model's SVC performance.

Contrastive Learning, Disentanglement +1

Interpreting Song Lyrics with an Audio-Informed Pre-trained Language Model

1 code implementation • 24 Aug 2022 • Yixiao Zhang, Junyan Jiang, Gus Xia, Simon Dixon

Lyric interpretations can help people understand songs and their lyrics quickly, and can also make it easier to manage, retrieve and discover songs efficiently from the growing mass of music archives.

Language Modelling, Retrieval

Towards Robust Unsupervised Disentanglement of Sequential Data -- A Case Study Using Music Audio

1 code implementation • 12 May 2022 • Yin-Jyun Luo, Sebastian Ewert, Simon Dixon

In this paper, we show that the vanilla DSAE suffers from being sensitive to the choice of model architecture and capacity of the dynamic latent variables, and is prone to collapse the static latent variable.

Data Augmentation, Disentanglement +1

A Convolutional-Attentional Neural Framework for Structure-Aware Performance-Score Synchronization

no code implementations • 19 Apr 2022 • Ruchit Agrawal, Daniel Wolff, Simon Dixon

Our method is also robust to structural differences between the performance and score sequences, which is a common limitation of standard alignment approaches.

Time Series, Time Series Analysis

Pitch-Informed Instrument Assignment Using a Deep Convolutional Network with Multiple Kernel Shapes

no code implementations • 28 Jul 2021 • Carlos Lordelo, Emmanouil Benetos, Simon Dixon, Sven Ahlbäck

We also include ablation studies investigating the effects of the use of multiple kernel shapes and comparing different input representations for the audio and the note-related information.

Computational Pronunciation Analysis in Sung Utterances

1 code implementation • 21 Jun 2021 • Emir Demirel, Sven Ahlback, Simon Dixon

Recent automatic lyrics transcription (ALT) approaches focus on building stronger acoustic models or in-domain language models, while the pronunciation aspect is seldom touched upon.

Automatic Lyrics Transcription, Sentence

Structure-Aware Audio-to-Score Alignment using Progressively Dilated Convolutional Neural Networks

no code implementations • 31 Jan 2021 • Ruchit Agrawal, Daniel Wolff, Simon Dixon

The identification of structural differences between a music performance and the score is a challenging yet integral step of audio-to-score alignment, an important subtask of music information retrieval.

Information Retrieval, Music Information Retrieval +1

Learning Frame Similarity using Siamese networks for Audio-to-Score Alignment

no code implementations • 15 Nov 2020 • Ruchit Agrawal, Simon Dixon

Audio-to-score alignment aims at generating an accurate mapping between a performance audio and the score of a given piece.

Dynamic Time Warping

A Hybrid Approach to Audio-to-Score Alignment

no code implementations • 28 Jul 2020 • Ruchit Agrawal, Simon Dixon

Audio-to-score alignment aims at generating an accurate mapping between a performance audio and the score of a given piece.

Dynamic Time Warping

Automatic Lyrics Transcription using Dilated Convolutional Neural Networks with Self-Attention

2 code implementations • 13 Jul 2020 • Emir Demirel, Sven Ahlback, Simon Dixon

Speech recognition is a well-developed research field, and current state-of-the-art systems are used in many applications in the software industry, yet as of today no comparably robust system exists for the recognition of words and sentences from singing voice.

Automatic Lyrics Transcription, speech-recognition

Reliable Local Explanations for Machine Listening

1 code implementation • 15 May 2020 • Saumitra Mishra, Emmanouil Benetos, Bob L. Sturm, Simon Dixon

One way to analyse the behaviour of machine learning models is through local explanations that highlight input features that maximally influence model predictions.

Seq-U-Net: A One-Dimensional Causal U-Net for Efficient Sequence Modelling

1 code implementation • 14 Nov 2019 • Daniel Stoller, Mi Tian, Sebastian Ewert, Simon Dixon

In comparison to TCN and Wavenet, our network consistently saves memory and computation time, with speed-ups for training and inference of over 4x in the audio generation experiment in particular, while achieving a comparable performance in all tasks.

Audio Generation, Causal Language Modeling +2

Training Generative Adversarial Networks from Incomplete Observations using Factorised Discriminators

1 code implementation • ICLR 2020 • Daniel Stoller, Sebastian Ewert, Simon Dixon

We apply our method to image generation, image segmentation and audio source separation, and obtain improved performance over a standard GAN when additional incomplete training examples are available.

Audio Source Separation, Image Generation +3

GAN-based Generation and Automatic Selection of Explanations for Neural Networks

no code implementations • 21 Apr 2019 • Saumitra Mishra, Daniel Stoller, Emmanouil Benetos, Bob L. Sturm, Simon Dixon

However, this requires a careful selection of hyper-parameters to generate interpretable examples for each neuron of interest, and current methods rely on a manual, qualitative evaluation of each setting, which is prohibitively slow.

Wave-U-Net: A Multi-Scale Neural Network for End-to-End Audio Source Separation

9 code implementations • 8 Jun 2018 • Daniel Stoller, Sebastian Ewert, Simon Dixon

Models for audio source separation usually operate on the magnitude spectrum, which ignores phase information and makes separation performance dependent on hyper-parameters for the spectral front-end.

Audio Source Separation, Music Source Separation

Adversarial Semi-Supervised Audio Source Separation applied to Singing Voice Extraction

3 code implementations • 31 Oct 2017 • Daniel Stoller, Sebastian Ewert, Simon Dixon

Based on this idea, we drive the separator towards outputs deemed as realistic by discriminator networks that are trained to tell apart real from separator samples.

Audio Source Separation, Data Augmentation +1

Note Value Recognition for Piano Transcription Using Markov Random Fields

no code implementations • 23 Mar 2017 • Eita Nakamura, Kazuyoshi Yoshii, Simon Dixon

This paper presents a statistical method for use in music transcription that can estimate score times of note onsets and offsets from polyphonic MIDI performance signals.

Music Transcription

An End-to-End Neural Network for Polyphonic Piano Music Transcription

1 code implementation • 7 Aug 2015 • Siddharth Sigtia, Emmanouil Benetos, Simon Dixon

We compare performance of the neural network based acoustic models with two popular unsupervised acoustic models.

Language Modelling, Music Transcription +2

A Hybrid Recurrent Neural Network For Music Transcription

no code implementations • 6 Nov 2014 • Siddharth Sigtia, Emmanouil Benetos, Nicolas Boulanger-Lewandowski, Tillman Weyde, Artur S. d'Avila Garcez, Simon Dixon

We investigate the problem of incorporating higher-level symbolic score-like information into Automatic Music Transcription (AMT) systems to improve their performance.

Music Transcription

Identifying Cover Songs Using Information-Theoretic Measures of Similarity

no code implementations • 9 Jul 2014 • Peter Foster, Simon Dixon, Anssi Klapuri

This paper investigates methods for quantifying similarity between audio signals, specifically for the task of cover song detection.

Cover song identification, Time Series +1

Sequential Complexity as a Descriptor for Musical Similarity

no code implementations • 27 Feb 2014 • Peter Foster, Matthias Mauch, Simon Dixon

To verify that our descriptors capture musically relevant information, we incorporate our descriptors into similarity rating prediction and song year prediction tasks.
