no code implementations • 13 Jan 2025 • Mathias Rose Bjare, Giorgia Cantisani, Stefan Lattner, Gerhard Widmer
In modeling musical surprisal expectancy with computational methods, it has been proposed to use the information content (IC) of one-step predictions from an autoregressive model as a proxy for surprisal in symbolic music.
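As a minimal sketch of the idea in this excerpt, the information content of an observed event can be computed as the negative log-probability that a one-step predictive distribution assigns to it. The function name `information_content`, the toy distributions, and the choice of base-2 logarithm (bits) are illustrative assumptions, not taken from the paper.

```python
import math

def information_content(predictions, observed):
    """Return the IC (in bits) of each observed event.

    predictions: list of dicts mapping event -> predicted probability
                 (hypothetical one-step outputs of an autoregressive model).
    observed:    list of the events that actually occurred.
    """
    return [-math.log2(dist[event]) for dist, event in zip(predictions, observed)]

# Toy one-step predictions over three pitch classes (made-up numbers).
predictions = [
    {"C": 0.5, "D": 0.25, "E": 0.25},
    {"C": 0.125, "D": 0.75, "E": 0.125},
]
melody = ["C", "E"]

# A less probable event yields a higher IC, i.e. more surprisal.
ic = information_content(predictions, melody)
```

In this toy case the second note was given a lower probability than the first, so it receives the higher IC.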
1 code implementation • 14 Sep 2024 • Florian Schmid, Tobias Morocutti, Francesco Foscarin, Jan Schlüter, Paul Primus, Gerhard Widmer
For five transformers, we obtain a substantial performance improvement over previously available checkpoints both on AudioSet frame-level predictions and on frame-level sound event detection downstream tasks, confirming our pipeline's effectiveness.
1 code implementation • 21 Aug 2024 • Jonathan Greif, Florian Schmid, Paul Primus, Gerhard Widmer
Query-by-Vocal Imitation (QBV) is about searching audio files within databases using vocal imitations created by the user's voice.
1 code implementation • 21 Aug 2024 • Paul Primus, Florian Schmid, Gerhard Widmer
We evaluate our method on the ClothoV2 and the AudioCaps benchmark and show that it improves retrieval performance, even in a restricting self-distillation setting where a single model generates and then learns from the estimated correspondences.
Ranked #1 on Text to Audio Retrieval on Clotho (using extra training data)
1 code implementation • 12 Aug 2024 • Mathias Rose Bjare, Stefan Lattner, Gerhard Widmer
This enables the comparison of surprisal across different musical content even if the musical events occur in irregular time intervals.
1 code implementation • 8 Aug 2024 • Lukáš Samuel Marták, Patricia Hu, Gerhard Widmer
Automatic Music Transcription (AMT) is the task of recognizing notes in audio recordings of music.
2 code implementations • 8 Aug 2024 • Silvan David Peter, Gerhard Widmer
Note alignment refers to the task of matching individual notes of two versions of the same symbolically encoded piece.
1 code implementation • 17 Jul 2024 • Florian Schmid, Paul Primus, Tobias Morocutti, Jonathan Greif, Gerhard Widmer
We fine-tune three large Audio Spectrogram Transformers, PaSST, BEATs, and ATST, on the joint DESED and MAESTRO datasets in a two-stage training procedure.
1 code implementation • 17 Jul 2024 • Emmanouil Karystinaios, Gerhard Widmer
Addressing this gap, we present GraphMuse, a graph processing framework and library that facilitates efficient music graph processing and GNN training for symbolic music tasks.
1 code implementation • 17 Jul 2024 • Florian Schmid, Paul Primus, Tobias Morocutti, Jonathan Greif, Gerhard Widmer
A single model and an ensemble, both based on our proposed training procedure, ranked first in Task 4 of the DCASE Challenge 2024.
1 code implementation • 15 Jul 2024 • Francesco Foscarin, Emmanouil Karystinaios, Eita Nakamura, Gerhard Widmer
To aid the qualitative analysis of our results, we support the export in symbolic music formats and provide a direct visualization of our output graph over the musical score.
no code implementations • 22 Jun 2024 • Paul Primus, Gerhard Widmer
Matching raw audio signals with textual descriptions requires understanding the audio's content and the description's semantics and then drawing connections between the two modalities.
1 code implementation • 21 Jun 2024 • Huan Zhang, Shreyan Chowdhury, Carlos Eduardo Cancino-Chacón, Jinhua Liang, Simon Dixon, Gerhard Widmer
The perceptual-feature-conditioned generation and transferring capabilities of DExter are verified by a proxy model predicting perceptual characteristics of differently steered performances.
1 code implementation • 16 May 2024 • Florian Schmid, Paul Primus, Toni Heittola, Annamaria Mesaros, Irene Martín-Morató, Khaled Koutini, Gerhard Widmer
This article describes the Data-Efficient Low-Complexity Acoustic Scene Classification Task in the DCASE 2024 Challenge and the corresponding baseline system.
1 code implementation • 15 May 2024 • Emmanouil Karystinaios, Francesco Foscarin, Gerhard Widmer
We propose a new graph convolutional block, called MusGConv, specifically designed for the efficient processing of musical score data and motivated by general perceptual principles.
no code implementations • 26 Jan 2024 • Shreyan Chowdhury, Gerhard Widmer
On the text side, we use emotion-enriched word embeddings (EWE) and on the audio side, we extract mid-level perceptual features instead of generic audio embeddings.
1 code implementation • 31 Dec 2023 • Silvan David Peter, Shreyan Chowdhury, Carlos Eduardo Cancino-Chacón, Gerhard Widmer
Using a music research dataset of free text performance characterizations and a follow-up study sorting the annotations into clusters, we derive a ground truth for a domain-specific semantic similarity structure.
1 code implementation • 24 Oct 2023 • Florian Schmid, Khaled Koutini, Gerhard Widmer
Audio Spectrogram Transformers are excellent at exploiting large datasets, creating powerful pre-trained models that surpass CNNs when fine-tuned on downstream tasks.
Ranked #1 on Instrument Recognition on OpenMIC-2018 (using extra training data)
no code implementations • 21 Sep 2023 • Luis Carvalho, Gerhard Widmer
A range of applications of multi-modal music information retrieval is centred around the problem of connecting large collections of sheet music (images) to corresponding audio recordings, that is, identifying pairs of audio and score excerpts that refer to the same musical content.
no code implementations • 21 Sep 2023 • Luis Carvalho, Gerhard Widmer
Many applications of cross-modal music retrieval are related to connecting sheet music images to audio recordings.
no code implementations • 21 Sep 2023 • Luis Carvalho, Tobias Washüttl, Gerhard Widmer
We then conclude by arguing for the potential of self-supervised contrastive learning for alleviating the annotated data scarcity in multi-modal music retrieval models.
1 code implementation • 5 Sep 2023 • Huan Zhang, Emmanouil Karystinaios, Simon Dixon, Gerhard Widmer, Carlos Eduardo Cancino-Chacón
Music Information Retrieval (MIR) has seen a recent surge in deep learning-based approaches, which often involve encoding symbolic music (i.e., music represented in terms of discrete note events) in an image-like or language-like fashion.
no code implementations • 18 Aug 2023 • Mathias Rose Bjare, Stefan Lattner, Gerhard Widmer
To accomplish this, we train a high-capacity transformer model on a vast collection of highly-structured Irish folk melodies and analyze the musical qualities of the samples generated using distribution truncation sampling techniques.
1 code implementation • 8 Aug 2023 • Paul Primus, Khaled Koutini, Gerhard Widmer
This work presents a text-to-audio-retrieval system based on pre-trained text and spectrogram transformers.
Ranked #4 on Text to Audio Retrieval on Clotho (using extra training data)
1 code implementation • 7 Jul 2023 • Emmanouil Karystinaios, Gerhard Widmer
Roman Numeral analysis is the important task of identifying chords and their functional context in pieces of tonal music.
1 code implementation • 29 Jun 2023 • Francesco Foscarin, Daniel Harasim, Gerhard Widmer
This paper describes a data-driven framework to parse musical sequences into dependency trees, which are hierarchical structures used in music cognition research and music analysis.
no code implementations • 20 Jun 2023 • Paul Primus, Gerhard Widmer
Based on this observation, we conjecture that suppressing recording device characteristics in the input spectrogram is the most effective.
1 code implementation • 13 Jun 2023 • Shahed Masoudian, Khaled Koutini, Markus Schedl, Gerhard Widmer, Navid Rekabsaz
In the Acoustic Scene Classification task (ASC), domain shift is mainly caused by different recording devices.
1 code implementation • 16 May 2023 • Matthias Plasser, Silvan Peter, Gerhard Widmer
Denoising Diffusion Probabilistic Models (DDPMs) have made great strides in generating high-quality samples in both discrete and continuous domains.
1 code implementation • 12 May 2023 • Tobias Morocutti, Florian Schmid, Khaled Koutini, Gerhard Widmer
However, we also show that DIR augmentation and Freq-MixStyle are complementary, achieving a new state-of-the-art performance on signals recorded by devices unseen during training.
1 code implementation • 28 Apr 2023 • Emmanouil Karystinaios, Francesco Foscarin, Gerhard Widmer
Our approach builds a graph from a musical piece, by creating one node for every note, and separates the melodic trajectories by predicting a link between two notes if they are consecutive in the same voice/stream.
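The link structure described in this excerpt can be illustrated with a small sketch: one node per note, and an edge between two notes when they are consecutive in the same voice. The code below only derives those target links from known voice labels; it does not implement the paper's predictive model, and the note tuple layout `(onset, pitch, voice)` and the function name `voice_links` are assumptions made for illustration.

```python
from collections import defaultdict

def voice_links(notes):
    """Link each note to the next note in the same voice.

    notes: list of (onset, pitch, voice) tuples; the list index is the node id.
    Returns a set of (source_index, target_index) pairs.
    """
    by_voice = defaultdict(list)
    for idx, (onset, _pitch, voice) in enumerate(notes):
        by_voice[voice].append((onset, idx))

    links = set()
    for members in by_voice.values():
        members.sort()  # order the notes of one voice by onset time
        for (_, a), (_, b) in zip(members, members[1:]):
            links.add((a, b))
    return links

# Toy two-voice excerpt (made-up data): voice 0 rises, voice 1 falls.
notes = [(0.0, 60, 0), (0.0, 72, 1), (1.0, 62, 0), (1.0, 71, 1), (2.0, 64, 0)]
links = voice_links(notes)
```

Following the links from any note recovers its melodic trajectory, which is the structure a link-prediction model would be trained to reproduce.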
no code implementations • 28 Nov 2022 • Lukáš Samuel Marták, Rainer Kelz, Gerhard Widmer
We introduce a novel way to incorporate prior information into (semi-) supervised non-negative matrix factorization, which we call differentiable dictionary search.
no code implementations • 28 Nov 2022 • Lukáš Samuel Marták, Rainer Kelz, Gerhard Widmer
This paper describes several improvements to a new method for signal decomposition that we recently formulated under the name of Differentiable Dictionary Search (DDS).
1 code implementation • 25 Nov 2022 • Khaled Koutini, Shahed Masoudian, Florian Schmid, Hamid Eghbal-zadeh, Jan Schlüter, Gerhard Widmer
Furthermore, we will show that transformers trained on AudioSet can be extremely effective representation extractors for a wide range of downstream tasks.
2 code implementations • 9 Nov 2022 • Florian Schmid, Khaled Koutini, Gerhard Widmer
We provide models of different complexity levels, scaling from low-complexity models up to a new state-of-the-art performance of .483 mAP on AudioSet.
Ranked #2 on Audio Tagging on AudioSet (using extra training data)
no code implementations • 31 Aug 2022 • Emmanouil Karystinaios, Gerhard Widmer
In this work, we present a graph representation of symbolic scores as an intermediate means to solve the cadence detection task.
no code implementations • 26 Aug 2022 • Francesco Foscarin, Katharina Hoedt, Verena Praher, Arthur Flexer, Gerhard Widmer
Current approaches for explaining deep learning systems applied to musical data provide results in a low-level feature space, e.g., by highlighting potentially relevant time-frequency bins in a spectrogram or time-pitch bins in a piano roll.
no code implementations • 24 Aug 2022 • Paul Primus, Gerhard Widmer
Standard machine learning models for tagging and classifying acoustic signals cannot handle classes that were not seen during training.
no code implementations • 24 Aug 2022 • Paul Primus, Gerhard Widmer
The absence of large labeled datasets remains a significant challenge in many application areas of deep learning.
1 code implementation • 24 May 2022 • Katharina Hoedt, Arthur Flexer, Gerhard Widmer
Adversarial attacks can drastically degrade performance of recommenders and other machine learning systems, resulting in an increased demand for defence mechanisms.
1 code implementation • 12 Nov 2021 • Florian Henkel, Stephanie Schwaiger, Gerhard Widmer
We present a prototype of an automatic page turning system that works directly on real scores, i.e., sheet images, without any symbolic representation.
2 code implementations • 11 Oct 2021 • Khaled Koutini, Jan Schlüter, Hamid Eghbal-zadeh, Gerhard Widmer
However, one of the main shortcomings of transformer models, compared to the well-established CNNs, is the computational complexity.
Ranked #3 on Audio Classification on FSD50K (using extra training data)
no code implementations • NLP4MusA 2021 • Charles Brazier, Gerhard Widmer
Fully automatic opera tracking is challenging because of the acoustic complexity of the genre, combining musical and linguistic information (singing, speech) in complex ways.
no code implementations • NeurIPS Workshop ICBINB 2021 • Rainer Kelz, Gerhard Widmer
We cast the combinatorial problem of polyphonic piano transcription as a two stage process.
no code implementations • 30 Jul 2021 • Charles Brazier, Gerhard Widmer
Audio-to-lyrics alignment has become an increasingly active research task in MIR, supported by the emergence of several open-source datasets of audio recordings with word-level lyrics annotations.
no code implementations • 19 Jul 2021 • Khaled Koutini, Hamid Eghbal-zadeh, Florian Henkel, Jan Schlüter, Gerhard Widmer
Convolutional Neural Networks (CNNs) have been dominating classification tasks in various domains, such as machine vision, machine listening, and natural language processing.
1 code implementation • 19 Jul 2021 • Verena Praher, Katharina Prinz, Arthur Flexer, Gerhard Widmer
The basic idea is to identify a small set of human-understandable features of the classified example that are most influential on the classifier's prediction.
2 code implementations • 14 Jun 2021 • Shreyan Chowdhury, Verena Praher, Gerhard Widmer
In previous work, we have shown how to derive explanations of model predictions in terms of spectrogram image segments that connect to the high-level emotion prediction via a layer of easily interpretable perceptual features.
1 code implementation • 26 May 2021 • Khaled Koutini, Hamid Eghbal-zadeh, Gerhard Widmer
As state-of-the-art CNN architectures, in computer vision and other domains, tend to go deeper in terms of number of layers, their receptive field (RF) size increases, and they therefore degrade in performance in several audio classification and tagging tasks.
no code implementations • 26 May 2021 • Luis Carvalho, Gerhard Widmer
This paper addresses the problem of cross-modal musical piece identification and retrieval: finding the appropriate recording(s) from a database given a sheet music query, and vice versa, working directly with audio and scanned sheet music images.
no code implementations • 18 May 2021 • Charles Brazier, Gerhard Widmer
Algorithms for reliable real-time score following in live opera promise many useful applications, such as automatic subtitle display or real-time video cutting in live streaming.
1 code implementation • 10 May 2021 • Florian Henkel, Gerhard Widmer
This paper addresses the problem of sheet-image-based on-line audio-to-score alignment, also known as score following.
no code implementations • ICLR Workshop SSL-RL 2021 • Hamid Eghbal-zadeh, Florian Henkel, Gerhard Widmer
In Contextual Reinforcement Learning (CRL), a change in the context variable can cause a change in the distribution of the states.
no code implementations • 26 Feb 2021 • Shreyan Chowdhury, Gerhard Widmer
Emotion and expressivity in music have been topics of considerable interest in the field of music information retrieval.
1 code implementation • 5 Nov 2020 • Paul Primus, Verena Haunschmid, Patrick Praher, Gerhard Widmer
If no data with similar sounds and matching recording conditions is available, data sets with a larger diversity in these two dimensions are preferable.
1 code implementation • 5 Nov 2020 • Khaled Koutini, Florian Henkel, Hamid Eghbal-zadeh, Gerhard Widmer
Deep Neural Networks are known to be very demanding in terms of computing and memory requirements.
1 code implementation • 4 Sep 2020 • Verena Haunschmid, Ethan Manilow, Gerhard Widmer
Prior work on explainable models in MIR has generally used image processing tools to produce explanations for DNN predictions, but these are not necessarily musically meaningful, nor can they necessarily be listened to (which, arguably, is important in music).
no code implementations • 14 Aug 2020 • Katharina Prinz, Arthur Flexer, Gerhard Widmer
We explore how much can be learned from noisy labels in audio music tagging.
no code implementations • 5 Aug 2020 • Carlos Cancino-Chacón, Silvan Peter, Shreyan Chowdhury, Anna Aljanaki, Gerhard Widmer
In this paper, we offer a first account of this new data resource for expressive performance research, and provide an exploratory analysis, addressing three main questions: (1) how similarly do different listeners describe a performance of a piece?
2 code implementations • 2 Aug 2020 • Verena Haunschmid, Ethan Manilow, Gerhard Widmer
Deep neural networks (DNNs) are successfully applied in a wide variety of music information retrieval (MIR) tasks but their predictions are usually not interpretable.
1 code implementation • 27 Jul 2020 • Khaled Koutini, Hamid Eghbal-zadeh, Verena Haunschmid, Paul Primus, Shreyan Chowdhury, Gerhard Widmer
However, the MIR field is still dominated by the classical VGG-based CNN architecture variants, often in combination with more complex modules such as attention, and/or techniques such as pre-training on large datasets.
1 code implementation • 21 Jul 2020 • Florian Henkel, Rainer Kelz, Gerhard Widmer
This paper addresses the task of score following in sheet music given as unprocessed images.
no code implementations • 6 Jul 2020 • Hamid Eghbal-zadeh, Khaled Koutini, Paul Primus, Verena Haunschmid, Michal Lewandowski, Werner Zellinger, Bernhard A. Moser, Gerhard Widmer
Data augmentation techniques have become standard practice in deep learning, as they have been shown to greatly improve the generalisation abilities of models.
no code implementations • 27 Jun 2020 • David R. W. Sears, Gerhard Widmer
Recurrent voice-leading patterns like the Mi-Re-Do compound cadence (MRDCC) rarely appear on the musical surface in complex polyphonic textures, so finding these patterns using computational methods remains a tremendous challenge.
no code implementations • 19 Jun 2020 • Charles Brazier, Gerhard Widmer
Recent advances in real-time music score following have made it possible for machines to automatically track highly complex polyphonic music, including full orchestra performances.
1 code implementation • 28 Oct 2019 • Khaled Koutini, Shreyan Chowdhury, Verena Haunschmid, Hamid Eghbal-zadeh, Gerhard Widmer
We present the CP-JKU submission to MediaEval 2019: a receptive field (RF)-regularized and frequency-aware CNN approach for tagging music with emotion/mood labels.
1 code implementation • 16 Oct 2019 • Florian Henkel, Rainer Kelz, Gerhard Widmer
The goal of score following is to track a musical performance, usually in the form of audio, in a corresponding score representation.
no code implementations • 16 Oct 2019 • Thassilo Gadermaier, Gerhard Widmer
We then carry out a systematic evaluation of different audio features for audio-to-audio alignment, quantifying the degree of alignment accuracy that can be achieved, and relate this to the results from the annotation study.
2 code implementations • 5 Sep 2019 • Khaled Koutini, Hamid Eghbal-zadeh, Gerhard Widmer
One side effect of restricting the RF of CNNs is that more frequency information is lost.
1 code implementation • 4 Sep 2019 • Paul Primus, Hamid Eghbal-zadeh, David Eitelsebner, Khaled Koutini, Andreas Arzt, Gerhard Widmer
Distribution mismatches between the data seen at training and at application time remain a major challenge in all application areas of machine learning.
no code implementations • 8 Jul 2019 • Shreyan Chowdhury, Andreu Vall, Verena Haunschmid, Gerhard Widmer
Emotional aspects play an important part in our interaction with music.
3 code implementations • 3 Jul 2019 • Khaled Koutini, Hamid Eghbal-zadeh, Matthias Dorfer, Gerhard Widmer
To this end, we analyse the receptive field (RF) of these CNNs and demonstrate the importance of the RF to the generalization capability of the models.
no code implementations • 26 Jun 2019 • Stefan Balke, Matthias Dorfer, Luis Carvalho, Andreas Arzt, Gerhard Widmer
Quantitative and qualitative results on synthesized piano data indicate that this attention increases the robustness of the retrieval system by focusing on different parts of the input representation based on the tempo of the audio.
1 code implementation • 24 Jun 2019 • Federico Simonetta, Carlos Cancino-Chacón, Stavros Ntalampiras, Gerhard Widmer
The backbone of the method consists of a convolutional neural network (CNN) estimating the probability that each note in the score (more precisely: each pixel in a piano roll encoding of the score) belongs to the melody line.
no code implementations • 14 Jun 2019 • Zhengshan Shi, Carlos Cancino-Chacón, Gerhard Widmer
Musicians produce individualized, expressive performances by manipulating parameters such as dynamics, tempo and articulation.
no code implementations • 28 May 2019 • Verena Haunschmid, Shreyan Chowdhury, Gerhard Widmer
Current ML models for music emotion recognition, while generally working quite well, do not give meaningful or intuitive explanations for their predictions.
no code implementations • 12 Feb 2019 • Meinard Müller, Andreas Arzt, Stefan Balke, Matthias Dorfer, Gerhard Widmer
There has been a rapid growth of digitally available music data, including audio recordings, digitized images of sheet music, album covers and liner notes, and video clips.
1 code implementation • CVPR 2019 • Hamid Eghbal-zadeh, Werner Zellinger, Gerhard Widmer
Generative Adversarial Networks have a surprising ability to generate sharp and realistic images, though they are known to suffer from the so-called mode collapse problem.
no code implementations • 15 Sep 2018 • Matthias Dorfer, Jan Hajič jr., Gerhard Widmer
Current models for audio–sheet music retrieval via multimodal embedding space learning use convolutional neural networks with a fixed-size window for the input audio.
no code implementations • 16 Aug 2018 • Filip Korzeniowski, Gerhard Widmer
Common temporal models for automatic chord recognition model chord changes on a frame-wise basis.
no code implementations • 16 Aug 2018 • Filip Korzeniowski, Gerhard Widmer
Finally, we investigate the model's performance on short excerpts of audio.
no code implementations • 16 Aug 2018 • Filip Korzeniowski, Gerhard Widmer
Chord recognition systems typically comprise an acoustic model that predicts chords for each audio frame, and a temporal model that casts these predictions into labelled chord segments.
1 code implementation • 17 Jul 2018 • Matthias Dorfer, Florian Henkel, Gerhard Widmer
Score following is the process of tracking a musical performance (audio) with respect to a known symbolic representation (a score).
no code implementations • 22 Jun 2018 • Stefan Lattner, Maarten Grachten, Gerhard Widmer
We show that the RGAE improves the state of the art for general connectionist sequence models in learning to predict monophonic melodies, and that ensembles of relative and absolute music processing models improve the results appreciably.
1 code implementation • 22 Jun 2018 • Hamid Eghbal-zadeh, Lukas Fischer, Niko Popitsch, Florian Kromp, Sabine Taschner-Mandl, Khaled Koutini, Teresa Gerber, Eva Bozsaky, Peter F. Ambros, Inge M. Ambros, Gerhard Widmer, Bernhard A. Moser
We show that Deep SNP is capable of successfully predicting the presence or absence of a breakpoint in large genomic windows and outperforms state-of-the-art neural network models.
no code implementations • 21 Jun 2018 • Stefan Lattner, Maarten Grachten, Gerhard Widmer
Many music theoretical constructs (such as scale types, modes, cadences, and chord types) are defined in terms of pitch intervals, that is, relative distances between pitches.
1 code implementation • 18 Jun 2018 • Richard Vogl, Gerhard Widmer, Peter Knees
In this work, convolutional and convolutional recurrent neural networks are trained to transcribe a wider range of drum instruments.
no code implementations • 29 May 2018 • Rainer Kelz, Gerhard Widmer
Within this conceptual framework, the transcription process can be described as the agent interacting with the instrument in the environment, and obtaining reward by playing along with what it hears.
1 code implementation • 28 May 2018 • Rainer Kelz, Gerhard Widmer
We measure the effect of small amounts of systematic and random label noise caused by slightly misaligned ground truth labels in a fine-grained audio signal labeling task.
no code implementations • 5 Apr 2018 • Filip Korzeniowski, David R. W. Sears, Gerhard Widmer
We conduct a large-scale study of language models for chord prediction.
no code implementations • 10 Nov 2017 • Hamid Eghbal-zadeh, Matthias Dorfer, Gerhard Widmer
To tackle this problem, we propose Deep Within-Class Covariance Analysis (DWCCA), a deep neural network layer that significantly reduces the within-class covariance of a DNN's representation, improving performance on unseen test data from a shifted distribution.
no code implementations • 11 Sep 2017 • Carlos Cancino-Chacón, Maarten Grachten, David R. W. Sears, Gerhard Widmer
In this paper we present preliminary work examining the relationship between the formation of expectations and the realization of musical performances, paying particular attention to expressive tempo and dynamics.
no code implementations • 17 Aug 2017 • Stefan Lattner, Maarten Grachten, Gerhard Widmer
Music is usually highly structured and it is still an open question how to design models which can successfully learn to recognize and represent musical structure.
no code implementations • 6 Aug 2017 • Hamid Eghbal-zadeh, Gerhard Widmer
The central idea is to integrate a probabilistic model (a Gaussian Mixture Model, in our case) into the GAN framework which supports a new kind of loss function (based on likelihood rather than classification loss), and at the same time gives a meaningful measure of the quality of the outputs generated by the network.
no code implementations • 24 Jul 2017 • Hamid Eghbal-zadeh, Gerhard Widmer
We present a simple method for assessing the quality of generated images in Generative Adversarial Networks (GANs).
no code implementations • 20 Jun 2017 • Hamid Eghbal-zadeh, Bernhard Lehner, Matthias Dorfer, Gerhard Widmer
Finally, we propose a hybrid system for ASC using multi-channel i-vectors and CNNs by utilizing a score fusion technique.
no code implementations • 9 Jun 2017 • Filip Korzeniowski, Gerhard Widmer
We present an end-to-end system for musical key estimation, based on a convolutional neural network.
no code implementations • 1 Feb 2017 • Filip Korzeniowski, Gerhard Widmer
Chord recognition systems use temporal models to post-process frame-wise chord predictions from acoustic models.
3 code implementations • 15 Dec 2016 • Rainer Kelz, Matthias Dorfer, Filip Korzeniowski, Sebastian Böck, Andreas Arzt, Gerhard Widmer
In an attempt at exploring the limitations of simple approaches to the task of piano transcription (as usually defined in MIR), we conduct an in-depth analysis of neural network-based framewise transcription.
1 code implementation • 15 Dec 2016 • Filip Korzeniowski, Gerhard Widmer
We explore frame-level audio feature learning for chord recognition using artificial neural networks.
no code implementations • 15 Dec 2016 • Matthias Dorfer, Andreas Arzt, Gerhard Widmer
This paper addresses the matching of short music audio snippets to the corresponding pixel location in images of sheet music.
no code implementations • 15 Dec 2016 • Matthias Dorfer, Andreas Arzt, Gerhard Widmer
This paper demonstrates the feasibility of learning to retrieve short snippets of sheet music (images) when given a short query excerpt of music (audio), and vice versa, without any symbolic representation of music or scores.
no code implementations • 15 Dec 2016 • Filip Korzeniowski, Gerhard Widmer
We show that the learned auditory system extracts musically interpretable features, and that the proposed chord recognition system achieves results on par with or better than state-of-the-art algorithms.
no code implementations • 14 Dec 2016 • Stefan Lattner, Maarten Grachten, Gerhard Widmer
We introduce a method for imposing higher-level structure on generated, polyphonic music.
2 code implementations • 15 Nov 2015 • Matthias Dorfer, Rainer Kelz, Gerhard Widmer
The central idea of this paper is to put LDA on top of a deep neural network.