Search Results for author: Gerhard Widmer

Found 90 papers, 38 papers with code

Expressivity-aware Music Performance Retrieval using Mid-level Perceptual Features and Emotion Word Embeddings

no code implementations • 26 Jan 2024 • Shreyan Chowdhury, Gerhard Widmer

On the text side, we use emotion-enriched word embeddings (EWE) and on the audio side, we extract mid-level perceptual features instead of generic audio embeddings.

Retrieval Word Embeddings

Are we describing the same sound? An analysis of word embedding spaces of expressive piano performance

1 code implementation • 31 Dec 2023 • Silvan David Peter, Shreyan Chowdhury, Carlos Eduardo Cancino-Chacón, Gerhard Widmer

Using a music research dataset of free text performance characterizations and a follow-up study sorting the annotations into clusters, we derive a ground truth for a domain-specific semantic similarity structure.

Information Retrieval Retrieval +2

Dynamic Convolutional Neural Networks as Efficient Pre-trained Audio Models

1 code implementation • 24 Oct 2023 • Florian Schmid, Khaled Koutini, Gerhard Widmer

Audio Spectrogram Transformers are excellent at exploiting large datasets, creating powerful pre-trained models that surpass CNNs when fine-tuned on downstream tasks.

Ranked #1 on Instrument Recognition on OpenMIC-2018 (using extra training data)

Audio Classification Audio Tagging +2

Passage Summarization with Recurrent Models for Audio-Sheet Music Retrieval

no code implementations • 21 Sep 2023 • Luis Carvalho, Gerhard Widmer

Many applications of cross-modal music retrieval are related to connecting sheet music images to audio recordings.

Retrieval

Self-Supervised Contrastive Learning for Robust Audio-Sheet Music Retrieval Systems

no code implementations • 21 Sep 2023 • Luis Carvalho, Tobias Washüttl, Gerhard Widmer

We then conclude by arguing for the potential of self-supervised contrastive learning for alleviating the annotated data scarcity in multi-modal music retrieval models.

Contrastive Learning Retrieval

Towards Robust and Truly Large-Scale Audio-Sheet Music Retrieval

no code implementations • 21 Sep 2023 • Luis Carvalho, Gerhard Widmer

A range of applications of multi-modal music information retrieval is centred around the problem of connecting large collections of sheet music (images) to corresponding audio recordings, that is, identifying pairs of audio and score excerpts that refer to the same musical content.

Information Retrieval Music Information Retrieval +1

Symbolic Music Representations for Classification Tasks: A Systematic Evaluation

1 code implementation • 5 Sep 2023 • Huan Zhang, Emmanouil Karystinaios, Simon Dixon, Gerhard Widmer, Carlos Eduardo Cancino-Chacón

Music Information Retrieval (MIR) has seen a recent surge in deep learning-based approaches, which often involve encoding symbolic music (i.e., music represented in terms of discrete note events) in an image-like or language-like fashion.

Classification Information Retrieval +3

Exploring Sampling Techniques for Generating Melodies with a Transformer Language Model

no code implementations • 18 Aug 2023 • Mathias Rose Bjare, Stefan Lattner, Gerhard Widmer

To accomplish this, we train a high-capacity transformer model on a vast collection of highly-structured Irish folk melodies and analyze the musical qualities of the samples generated using distribution truncation sampling techniques.

Language Modelling

Advancing Natural-Language Based Audio Retrieval with PaSST and Large Audio-Caption Data Sets

1 code implementation • 8 Aug 2023 • Paul Primus, Khaled Koutini, Gerhard Widmer

This work presents a text-to-audio-retrieval system based on pre-trained text and spectrogram transformers.

Retrieval Text to Audio Retrieval

Roman Numeral Analysis with Graph Neural Networks: Onset-wise Predictions from Note-wise Features

1 code implementation • 7 Jul 2023 • Emmanouil Karystinaios, Gerhard Widmer

Roman Numeral analysis is the important task of identifying chords and their functional context in pieces of tonal music.

Predicting Music Hierarchies with a Graph-Based Neural Decoder

1 code implementation • 29 Jun 2023 • Francesco Foscarin, Daniel Harasim, Gerhard Widmer

This paper describes a data-driven framework to parse musical sequences into dependency trees, which are hierarchical structures used in music cognition research and music analysis.

On Frequency-Wise Normalizations for Better Recording Device Generalization in Audio Spectrogram Transformers

no code implementations • 20 Jun 2023 • Paul Primus, Gerhard Widmer

Based on this observation, we conjecture that suppressing recording device characteristics in the input spectrogram is the most effective.

Acoustic Scene Classification Scene Classification

Domain Information Control at Inference Time for Acoustic Scene Classification

no code implementations • 13 Jun 2023 • Shahed Masoudian, Khaled Koutini, Markus Schedl, Gerhard Widmer, Navid Rekabsaz

In the Acoustic Scene Classification task (ASC), domain shift is mainly caused by different recording devices.

Acoustic Scene Classification Domain Generalization +1

Discrete Diffusion Probabilistic Models for Symbolic Music Generation

1 code implementation • 16 May 2023 • Matthias Plasser, Silvan Peter, Gerhard Widmer

Denoising Diffusion Probabilistic Models (DDPMs) have made great strides in generating high-quality samples in both discrete and continuous domains.

Denoising Music Generation

Device-Robust Acoustic Scene Classification via Impulse Response Augmentation

1 code implementation • 12 May 2023 • Tobias Morocutti, Florian Schmid, Khaled Koutini, Gerhard Widmer

However, we also show that DIR augmentation and Freq-MixStyle are complementary, achieving a new state-of-the-art performance on signals recorded by devices unseen during training.

Acoustic Scene Classification Audio Classification +1

Musical Voice Separation as Link Prediction: Modeling a Musical Perception Task as a Multi-Trajectory Tracking Problem

1 code implementation • 28 Apr 2023 • Emmanouil Karystinaios, Francesco Foscarin, Gerhard Widmer

Our approach builds a graph from a musical piece, by creating one node for every note, and separates the melodic trajectories by predicting a link between two notes if they are consecutive in the same voice/stream.

Link Prediction

Probabilistic Modelling of Signal Mixtures with Differentiable Dictionaries

no code implementations • 28 Nov 2022 • Lukáš Samuel Marták, Rainer Kelz, Gerhard Widmer

We introduce a novel way to incorporate prior information into (semi-) supervised non-negative matrix factorization, which we call differentiable dictionary search.

Differentiable Dictionary Search: Integrating Linear Mixing with Deep Non-Linear Modelling for Audio Source Separation

no code implementations • 28 Nov 2022 • Lukáš Samuel Marták, Rainer Kelz, Gerhard Widmer

This paper describes several improvements to a new method for signal decomposition that we recently formulated under the name of Differentiable Dictionary Search (DDS).

Audio Source Separation

Learning General Audio Representations with Large-Scale Training of Patchout Audio Transformers

1 code implementation • 25 Nov 2022 • Khaled Koutini, Shahed Masoudian, Florian Schmid, Hamid Eghbal-zadeh, Jan Schlüter, Gerhard Widmer

Furthermore, we will show that transformers trained on Audioset can be extremely effective representation extractors for a wide range of downstream tasks.

Efficient Large-scale Audio Tagging via Transformer-to-CNN Knowledge Distillation

2 code implementations • 9 Nov 2022 • Florian Schmid, Khaled Koutini, Gerhard Widmer

We provide models of different complexity levels, scaling from low-complexity models up to a new state-of-the-art performance of .483 mAP on AudioSet.

Ranked #2 on Audio Tagging on AudioSet (using extra training data)

Audio Classification Audio Tagging +2

Cadence Detection in Symbolic Classical Music using Graph Neural Networks

no code implementations • 31 Aug 2022 • Emmanouil Karystinaios, Gerhard Widmer

In this work, we present a graph representation of symbolic scores as an intermediate means to solve the cadence detection task.

Key Detection Node Classification

Concept-Based Techniques for "Musicologist-friendly" Explanations in a Deep Music Classifier

no code implementations • 26 Aug 2022 • Francesco Foscarin, Katharina Hoedt, Verena Praher, Arthur Flexer, Gerhard Widmer

Current approaches for explaining deep learning systems applied to musical data provide results in a low-level feature space, e.g., by highlighting potentially relevant time-frequency bins in a spectrogram or time-pitch bins in a piano roll.

Improved Zero-Shot Audio Tagging & Classification with Patchout Spectrogram Transformers

no code implementations • 24 Aug 2022 • Paul Primus, Gerhard Widmer

Standard machine learning models for tagging and classifying acoustic signals cannot handle classes that were not seen during training.

Audio Tagging Classification +2

Improving Natural-Language-based Audio Retrieval with Transfer Learning and Audio & Text Augmentations

no code implementations • 24 Aug 2022 • Paul Primus, Gerhard Widmer

The absence of large labeled datasets remains a significant challenge in many application areas of deep learning.

Data Augmentation Natural Language Queries +2

Defending a Music Recommender Against Hubness-Based Adversarial Attacks

1 code implementation • 24 May 2022 • Katharina Hoedt, Arthur Flexer, Gerhard Widmer

Adversarial attacks can drastically degrade performance of recommenders and other machine learning systems, resulting in an increased demand for defence mechanisms.

Fully Automatic Page Turning on Real Scores

1 code implementation • 12 Nov 2021 • Florian Henkel, Stephanie Schwaiger, Gerhard Widmer

We present a prototype of an automatic page turning system that works directly on real scores, i.e., sheet images, without any symbolic representation.

Position

Efficient Training of Audio Transformers with Patchout

2 code implementations • 11 Oct 2021 • Khaled Koutini, Jan Schlüter, Hamid Eghbal-zadeh, Gerhard Widmer

However, one of the main shortcomings of transformer models, compared to the well-established CNNs, is the computational complexity.

Ranked #3 on Audio Classification on FSD50K (using extra training data)

Acoustic Scene Classification Audio Classification +2

Improving Real-time Score Following in Opera by Combining Music with Lyrics Tracking

no code implementations • NLP4MusA 2021 • Charles Brazier, Gerhard Widmer

Fully automatic opera tracking is challenging because of the acoustic complexity of the genre, combining musical and linguistic information (singing, speech) in complex ways.

Nonlinear Denoising, Linear Demixing

no code implementations • NeurIPS Workshop ICBINB 2021 • Rainer Kelz, Gerhard Widmer

We cast the combinatorial problem of polyphonic piano transcription as a two stage process.

Denoising

On-Line Audio-to-Lyrics Alignment Based on a Reference Performance

no code implementations • 30 Jul 2021 • Charles Brazier, Gerhard Widmer

Audio-to-lyrics alignment has become an increasingly active research task in MIR, supported by the emergence of several open-source datasets of audio recordings with word-level lyrics annotations.

Specificity

On the Veracity of Local, Model-agnostic Explanations in Audio Classification: Targeted Investigations with Adversarial Examples

1 code implementation • 19 Jul 2021 • Verena Praher, Katharina Prinz, Arthur Flexer, Gerhard Widmer

The basic idea is to identify a small set of human-understandable features of the classified example that are most influential on the classifier's prediction.

Audio Classification

Over-Parameterization and Generalization in Audio Classification

no code implementations • 19 Jul 2021 • Khaled Koutini, Hamid Eghbal-zadeh, Florian Henkel, Jan Schlüter, Gerhard Widmer

Convolutional Neural Networks (CNNs) have been dominating classification tasks in various domains, such as machine vision, machine listening, and natural language processing.

Acoustic Scene Classification Audio Classification +1

Tracing Back Music Emotion Predictions to Sound Sources and Intuitive Perceptual Qualities

2 code implementations • 14 Jun 2021 • Shreyan Chowdhury, Verena Praher, Gerhard Widmer

In previous work, we have shown how to derive explanations of model predictions in terms of spectrogram image segments that connect to the high-level emotion prediction via a layer of easily interpretable perceptual features.

Emotion Recognition Information Retrieval +3

Receptive Field Regularization Techniques for Audio Classification and Tagging with Deep Convolutional Neural Networks

1 code implementation • 26 May 2021 • Khaled Koutini, Hamid Eghbal-zadeh, Gerhard Widmer

As state-of-the-art CNN architectures, in computer vision and other domains, tend to go deeper in terms of number of layers, their RF size increases, and they therefore degrade in performance on several audio classification and tagging tasks.

Acoustic Scene Classification Audio Classification +2

Exploiting Temporal Dependencies for Cross-Modal Music Piece Identification

no code implementations • 26 May 2021 • Luis Carvalho, Gerhard Widmer

This paper addresses the problem of cross-modal musical piece identification and retrieval: finding the appropriate recording(s) from a database given a sheet music query, and vice versa, working directly with audio and scanned sheet music images.

Retrieval

Handling Structural Mismatches in Real-time Opera Tracking

no code implementations • 18 May 2021 • Charles Brazier, Gerhard Widmer

Algorithms for reliable real-time score following in live opera promise a lot of useful applications such as automatic subtitles display, or real-time video cutting in live streaming.

Multi-modal Conditional Bounding Box Regression for Music Score Following

1 code implementation • 10 May 2021 • Florian Henkel, Gerhard Widmer

This paper addresses the problem of sheet-image-based on-line audio-to-score alignment also known as score following.

Data Augmentation object-detection +2

Learning to Infer Unseen Contexts in Causal Contextual Reinforcement Learning

no code implementations • ICLR Workshop SSL-RL 2021 • Hamid Eghbal-zadeh, Florian Henkel, Gerhard Widmer

In Contextual Reinforcement Learning (CRL), a change in the context variable can cause a change in the distribution of the states.

reinforcement-learning Reinforcement Learning (RL)

Towards Explaining Expressive Qualities in Piano Recordings: Transfer of Explanatory Features via Acoustic Domain Adaptation

no code implementations • 26 Feb 2021 • Shreyan Chowdhury, Gerhard Widmer

Emotion and expressivity in music have been topics of considerable interest in the field of music information retrieval.

Information Retrieval Music Information Retrieval +2

Anomalous Sound Detection as a Simple Binary Classification Problem with Careful Selection of Proxy Outlier Examples

1 code implementation • 5 Nov 2020 • Paul Primus, Verena Haunschmid, Patrick Praher, Gerhard Widmer

If no data with similar sounds and matching recording conditions is available, data sets with a larger diversity in these two dimensions are preferable.

Anomaly Detection Binary Classification +1

Low-Complexity Models for Acoustic Scene Classification Based on Receptive Field Regularization and Frequency Damping

1 code implementation • 5 Nov 2020 • Khaled Koutini, Florian Henkel, Hamid Eghbal-zadeh, Gerhard Widmer

Deep Neural Networks are known to be very demanding in terms of computing and memory requirements.

Acoustic Scene Classification Scene Classification

Towards Musically Meaningful Explanations Using Source Separation

1 code implementation • 4 Sep 2020 • Verena Haunschmid, Ethan Manilow, Gerhard Widmer

Prior work on explainable models in MIR has generally used image processing tools to produce explanations for DNN predictions, but these are not necessarily musically meaningful, nor can they be listened to (which, arguably, is important in music).

Explainable Models Image Segmentation +4

The Impact of Label Noise on a Music Tagger

no code implementations • 14 Aug 2020 • Katharina Prinz, Arthur Flexer, Gerhard Widmer

We explore how much can be learned from noisy labels in audio music tagging.

Music Tagging

On the Characterization of Expressive Performance in Classical Music: First Results of the Con Espressione Game

no code implementations • 5 Aug 2020 • Carlos Cancino-Chacón, Silvan Peter, Shreyan Chowdhury, Anna Aljanaki, Gerhard Widmer

In this paper, we offer a first account of this new data resource for expressive performance research, and provide an exploratory analysis, addressing three main questions: (1) how similarly do different listeners describe a performance of a piece?

Descriptive

audioLIME: Listenable Explanations Using Source Separation

2 code implementations • 2 Aug 2020 • Verena Haunschmid, Ethan Manilow, Gerhard Widmer

Deep neural networks (DNNs) are successfully applied in a wide variety of music information retrieval (MIR) tasks but their predictions are usually not interpretable.

Information Retrieval Music Information Retrieval +2

Receptive-Field Regularized CNNs for Music Classification and Tagging

1 code implementation • 27 Jul 2020 • Khaled Koutini, Hamid Eghbal-zadeh, Verena Haunschmid, Paul Primus, Shreyan Chowdhury, Gerhard Widmer

However, the MIR field is still dominated by the classical VGG-based CNN architecture variants, often in combination with more complex modules such as attention, and/or techniques such as pre-training on large datasets.

Classification General Classification +4

Learning to Read and Follow Music in Complete Score Sheet Images

1 code implementation • 21 Jul 2020 • Florian Henkel, Rainer Kelz, Gerhard Widmer

This paper addresses the task of score following in sheet music given as unprocessed images.

Position

On Data Augmentation and Adversarial Risk: An Empirical Analysis

no code implementations • 6 Jul 2020 • Hamid Eghbal-zadeh, Khaled Koutini, Paul Primus, Verena Haunschmid, Michal Lewandowski, Werner Zellinger, Bernhard A. Moser, Gerhard Widmer

Data augmentation techniques have become standard practice in deep learning, as they have been shown to greatly improve the generalisation abilities of models.

Adversarial Attack Data Augmentation

Beneath (or beyond) the surface: Discovering voice-leading patterns with skip-grams

no code implementations • 27 Jun 2020 • David R. W. Sears, Gerhard Widmer

Recurrent voice-leading patterns like the Mi-Re-Do compound cadence (MRDCC) rarely appear on the musical surface in complex polyphonic textures, so finding these patterns using computational methods remains a tremendous challenge.

Towards Reliable Real-time Opera Tracking: Combining Alignment with Audio Event Detectors to Increase Robustness

no code implementations • 19 Jun 2020 • Charles Brazier, Gerhard Widmer

Recent advances in real-time music score following have made it possible for machines to automatically track highly complex polyphonic music, including full orchestra performances.

Dynamic Time Warping

Emotion and Theme Recognition in Music with Frequency-Aware RF-Regularized CNNs

1 code implementation • 28 Oct 2019 • Khaled Koutini, Shreyan Chowdhury, Verena Haunschmid, Hamid Eghbal-zadeh, Gerhard Widmer

We present the CP-JKU submission to MediaEval 2019: a Receptive Field (RF)-regularized and Frequency-Aware CNN approach for tagging music with emotion/mood labels.

Acoustic Scene Classification Scene Classification

A Study of Annotation and Alignment Accuracy for Performance Comparison in Complex Orchestral Music

no code implementations • 16 Oct 2019 • Thassilo Gadermaier, Gerhard Widmer

We then carry out a systematic evaluation of different audio features for audio-to-audio alignment, quantifying the degree of alignment accuracy that can be achieved, and relate this to the results from the annotation study.

Audio-Conditioned U-Net for Position Estimation in Full Sheet Images

1 code implementation • 16 Oct 2019 • Florian Henkel, Rainer Kelz, Gerhard Widmer

The goal of score following is to track a musical performance, usually in the form of audio, in a corresponding score representation.

Multimodal Deep Learning Position

Receptive-field-regularized CNN variants for acoustic scene classification

2 code implementations • 5 Sep 2019 • Khaled Koutini, Hamid Eghbal-zadeh, Gerhard Widmer

One side effect of restricting the RF of CNNs is that more frequency information is lost.

Acoustic Scene Classification Classification +2

Exploiting Parallel Audio Recordings to Enforce Device Invariance in CNN-based Acoustic Scene Classification

1 code implementation • 4 Sep 2019 • Paul Primus, Hamid Eghbal-zadeh, David Eitelsebner, Khaled Koutini, Andreas Arzt, Gerhard Widmer

Distribution mismatches between the data seen at training and at application time remain a major challenge in all application areas of machine learning.

Acoustic Scene Classification BIG-bench Machine Learning +3

Towards Explainable Music Emotion Recognition: The Route via Mid-level Features

no code implementations • 8 Jul 2019 • Shreyan Chowdhury, Andreu Vall, Verena Haunschmid, Gerhard Widmer

Emotional aspects play an important part in our interaction with music.

Emotion Recognition Music Emotion Recognition

The Receptive Field as a Regularizer in Deep Convolutional Neural Networks for Acoustic Scene Classification

3 code implementations • 3 Jul 2019 • Khaled Koutini, Hamid Eghbal-zadeh, Matthias Dorfer, Gerhard Widmer

To this end, we analyse the receptive field (RF) of these CNNs and demonstrate the importance of the RF to the generalization capability of the models.

Acoustic Scene Classification General Classification +1

Learning Soft-Attention Models for Tempo-invariant Audio-Sheet Music Retrieval

no code implementations • 26 Jun 2019 • Stefan Balke, Matthias Dorfer, Luis Carvalho, Andreas Arzt, Gerhard Widmer

Quantitative and qualitative results on synthesized piano data indicate that this attention increases the robustness of the retrieval system by focusing on different parts of the input representation based on the tempo of the audio.

Cross-Modal Retrieval Retrieval

A Convolutional Approach to Melody Line Identification in Symbolic Scores

1 code implementation • 24 Jun 2019 • Federico Simonetta, Carlos Cancino-Chacón, Stavros Ntalampiras, Gerhard Widmer

The backbone of the method consists of a convolutional neural network (CNN) estimating the probability that each note in the score (more precisely: each pixel in a piano roll encoding of the score) belongs to the melody line.

Information Retrieval Music Information Retrieval +1

User Curated Shaping of Expressive Performances

no code implementations • 14 Jun 2019 • Zhengshan Shi, Carlos Cancino-Chacón, Gerhard Widmer

Musicians produce individualized, expressive performances by manipulating parameters such as dynamics, tempo and articulation.

Two-level Explanations in Music Emotion Recognition

no code implementations • 28 May 2019 • Verena Haunschmid, Shreyan Chowdhury, Gerhard Widmer

Current ML models for music emotion recognition, while generally working quite well, do not give meaningful or intuitive explanations for their predictions.

Emotion Recognition Music Emotion Recognition +1

Cross-Modal Music Retrieval and Applications: An Overview of Key Methodologies

no code implementations • 12 Feb 2019 • Meinard Müller, Andreas Arzt, Stefan Balke, Matthias Dorfer, Gerhard Widmer

There has been a rapid growth of digitally available music data, including audio recordings, digitized images of sheet music, album covers and liner notes, and video clips.

Cross-Modal Retrieval Retrieval

Mixture Density Generative Adversarial Networks

1 code implementation • CVPR 2019 • Hamid Eghbal-zadeh, Werner Zellinger, Gerhard Widmer

Generative Adversarial Networks have a surprising ability to generate sharp and realistic images, though they are known to suffer from the so-called mode collapse problem.

Attention as a Perspective for Learning Tempo-invariant Audio Queries

no code implementations • 15 Sep 2018 • Matthias Dorfer, Jan Hajič jr., Gerhard Widmer

Current models for audio-sheet music retrieval via multimodal embedding space learning use convolutional neural networks with a fixed-size window for the input audio.

Retrieval

Automatic Chord Recognition with Higher-Order Harmonic Language Modelling

no code implementations • 16 Aug 2018 • Filip Korzeniowski, Gerhard Widmer

Common temporal models for automatic chord recognition model chord changes on a frame-wise basis.

Chord Recognition Language Modelling

Genre-Agnostic Key Classification With Convolutional Neural Networks

no code implementations • 16 Aug 2018 • Filip Korzeniowski, Gerhard Widmer

Finally, we investigate the model's performance on short excerpts of audio.

Classification General Classification

Improved Chord Recognition by Combining Duration and Harmonic Language Models

no code implementations • 16 Aug 2018 • Filip Korzeniowski, Gerhard Widmer

Chord recognition systems typically comprise an acoustic model that predicts chords for each audio frame, and a temporal model that casts these predictions into labelled chord segments.

Chord Recognition Language Modelling

Learning to Listen, Read, and Follow: Score Following as a Reinforcement Learning Game

1 code implementation • 17 Jul 2018 • Matthias Dorfer, Florian Henkel, Gerhard Widmer

Score following is the process of tracking a musical performance (audio) with respect to a known symbolic representation (a score).

Decision Making reinforcement-learning +1

Deep SNP: An End-to-end Deep Neural Network with Attention-based Localization for Break-point Detection in SNP Array Genomic data

1 code implementation • 22 Jun 2018 • Hamid Eghbal-zadeh, Lukas Fischer, Niko Popitsch, Florian Kromp, Sabine Taschner-Mandl, Khaled Koutini, Teresa Gerber, Eva Bozsaky, Peter F. Ambros, Inge M. Ambros, Gerhard Widmer, Bernhard A. Moser

We show that Deep SNP is capable of successfully predicting the presence or absence of a breakpoint in large genomic windows and outperforms state-of-the-art neural network models.

A Predictive Model for Music Based on Learned Interval Representations

no code implementations • 22 Jun 2018 • Stefan Lattner, Maarten Grachten, Gerhard Widmer

We show that the RGAE improves the state of the art for general connectionist sequence models in learning to predict monophonic melodies, and that ensembles of relative and absolute music processing models improve the results appreciably.

Learning Transposition-Invariant Interval Features from Symbolic Music and Audio

no code implementations • 21 Jun 2018 • Stefan Lattner, Maarten Grachten, Gerhard Widmer

Many music theoretical constructs (such as scale types, modes, cadences, and chord types) are defined in terms of pitch intervals, i.e., relative distances between pitches.

Towards multi-instrument drum transcription

1 code implementation • 18 Jun 2018 • Richard Vogl, Gerhard Widmer, Peter Knees

In this work, convolutional and convolutional recurrent neural networks are trained to transcribe a wider range of drum instruments.

Drum Transcription Music Transcription

Learning to Transcribe by Ear

no code implementations • 29 May 2018 • Rainer Kelz, Gerhard Widmer

Within this conceptual framework, the transcription process can be described as the agent interacting with the instrument in the environment, and obtaining reward by playing along with what it hears.

Investigating Label Noise Sensitivity of Convolutional Neural Networks for Fine Grained Audio Signal Labelling

1 code implementation • 28 May 2018 • Rainer Kelz, Gerhard Widmer

We measure the effect of small amounts of systematic and random label noise caused by slightly misaligned ground truth labels in a fine grained audio signal labeling task.

A Large-Scale Study of Language Models for Chord Prediction

no code implementations • 5 Apr 2018 • Filip Korzeniowski, David R. W. Sears, Gerhard Widmer

We conduct a large-scale study of language models for chord prediction.

Chord Recognition

Deep Within-Class Covariance Analysis for Robust Audio Representation Learning

no code implementations • 10 Nov 2017 • Hamid Eghbal-zadeh, Matthias Dorfer, Gerhard Widmer

To tackle this problem, we propose Deep Within-Class Covariance Analysis (DWCCA), a deep neural network layer that significantly reduces the within-class covariance of a DNN's representation, improving performance on unseen test data from a shifted distribution.

Acoustic Scene Classification Classification +3

What were you expecting? Using Expectancy Features to Predict Expressive Performances of Classical Piano Music

no code implementations • 11 Sep 2017 • Carlos Cancino-Chacón, Maarten Grachten, David R. W. Sears, Gerhard Widmer

In this paper we present preliminary work examining the relationship between the formation of expectations and the realization of musical performances, paying particular attention to expressive tempo and dynamics.

Learning Musical Relations using Gated Autoencoders

no code implementations • 17 Aug 2017 • Stefan Lattner, Maarten Grachten, Gerhard Widmer

Music is usually highly structured and it is still an open question how to design models which can successfully learn to recognize and represent musical structure.

Open-Ended Question Answering

Probabilistic Generative Adversarial Networks

no code implementations • 6 Aug 2017 • Hamid Eghbal-zadeh, Gerhard Widmer

The central idea is to integrate a probabilistic model (a Gaussian Mixture Model, in our case) into the GAN framework which supports a new kind of loss function (based on likelihood rather than classification loss), and at the same time gives a meaningful measure of the quality of the outputs generated by the network.

Generative Adversarial Network

Likelihood Estimation for Generative Adversarial Networks

no code implementations • 24 Jul 2017 • Hamid Eghbal-zadeh, Gerhard Widmer

We present a simple method for assessing the quality of generated images in Generative Adversarial Networks (GANs).

A Hybrid Approach with Multi-channel I-Vectors and Convolutional Neural Networks for Acoustic Scene Classification

no code implementations • 20 Jun 2017 • Hamid Eghbal-zadeh, Bernhard Lehner, Matthias Dorfer, Gerhard Widmer

Finally, we propose a hybrid system for ASC using multi-channel i-vectors and CNNs by utilizing a score fusion technique.

Acoustic Scene Classification General Classification +1

End-to-End Musical Key Estimation Using a Convolutional Neural Network

no code implementations • 9 Jun 2017 • Filip Korzeniowski, Gerhard Widmer

We present an end-to-end system for musical key estimation, based on a convolutional neural network.

On the Futility of Learning Complex Frame-Level Language Models for Chord Recognition

no code implementations • 1 Feb 2017 • Filip Korzeniowski, Gerhard Widmer

Chord recognition systems use temporal models to post-process frame-wise chord predictions from acoustic models.

Chord Recognition

Towards Score Following in Sheet Music Images

no code implementations • 15 Dec 2016 • Matthias Dorfer, Andreas Arzt, Gerhard Widmer

This paper addresses the matching of short music audio snippets to the corresponding pixel location in images of sheet music.

Position

A Fully Convolutional Deep Auditory Model for Musical Chord Recognition

no code implementations • 15 Dec 2016 • Filip Korzeniowski, Gerhard Widmer

We show that the learned auditory system extracts musically interpretable features, and that the proposed chord recognition system achieves results on par with or better than state-of-the-art algorithms.

Chord Recognition

Towards End-to-End Audio-Sheet-Music Retrieval

no code implementations • 15 Dec 2016 • Matthias Dorfer, Andreas Arzt, Gerhard Widmer

This paper demonstrates the feasibility of learning to retrieve short snippets of sheet music (images) when given a short query excerpt of music (audio), and vice versa, without any symbolic representation of music or scores.

Retrieval

On the Potential of Simple Framewise Approaches to Piano Transcription

2 code implementations • 15 Dec 2016 • Rainer Kelz, Matthias Dorfer, Filip Korzeniowski, Sebastian Böck, Andreas Arzt, Gerhard Widmer

In an attempt at exploring the limitations of simple approaches to the task of piano transcription (as usually defined in MIR), we conduct an in-depth analysis of neural network-based framewise transcription.
