Search Results for author: Dorien Herremans

Found 46 papers, 23 papers with code

Natural Language Processing Methods for Symbolic Music Generation and Information Retrieval: a Survey

1 code implementation • 27 Feb 2024 • Dinh-Viet-Toan Le, Louis Bigo, Mikaela Keller, Dorien Herremans

Music has been frequently compared to language, as they share several similarities, including sequential representations of text and music.

Information Retrieval Music Generation +2

Mustango: Toward Controllable Text-to-Music Generation

1 code implementation • 14 Nov 2023 • Jan Melechovsky, Zixun Guo, Deepanway Ghosal, Navonil Majumder, Dorien Herremans, Soujanya Poria

Through extensive experiments, we show that the quality of the music generated by Mustango is state-of-the-art, and the controllability through music-specific text prompts greatly outperforms other models such as MusicGen and AudioLDM2.

Ranked #1 on Text-to-Music Generation on MusicBench

Data Augmentation Denoising +4

276

Video2Music: Suitable Music Generation from Videos using an Affective Multimodal Transformer model

1 code implementation • 2 Nov 2023 • Jaeyong Kang, Soujanya Poria, Dorien Herremans

These distinct features are then employed as guiding input to our music generation model.

Music Generation

121

Constructing Time-Series Momentum Portfolios with Deep Multi-Task Learning

no code implementations • 8 Jun 2023 • Joel Ong, Dorien Herremans

The performance of existing TSMOM strategies, however, relies not only on the quality of the momentum signal but also on the efficacy of the volatility estimator.

Multi-Task Learning Time Series

Jointist: Simultaneous Improvement of Multi-instrument Transcription and Music Source Separation via Joint Training

no code implementations • 1 Feb 2023 • Kin Wai Cheuk, Keunwoo Choi, Qiuqiang Kong, Bochen Li, Minz Won, Ju-Chiang Wang, Yun-Ning Hung, Dorien Herremans

Jointist consists of an instrument recognition module that conditions the other two modules: a transcription module that outputs instrument-specific piano rolls, and a source separation module that utilizes instrument information and transcription results.

Chord Recognition Instrument Recognition +1

SNIPER Training: Variable Sparsity Rate Training For Text-To-Speech

no code implementations • 14 Nov 2022 • Perry Lam, Huayun Zhang, Nancy F. Chen, Berrak Sisman, Dorien Herremans

Text-to-speech (TTS) models have achieved remarkable naturalness in recent years, yet like most deep neural models, they have more parameters than necessary.

Accented Text-to-Speech Synthesis with a Conditional Variational Autoencoder

1 code implementation • 7 Nov 2022 • Jan Melechovsky, Ambuj Mehrish, Berrak Sisman, Dorien Herremans

Accent plays a significant role in speech communication, influencing understanding capabilities and also conveying a person's identity.

Speech Synthesis Text-To-Speech Synthesis

DiffRoll: Diffusion-based Generative Music Transcription with Unsupervised Pretraining Capability

no code implementations • 11 Oct 2022 • Kin Wai Cheuk, Ryosuke Sawata, Toshimitsu Uesaka, Naoki Murata, Naoya Takahashi, Shusuke Takahashi, Dorien Herremans, Yuki Mitsufuji

In this paper we propose a novel generative approach, DiffRoll, to tackle automatic music transcription (AMT).

Music Transcription

Forecasting Bitcoin volatility spikes from whale transactions and CryptoQuant data using Synthesizer Transformer models

1 code implementation • 6 Oct 2022 • Dorien Herremans, Kah Wee Low

Our results show that the model outperforms existing state-of-the-art models when forecasting extreme volatility spikes for Bitcoin using CryptoQuant data as well as whale-alert tweets.

Explainable Artificial Intelligence (XAI) Management

Jointist: Joint Learning for Multi-instrument Transcription and Its Applications

no code implementations • 22 Jun 2022 • Kin Wai Cheuk, Keunwoo Choi, Qiuqiang Kong, Bochen Li, Minz Won, Amy Hung, Ju-Chiang Wang, Dorien Herremans

However, its novelty necessitates a new perspective on how to evaluate such a model.

Ranked #1 on Music Transcription on Slakh2100

Chord Recognition Instrument Recognition +1

PreBit -- A multimodal model with Twitter FinBERT embeddings for extreme price movement prediction of Bitcoin

1 code implementation • 30 May 2022 • Yanzhao Zou, Dorien Herremans

In our hybrid model, we use sentence-level FinBERT embeddings, pretrained on financial lexicons, so as to capture the full contents of the tweets and feed it to the model in an understandable way.

Sentence

HEAR: Holistic Evaluation of Audio Representations

3 code implementations • 6 Mar 2022 • Joseph Turian, Jordie Shier, Humair Raj Khan, Bhiksha Raj, Björn W. Schuller, Christian J. Steinmetz, Colin Malloy, George Tzanetakis, Gissel Velarde, Kirk McNally, Max Henry, Nicolas Pinto, Camille Noufi, Christian Clough, Dorien Herremans, Eduardo Fonseca, Jesse Engel, Justin Salamon, Philippe Esling, Pranay Manocha, Shinji Watanabe, Zeyu Jin, Yonatan Bisk

The aim of the HEAR benchmark is to develop a general-purpose audio representation that provides a strong basis for learning in a wide variety of tasks and scenarios.

Open-Ended Question Answering

Predicting emotion from music videos: exploring the relative contribution of visual and auditory information to affective responses

1 code implementation • 19 Feb 2022 • Phoebe Chua, Dimos Makris, Dorien Herremans, Gemma Roig, Kat Agres

In this paper we present MusicVideos (MuVi), a novel dataset for affective multimedia content analysis to study how the auditory and visual modalities contribute to the perceived emotion of media.

Descriptive Feature Importance +2

MusIAC: An extensible generative framework for Music Infilling Applications with multi-level Control

no code implementations • 11 Feb 2022 • Rui Guo, Ivor Simpson, Chris Kiefer, Thor Magnusson, Dorien Herremans

We present a novel music generation framework for music infilling, with a user friendly interface.

Music Generation

Conditional Drums Generation using Compound Word Representations

1 code implementation • 9 Feb 2022 • Dimos Makris, Guo Zixun, Maximos Kaliakatsos-Papakostas, Dorien Herremans

The field of automatic music composition has seen great progress in recent years, specifically with the invention of transformer-based architectures.

AttendAffectNet–Emotion Prediction of Movie Viewers Using Multimodal Fusion with Self-Attention

1 code implementation • Sensors 2021 • Ha Thi Phuong Thao, B T Balamurali, Gemma Roig, Dorien Herremans

The models that use all visual, audio, and text features simultaneously as their inputs performed better than those using features extracted from each modality separately.

Representation Learning

ReconVAT: A Semi-Supervised Automatic Music Transcription Framework for Low-Resource Real-World Data

no code implementations • 11 Jul 2021 • Kin Wai Cheuk, Dorien Herremans, Li Su

Most of the current supervised automatic music transcription (AMT) models lack the ability to generalize.

Continual Learning Music Transcription

aiSTROM -- A roadmap for developing a successful AI strategy

no code implementations • 25 Jun 2021 • Dorien Herremans

This provides a unique and integrated approach that guides managers and lead developers through the various challenges in the implementation process.

Cultural Vocal Bursts Intensity Prediction

Deep Neural Network Based Respiratory Pathology Classification Using Cough Sounds

no code implementations • 23 Jun 2021 • Balamurali B T, Hwan Ing Hee, Saumitra Kapoor, Oon Hoe Teoh, Sung Shin Teng, Khai Pin Lee, Dorien Herremans, Jer Ming Chen

The resulting trained model when trained for classifying two classes of coughs -- healthy or pathology (in general or belonging to a specific respiratory pathology), reaches accuracy exceeding 84% when classifying cough to the label provided by the physicians' diagnosis.

Classification Sound Classification

Generating Lead Sheets with Affect: A Novel Conditional seq2seq Framework

1 code implementation • 27 Apr 2021 • Dimos Makris, Kat R. Agres, Dorien Herremans

In this paper, we present a novel approach for calculating the valence (the positivity or negativity of the perceived emotion) of a chord progression within a lead sheet, using pre-defined mood tags proposed by music experts.

Machine Translation Music Generation +1

Underwater Acoustic Communication Receiver Using Deep Belief Network

no code implementations • 26 Feb 2021 • Abigail Lee-Leon, Chau Yuen, Dorien Herremans

Our proposed receiver system comprises DBN-based de-noising and classification of the received signal.

AttendAffectNet: Self-Attention based Networks for Predicting Affective Responses from Movies

1 code implementation • 21 Oct 2020 • Ha Thi Phuong Thao, Balamurali B. T., Dorien Herremans, Gemma Roig

In this work, we propose different variants of the self-attention based network for emotion prediction from movies, which we call AttendAffectNet.

Relation

The Effect of Spectrogram Reconstruction on Automatic Music Transcription: An Alternative Approach to Improve Transcription Accuracy

2 code implementations • 20 Oct 2020 • Kin Wai Cheuk, Yin-Jyun Luo, Emmanouil Benetos, Dorien Herremans

We attempt to use only the pitch labels (together with spectrogram reconstruction loss) and explore how far this model can go without introducing supervised sub-tasks.

Music Transcription

Hit Song Prediction Based on Early Adopter Data and Audio Features

no code implementations • 16 Oct 2020 • Dorien Herremans, Tom Bergmans

Billions of USD are invested in new artists and songs by the music industry every year.

A variational autoencoder for music generation controlled by tonal tension

1 code implementation • 13 Oct 2020 • Rui Guo, Ivor Simpson, Thor Magnusson, Chris Kiefer, Dorien Herremans

Many of the music generation systems based on neural networks are fully autonomous and do not offer control over the generation process.

Sound Symbolic Computation Audio and Speech Processing

A dataset and classification model for Malay, Hindi, Tamil and Chinese music

no code implementations • 9 Sep 2020 • Fajilatun Nahar, Kat Agres, Balamurali BT, Dorien Herremans

We use this new dataset to train different classification models to distinguish the origin of the music in terms of these ethnic groups.

Classification General Classification

Music FaderNets: Controllable Music Generation Based On High-Level Features via Low-Level Feature Modelling

1 code implementation • 29 Jul 2020 • Hao Hao Tan, Dorien Herremans

Using arousal as an example of a high-level feature, we show that the "faders" of our model are disentangled and change linearly w. r. t.

Clustering Disentanglement +2

PerceptionGAN: Real-world Image Construction from Provided Text through Perceptual Understanding

no code implementations • 2 Jul 2020 • Kanish Garg, Ajeet Kumar Singh, Dorien Herremans, Brejesh Lall

This initial image is then improved by conditioning on the text.

Descriptive Image Generation

Acoustic prediction of flowrate: varying liquid jet stream onto a free surface

no code implementations • 16 Jun 2020 • Balamurali B. T, Edwin Jonathan Aslim, Yun Shu Lynn Ng, Tricia Li, Chuen Kuo, Jacob Shihang Chen, Dorien Herremans, Lay Guat Ng, Jer-Ming Chen

Information on liquid jet stream flow is crucial in many real world applications.

Generative Modelling for Controllable Audio Synthesis of Expressive Piano Performance

1 code implementation • 16 Jun 2020 • Hao Hao Tan, Yin-Jyun Luo, Dorien Herremans

We present a controllable neural audio synthesizer based on Gaussian Mixture Variational Autoencoders (GM-VAE), which can generate realistic piano performances in the audio domain that closely follows temporal conditions of two essential style features for piano performances: articulation and dynamics.

Audio Synthesis

The impact of Audio input representations on neural network based music transcription

1 code implementation • 25 Jan 2020 • Kin Wai Cheuk, Kat Agres, Dorien Herremans

This paper thoroughly analyses the effect of different input representations on polyphonic multi-instrument music transcription.

Sound Audio and Speech Processing

nnAudio: An on-the-fly GPU Audio to Spectrogram Conversion Toolbox Using 1D Convolution Neural Networks

1 code implementation • 27 Dec 2019 • Kin Wai Cheuk, Hans Anderson, Kat Agres, Dorien Herremans

First, it takes a lot of hard disk space to store different frequency domain representations.

950

Singing Voice Conversion with Disentangled Representations of Singer and Vocal Technique Using Variational Autoencoders

no code implementations • 3 Dec 2019 • Yin-Jyun Luo, Chin-Chen Hsu, Kat Agres, Dorien Herremans

We propose a flexible framework that deals with both singer conversion and singers' vocal technique conversion.

Voice Conversion

Midi Miner -- A Python library for tonal tension and track classification

1 code implementation • 3 Oct 2019 • Rui Guo, Dorien Herremans, Thor Magnusson

We present a Python library, called Midi Miner, that can calculate tonal tension and classify different tracks.

General Classification Music Generation

Latent space representation for multi-target speaker detection and identification with a sparse dataset using Triplet neural networks

1 code implementation • 1 Oct 2019 • Kin Wai Cheuk, Balamurali B. T., Gemma Roig, Dorien Herremans

When reducing the training data to only using the train set, our method results in 309 confusions for the Multi-target speaker identification task, which is 46% better than the baseline model.

Speaker Identification Speaker Recognition

Multimodal Deep Models for Predicting Affective Responses Evoked by Movies

1 code implementation • 16 Sep 2019 • Ha Thi Phuong Thao, Dorien Herremans, Gemma Roig

Interestingly, we also observe that the optical flow is more informative than the RGB in videos, and overall, models using audio features are more accurate than those based on video features when making the final prediction of evoked emotions.

Optical Flow Estimation

Doppler Invariant Demodulation for Shallow Water Acoustic Communications Using Deep Belief Networks

no code implementations • 5 Sep 2019 • Abigail Lee-Leon, Chau Yuen, Dorien Herremans

The proposed method comprises an ML-based feature extraction method and a classification technique.

General Classification

Learning Disentangled Representations of Timbre and Pitch for Musical Instrument Sounds Using Gaussian Mixture Variational Autoencoders

no code implementations • 19 Jun 2019 • Yin-Jyun Luo, Kat Agres, Dorien Herremans

Specifically, we use two separate encoders to learn distinct latent spaces for timbre and pitch, which form Gaussian mixture components representing instrument identity and pitch, respectively.

Towards robust audio spoofing detection: a detailed comparison of traditional and learned features

no code implementations • 28 May 2019 • Balamurali BT, Kin Wah Edward Lin, Simon Lui, Jer-Ming Chen, Dorien Herremans

Finally, we evaluate the performance of our robust replay speaker detection system with a wide variety and different combinations of both extracted and machine learned audio features on the 'out in the wild' ASVspoof 2017 dataset.

Speaker Verification

Dance Hit Song Prediction

no code implementations • 17 May 2019 • Dorien Herremans, David Martens, Kenneth Sörensen

Record companies invest billions of dollars in new talent around the globe each year.

General Classification Position

MorpheuS: generating structured music with constrained patterns and tension

1 code implementation • 12 Dec 2018 • Dorien Herremans, Elaine Chew

MorpheuS' novel framework has the ability to generate polyphonic pieces with a given tension profile and long- and short-term repeated pattern structures.

Sound Audio and Speech Processing

A Functional Taxonomy of Music Generation Systems

no code implementations • 11 Dec 2018 • Dorien Herremans, Ching-Hua Chuan, Elaine Chew

Digital advances have transformed the face of automatic music generation since its beginnings at the dawn of computing.

Music Generation

Singing Voice Separation Using a Deep Convolutional Neural Network Trained by Ideal Binary Mask and Cross Entropy

2 code implementations • 4 Dec 2018 • Kin Wah Edward Lin, Balamurali B. T., Enyan Koh, Simon Lui, Dorien Herremans

We present a unique neural network approach inspired by a technique that has revolutionized the field of vision: pixel-wise image classification, which we combine with cross entropy loss and pretraining of the CNN as an autoencoder on singing voice spectrograms.

Data Augmentation General Classification +4

From Context to Concept: Exploring Semantic Relationships in Music with Word2Vec

no code implementations • 29 Nov 2018 • Ching-Hua Chuan, Kat Agres, Dorien Herremans

In this newly learned vector space, a metric based on cosine distance is able to distinguish between functional chord relationships, as well as harmonic associations in the music.

Music Generation

Modeling Musical Context with Word2vec

no code implementations • 28 Jun 2017 • Dorien Herremans, Ching-Hua Chuan

A visualization of the reduced vector space using t-distributed stochastic neighbor embedding shows that the resulting embedded vector space captures tonal relationships, even without any explicit information about the musical contents of the slices.

Proceedings of the First International Workshop on Deep Learning and Music

no code implementations • 27 Jun 2017 • Dorien Herremans, Ching-Hua Chuan

Proceedings of the First International Workshop on Deep Learning and Music, joint with IJCNN, Anchorage, US, May 17-18, 2017
