Search Results for author: Dorien Herremans

Found 46 papers, 22 papers with code

Natural Language Processing Methods for Symbolic Music Generation and Information Retrieval: a Survey

no code implementations 27 Feb 2024 Dinh-Viet-Toan Le, Louis Bigo, Mikaela Keller, Dorien Herremans

Music has been frequently compared to language, as they share several similarities, including sequential representations of text and music.

Information Retrieval, Music Generation, +2

Mustango: Toward Controllable Text-to-Music Generation

1 code implementation 14 Nov 2023 Jan Melechovsky, Zixun Guo, Deepanway Ghosal, Navonil Majumder, Dorien Herremans, Soujanya Poria

With recent advancements in text-to-audio and text-to-music based on latent diffusion models, the quality of generated content has been reaching new heights.

Data Augmentation, Denoising, +4

Constructing Time-Series Momentum Portfolios with Deep Multi-Task Learning

no code implementations 8 Jun 2023 Joel Ong, Dorien Herremans

The performance of existing TSMOM strategies, however, relies not only on the quality of the momentum signal but also on the efficacy of the volatility estimator.

Multi-Task Learning, Time Series

Jointist: Simultaneous Improvement of Multi-instrument Transcription and Music Source Separation via Joint Training

no code implementations 1 Feb 2023 Kin Wai Cheuk, Keunwoo Choi, Qiuqiang Kong, Bochen Li, Minz Won, Ju-Chiang Wang, Yun-Ning Hung, Dorien Herremans

Jointist consists of an instrument recognition module that conditions the other two modules: a transcription module that outputs instrument-specific piano rolls, and a source separation module that utilizes instrument information and transcription results.

Chord Recognition, Instrument Recognition, +1

SNIPER Training: Variable Sparsity Rate Training For Text-To-Speech

no code implementations 14 Nov 2022 Perry Lam, Huayun Zhang, Nancy F. Chen, Berrak Sisman, Dorien Herremans

Text-to-speech (TTS) models have achieved remarkable naturalness in recent years, yet like most deep neural models, they have more parameters than necessary.

Accented Text-to-Speech Synthesis with a Conditional Variational Autoencoder

1 code implementation 7 Nov 2022 Jan Melechovsky, Ambuj Mehrish, Berrak Sisman, Dorien Herremans

Accent plays a significant role in speech communication, influencing understanding capabilities and also conveying a person's identity.

Speech Synthesis, Text-To-Speech Synthesis

Forecasting Bitcoin volatility spikes from whale transactions and CryptoQuant data using Synthesizer Transformer models

1 code implementation 6 Oct 2022 Dorien Herremans, Kah Wee Low

Our results show that the model outperforms existing state-of-the-art models when forecasting extreme volatility spikes for Bitcoin using CryptoQuant data as well as whale-alert tweets.

Explainable Artificial Intelligence (XAI), Management

PreBit -- A multimodal model with Twitter FinBERT embeddings for extreme price movement prediction of Bitcoin

1 code implementation 30 May 2022 Yanzhao Zou, Dorien Herremans

In our hybrid model, we use sentence-level FinBERT embeddings, pretrained on financial lexicons, so as to capture the full contents of the tweets and feed it to the model in an understandable way.


Predicting emotion from music videos: exploring the relative contribution of visual and auditory information to affective responses

1 code implementation 19 Feb 2022 Phoebe Chua, Dimos Makris, Dorien Herremans, Gemma Roig, Kat Agres

In this paper we present MusicVideos (MuVi), a novel dataset for affective multimedia content analysis to study how the auditory and visual modalities contribute to the perceived emotion of media.

Descriptive, Feature Importance, +2

Conditional Drums Generation using Compound Word Representations

1 code implementation 9 Feb 2022 Dimos Makris, Guo Zixun, Maximos Kaliakatsos-Papakostas, Dorien Herremans

The field of automatic music composition has seen great progress in recent years, specifically with the invention of transformer-based architectures.

AttendAffectNet–Emotion Prediction of Movie Viewers Using Multimodal Fusion with Self-Attention

1 code implementation Sensors 2021 Ha Thi Phuong Thao, B T Balamurali, Gemma Roig, Dorien Herremans

The models that use all visual, audio, and text features simultaneously as their inputs performed better than those using features extracted from each modality separately.

Representation Learning

aiSTROM -- A roadmap for developing a successful AI strategy

no code implementations 25 Jun 2021 Dorien Herremans

This provides a unique and integrated approach that guides managers and lead developers through the various challenges in the implementation process.

Cultural Vocal Bursts Intensity Prediction

Deep Neural Network Based Respiratory Pathology Classification Using Cough Sounds

no code implementations 23 Jun 2021 Balamurali B T, Hwan Ing Hee, Saumitra Kapoor, Oon Hoe Teoh, Sung Shin Teng, Khai Pin Lee, Dorien Herremans, Jer Ming Chen

When trained to classify coughs into two classes, healthy or pathological (in general or belonging to a specific respiratory pathology), the resulting model reaches an accuracy exceeding 84% when classifying coughs against the label provided by the physicians' diagnosis.

Classification, Sound Classification

Generating Lead Sheets with Affect: A Novel Conditional seq2seq Framework

1 code implementation 27 Apr 2021 Dimos Makris, Kat R. Agres, Dorien Herremans

In this paper, we present a novel approach for calculating the valence (the positivity or negativity of the perceived emotion) of a chord progression within a lead sheet, using pre-defined mood tags proposed by music experts.

Machine Translation, Music Generation, +1

Underwater Acoustic Communication Receiver Using Deep Belief Network

no code implementations 26 Feb 2021 Abigail Lee-Leon, Chau Yuen, Dorien Herremans

Our proposed receiver system comprises DBN-based de-noising and classification of the received signal.

AttendAffectNet: Self-Attention based Networks for Predicting Affective Responses from Movies

1 code implementation 21 Oct 2020 Ha Thi Phuong Thao, Balamurali B. T., Dorien Herremans, Gemma Roig

In this work, we propose different variants of the self-attention based network for emotion prediction from movies, which we call AttendAffectNet.


The Effect of Spectrogram Reconstruction on Automatic Music Transcription: An Alternative Approach to Improve Transcription Accuracy

2 code implementations 20 Oct 2020 Kin Wai Cheuk, Yin-Jyun Luo, Emmanouil Benetos, Dorien Herremans

We attempt to use only the pitch labels (together with spectrogram reconstruction loss) and explore how far this model can go without introducing supervised sub-tasks.

Music Transcription

Hit Song Prediction Based on Early Adopter Data and Audio Features

no code implementations 16 Oct 2020 Dorien Herremans, Tom Bergmans

Billions of USD are invested in new artists and songs by the music industry every year.

A variational autoencoder for music generation controlled by tonal tension

1 code implementation 13 Oct 2020 Rui Guo, Ivor Simpson, Thor Magnusson, Chris Kiefer, Dorien Herremans

Many of the music generation systems based on neural networks are fully autonomous and do not offer control over the generation process.

Sound, Symbolic Computation, Audio and Speech Processing

A dataset and classification model for Malay, Hindi, Tamil and Chinese music

no code implementations 9 Sep 2020 Fajilatun Nahar, Kat Agres, Balamurali BT, Dorien Herremans

We use this new dataset to train different classification models to distinguish the origin of the music in terms of these ethnic groups.

Classification, General Classification

Music FaderNets: Controllable Music Generation Based On High-Level Features via Low-Level Feature Modelling

1 code implementation 29 Jul 2020 Hao Hao Tan, Dorien Herremans

Using arousal as an example of a high-level feature, we show that the "faders" of our model are disentangled and change linearly w.r.t.

Clustering, Disentanglement, +2

Generative Modelling for Controllable Audio Synthesis of Expressive Piano Performance

1 code implementation 16 Jun 2020 Hao Hao Tan, Yin-Jyun Luo, Dorien Herremans

We present a controllable neural audio synthesizer based on Gaussian Mixture Variational Autoencoders (GM-VAE), which can generate realistic piano performances in the audio domain that closely follow temporal conditions of two essential style features for piano performances: articulation and dynamics.

Audio Synthesis

The impact of Audio input representations on neural network based music transcription

1 code implementation 25 Jan 2020 Kin Wai Cheuk, Kat Agres, Dorien Herremans

This paper thoroughly analyses the effect of different input representations on polyphonic multi-instrument music transcription.

Sound, Audio and Speech Processing

Midi Miner -- A Python library for tonal tension and track classification

1 code implementation 3 Oct 2019 Rui Guo, Dorien Herremans, Thor Magnusson

We present a Python library, called Midi Miner, that can calculate tonal tension and classify different tracks.

General Classification, Music Generation

Latent space representation for multi-target speaker detection and identification with a sparse dataset using Triplet neural networks

1 code implementation 1 Oct 2019 Kin Wai Cheuk, Balamurali B. T., Gemma Roig, Dorien Herremans

When reducing the training data to only using the train set, our method results in 309 confusions for the Multi-target speaker identification task, which is 46% better than the baseline model.

Speaker Identification, Speaker Recognition

Multimodal Deep Models for Predicting Affective Responses Evoked by Movies

1 code implementation 16 Sep 2019 Ha Thi Phuong Thao, Dorien Herremans, Gemma Roig

Interestingly, we also observe that the optical flow is more informative than the RGB in videos, and overall, models using audio features are more accurate than those based on video features when making the final prediction of evoked emotions.

Optical Flow Estimation

Learning Disentangled Representations of Timbre and Pitch for Musical Instrument Sounds Using Gaussian Mixture Variational Autoencoders

no code implementations 19 Jun 2019 Yin-Jyun Luo, Kat Agres, Dorien Herremans

Specifically, we use two separate encoders to learn distinct latent spaces for timbre and pitch, which form Gaussian mixture components representing instrument identity and pitch, respectively.

Towards robust audio spoofing detection: a detailed comparison of traditional and learned features

no code implementations 28 May 2019 Balamurali BT, Kin Wah Edward Lin, Simon Lui, Jer-Ming Chen, Dorien Herremans

Finally, we evaluate the performance of our robust replay speaker detection system with a wide variety and different combinations of both extracted and machine learned audio features on the 'out in the wild' ASVspoof 2017 dataset.

Speaker Verification

Dance Hit Song Prediction

no code implementations 17 May 2019 Dorien Herremans, David Martens, Kenneth Sörensen

Record companies invest billions of dollars in new talent around the globe each year.

General Classification, Position, +1

MorpheuS: generating structured music with constrained patterns and tension

1 code implementation 12 Dec 2018 Dorien Herremans, Elaine Chew

MorpheuS' novel framework has the ability to generate polyphonic pieces with a given tension profile and long- and short-term repeated pattern structures.

Sound, Audio and Speech Processing

A Functional Taxonomy of Music Generation Systems

no code implementations 11 Dec 2018 Dorien Herremans, Ching-Hua Chuan, Elaine Chew

Digital advances have transformed the face of automatic music generation since its beginnings at the dawn of computing.

Music Generation

Singing Voice Separation Using a Deep Convolutional Neural Network Trained by Ideal Binary Mask and Cross Entropy

2 code implementations 4 Dec 2018 Kin Wah Edward Lin, Balamurali B. T., Enyan Koh, Simon Lui, Dorien Herremans

We present a unique neural network approach inspired by a technique that has revolutionized the field of vision: pixel-wise image classification, which we combine with cross entropy loss and pretraining of the CNN as an autoencoder on singing voice spectrograms.

Data Augmentation, General Classification, +4

From Context to Concept: Exploring Semantic Relationships in Music with Word2Vec

no code implementations 29 Nov 2018 Ching-Hua Chuan, Kat Agres, Dorien Herremans

In this newly learned vector space, a metric based on cosine distance is able to distinguish between functional chord relationships, as well as harmonic associations in the music.

Music Generation

Modeling Musical Context with Word2vec

no code implementations 28 Jun 2017 Dorien Herremans, Ching-Hua Chuan

A visualization of the reduced vector space using t-distributed stochastic neighbor embedding shows that the resulting embedded vector space captures tonal relationships, even without any explicit information about the musical contents of the slices.

Proceedings of the First International Workshop on Deep Learning and Music

no code implementations 27 Jun 2017 Dorien Herremans, Ching-Hua Chuan

Proceedings of the First International Workshop on Deep Learning and Music, joint with IJCNN, Anchorage, US, May 17-18, 2017
