Search Results for author: Yi-Hsuan Yang

Found 61 papers, 43 papers with code

Local Periodicity-Based Beat Tracking for Expressive Classical Piano Music

1 code implementation • 20 Aug 2023 • Ching-Yu Chiu, Meinard Müller, Matthew E. P. Davies, Alvin Wen-Yu Su, Yi-Hsuan Yang

To model the periodicity of beats, state-of-the-art beat tracking systems use "post-processing trackers" (PPTs) that rely on several empirically determined global assumptions for tempo transition, which work well for music with a steady tempo.

An Analysis Method for Metric-Level Switching in Beat Tracking

1 code implementation • 13 Oct 2022 • Ching-Yu Chiu, Meinard Müller, Matthew E. P. Davies, Alvin Wen-Yu Su, Yi-Hsuan Yang

For expressive music, the tempo may change over time, posing challenges to tracking the beats by an automatic model.

JukeDrummer: Conditional Beat-aware Audio-domain Drum Accompaniment Generation via Transformer VQ-VAE

1 code implementation • 12 Oct 2022 • Yueh-Kao Wu, Ching-Yu Chiu, Yi-Hsuan Yang

Instead of generating the drum track directly as waveforms, we use a separate VQ-VAE to encode the mel-spectrogram of a drum track into another set of discrete codes, and train the Transformer to predict the sequence of drum-related discrete codes.
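The discrete-code step described here, where a VQ-VAE maps each mel-spectrogram frame to its nearest codebook entry, can be sketched in a few lines of numpy. The function name, codebook size, and toy data below are illustrative, not the paper's implementation:

```python
import numpy as np

def vq_quantize(latents, codebook):
    """Map each latent frame to the index of its nearest codebook vector.

    latents:  (T, D) encoder outputs, one row per spectrogram frame.
    codebook: (K, D) learned code vectors.
    Returns (codes, quantized): discrete token indices and their code vectors.
    """
    # Squared Euclidean distance between every frame and every code vector.
    dists = ((latents[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    codes = dists.argmin(axis=1)   # (T,) sequence of discrete tokens
    return codes, codebook[codes]  # token sequence + reconstruction targets

rng = np.random.default_rng(0)
codebook = rng.normal(size=(8, 4))  # 8 codes, 4-dim latents (toy scale)
latents = codebook[[2, 5, 5, 0]] + 0.01 * rng.normal(size=(4, 4))
codes, _ = vq_quantize(latents, codebook)
print(codes.tolist())  # recovers the indices used to build the toy latents
```

The resulting token sequence is the kind of discrete representation a Transformer can then be trained to predict autoregressively.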


Melody Infilling with User-Provided Structural Context

1 code implementation • 6 Oct 2022 • Chih-Pin Tan, Alvin W. Y. Su, Yi-Hsuan Yang

This paper proposes a novel Transformer-based model for music score infilling, to generate a music passage that fills in the gap between given past and future contexts.

Compose & Embellish: Well-Structured Piano Performance Generation via A Two-Stage Approach

1 code implementation • 17 Sep 2022 • Shih-Lun Wu, Yi-Hsuan Yang

Even with strong sequence models like Transformers, generating expressive piano performances with long-range musical structures remains challenging.

Music Score Expansion with Variable-Length Infilling

no code implementations • 11 Nov 2021 • Chih-Pin Tan, Chin-Jui Chang, Alvin W. Y. Su, Yi-Hsuan Yang

In this paper, we investigate using the variable-length infilling (VLI) model, which is originally proposed to infill missing segments, to "prolong" existing musical segments at musical boundaries.

Theme Transformer: Symbolic Music Generation with Theme-Conditioned Transformer

1 code implementation • 7 Nov 2021 • Yi-Jen Shih, Shih-Lun Wu, Frank Zalkow, Meinard Müller, Yi-Hsuan Yang

To condition the generation process of such a model with a user-specified sequence, a popular approach is to take that conditioning sequence as a priming sequence and ask a Transformer decoder to generate a continuation.
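The priming approach described above can be sketched with a toy stand-in for the trained decoder: the conditioning sequence seeds the token buffer, and the model's next-token function is applied autoregressively. Here `continue_from_prime` and the look-back-three rule are purely illustrative, not the paper's model:

```python
def continue_from_prime(next_token_fn, prime, n_new):
    """Autoregressive continuation: the conditioning sequence is fed in as a
    prime, then the model's next-token prediction is appended repeatedly."""
    seq = list(prime)
    for _ in range(n_new):
        seq.append(next_token_fn(seq))
    return seq

# Toy stand-in for a trained decoder: echo the token three positions back
# (a hypothetical rule, chosen only so the continuation repeats the prime).
next_tok = lambda seq: seq[-3]
print(continue_from_prime(next_tok, [60, 64, 67], 3))  # [60, 64, 67, 60, 64, 67]
```

The paper's point is that such priming gives only weak control; its theme-conditioning mechanism is a different design.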

Music Generation, Representation Learning, Sound, Multimedia, Audio and Speech Processing

Learning To Generate Piano Music With Sustain Pedals

1 code implementation • 1 Nov 2021 • Joann Ching, Yi-Hsuan Yang

Recent years have witnessed a growing interest in research related to the detection of piano pedals from audio signals in the music information retrieval community.

Decoder, Information Retrieval +2

Deep Learning Based EDM Subgenre Classification using Mel-Spectrogram and Tempogram Features

2 code implementations • 17 Oct 2021 • Wei-Han Hsu, Bo-Yu Chen, Yi-Hsuan Yang

Along with the evolution of music technology, a large number of styles, or "subgenres," of Electronic Dance Music (EDM) have emerged in recent years.

Classification, Genre classification +2

KaraSinger: Score-Free Singing Voice Synthesis with VQ-VAE using Mel-spectrograms

no code implementations • 8 Oct 2021 • Chien-Feng Liao, Jen-Yu Liu, Yi-Hsuan Yang

In this paper, we propose a novel neural network model called KaraSinger for a less-studied singing voice synthesis (SVS) task named score-free SVS, in which the prosody and melody are spontaneously decided by machine.

Language Modelling, Singing Voice Synthesis

Variable-Length Music Score Infilling via XLNet and Musically Specialized Positional Encoding

1 code implementation • 11 Aug 2021 • Chin-Jui Chang, Chun-Yi Lee, Yi-Hsuan Yang

This paper proposes a new self-attention based model for music score infilling, i.e., to generate a polyphonic music sequence that fills in the gap between given past and future contexts.

DadaGP: A Dataset of Tokenized GuitarPro Songs for Sequence Models

1 code implementation • 30 Jul 2021 • Pedro Sarmento, Adarsh Kumar, CJ Carr, Zack Zukowski, Mathieu Barthet, Yi-Hsuan Yang

In this work, we present DadaGP, a new symbolic music dataset comprising 26,181 song scores in the GuitarPro format covering 739 musical genres, along with an accompanying tokenized format well-suited for generative sequence models such as the Transformer.

Decoder, Genre classification +3

BERT-like Pre-training for Symbolic Piano Music Classification Tasks

1 code implementation • 12 Jul 2021 • Yi-Hui Chou, I-Chun Chen, Chin-Jui Chang, Joann Ching, Yi-Hsuan Yang

This article presents a benchmark study of symbolic piano music classification using the masked language modelling approach of the Bidirectional Encoder Representations from Transformers (BERT).

Classification, Emotion Classification +3

Source Separation-based Data Augmentation for Improved Joint Beat and Downbeat Tracking

1 code implementation • 16 Jun 2021 • Ching-Yu Chiu, Joann Ching, Wen-Yi Hsiao, Yu-Hua Chen, Alvin Wen-Yu Su, Yi-Hsuan Yang

Due to advances in deep learning, the performance of automatic beat and downbeat tracking in musical audio signals has seen great improvement in recent years.

Data Augmentation

Drum-Aware Ensemble Architecture for Improved Joint Musical Beat and Downbeat Tracking

1 code implementation • 16 Jun 2021 • Ching-Yu Chiu, Alvin Wen-Yu Su, Yi-Hsuan Yang

This paper presents a novel system architecture that integrates blind source separation with joint beat and downbeat tracking in musical audio signals.

blind source separation

MuseMorphose: Full-Song and Fine-Grained Piano Music Style Transfer with One Transformer VAE

1 code implementation • 10 May 2021 • Shih-Lun Wu, Yi-Hsuan Yang

Transformers and variational autoencoders (VAEs) have been extensively employed for symbolic (e.g., MIDI) domain music generation.

Decoder, Music Generation +2

Compound Word Transformer: Learning to Compose Full-Song Music over Dynamic Directed Hypergraphs

4 code implementations • 7 Jan 2021 • Wen-Yi Hsiao, Jen-Yu Liu, Yin-Cheng Yeh, Yi-Hsuan Yang

In this paper, we present a conceptually different approach that explicitly takes into account the type of the tokens, such as note types and metric types.

Music Generation

The Freesound Loop Dataset and Annotation Tool

1 code implementation • 26 Aug 2020 • Antonio Ramires, Frederic Font, Dmitry Bogdanov, Jordan B. L. Smith, Yi-Hsuan Yang, Joann Ching, Bo-Yu Chen, Yueh-Kao Wu, Hsu Wei-Han, Xavier Serra

We present the Freesound Loop Dataset (FSLD), a new large-scale dataset of music loops annotated by experts.

Audio and Speech Processing, Sound

A Computational Analysis of Real-World DJ Mixes using Mix-To-Track Subsequence Alignment

1 code implementation • 24 Aug 2020 • Taejun Kim, Minsuk Choi, Evan Sacks, Yi-Hsuan Yang, Juhan Nam

A DJ mix is a sequence of music tracks concatenated seamlessly, typically rendered for audiences in a live setting by a DJ on stage.

Mixing-Specific Data Augmentation Techniques for Improved Blind Violin/Piano Source Separation

1 code implementation • 6 Aug 2020 • Ching-Yu Chiu, Wen-Yi Hsiao, Yin-Cheng Yeh, Yi-Hsuan Yang, Alvin Wen-Yu Su

Blind music source separation has been a popular and active subject of research in both the music information retrieval and signal processing communities.

Data Augmentation, Information Retrieval +3

Neural Loop Combiner: Neural Network Models for Assessing the Compatibility of Loops

1 code implementation • 5 Aug 2020 • Bo-Yu Chen, Jordan B. L. Smith, Yi-Hsuan Yang

Music producers who use loops may have access to thousands in loop libraries, but finding ones that are compatible is a time-consuming process; we hope to reduce this burden with automation.

The Jazz Transformer on the Front Line: Exploring the Shortcomings of AI-composed Music through Quantitative Measures

2 code implementations • 4 Aug 2020 • Shih-Lun Wu, Yi-Hsuan Yang

This paper presents the Jazz Transformer, a generative model that utilizes a neural sequence model called the Transformer-XL for modeling lead sheets of Jazz music.

Automatic Composition of Guitar Tabs by Transformers and Groove Modeling

no code implementations • 4 Aug 2020 • Yu-Hua Chen, Yu-Hsiang Huang, Wen-Yi Hsiao, Yi-Hsuan Yang

Deep learning algorithms are increasingly developed for learning to compose music in the form of MIDI files.

Sound, Audio and Speech Processing

Speech-to-Singing Conversion based on Boundary Equilibrium GAN

no code implementations • 28 May 2020 • Da-Yi Wu, Yi-Hsuan Yang

Specifically, given a speech input, and optionally the F0 contour of the target singing, the proposed model generates as the output a singing signal with a progressive-growing encoder/decoder architecture and boundary equilibrium GAN loss functions.
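The boundary-equilibrium GAN (BEGAN) loss mentioned above balances the discriminator's autoencoder reconstruction errors on real and generated signals with a slowly updated equilibrium variable. A minimal numpy sketch of that bookkeeping follows; the function name and the γ/λ defaults are illustrative, not the paper's settings:

```python
import numpy as np

def began_step(loss_real, loss_fake, k, gamma=0.5, lam=1e-3):
    """One BEGAN bookkeeping step. loss_real / loss_fake are the
    discriminator-autoencoder's reconstruction errors on real and
    generated signals; k is the slowly updated equilibrium variable."""
    d_loss = loss_real - k * loss_fake  # discriminator objective
    g_loss = loss_fake                  # generator objective
    # Move k so that loss_fake tracks gamma * loss_real over time.
    k = float(np.clip(k + lam * (gamma * loss_real - loss_fake), 0.0, 1.0))
    # Standard BEGAN convergence measure for monitoring training.
    convergence = loss_real + abs(gamma * loss_real - loss_fake)
    return d_loss, g_loss, k, convergence

print(began_step(0.8, 0.5, 0.0))
```

In practice the two losses come from forward passes of the discriminator autoencoder, and k is carried across training steps.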

Decoder, Generative Adversarial Network +1

Unconditional Audio Generation with Generative Adversarial Networks and Cycle Regularization

1 code implementation • 18 May 2020 • Jen-Yu Liu, Yu-Hua Chen, Yin-Cheng Yeh, Yi-Hsuan Yang

Audio examples, as well as the code for implementing our model, will be publicly available online upon paper publication.

Audio Generation, Generative Adversarial Network

A Comparative Study of Western and Chinese Classical Music based on Soundscape Models

no code implementations • 20 Feb 2020 • Jianyu Fan, Yi-Hsuan Yang, Kui Dong, Philippe Pasquier

In this study, we examine whether we can analyze and compare Western and Chinese classical music based on soundscape models.

Emotion Recognition, Event Detection +3

Addressing the confounds of accompaniments in singer identification

1 code implementation • 17 Feb 2020 • Tsung-Han Hsieh, Kai-Hsiang Cheng, Zhe-Cheng Fan, Yu-Ching Yang, Yi-Hsuan Yang

A singer identification model may learn to extract non-vocal related features from the instrumental part of the songs, if a singer only sings in certain musical contexts (e.g., genres).

Data Augmentation, Singer Identification

Pop Music Transformer: Beat-based Modeling and Generation of Expressive Pop Piano Compositions

7 code implementations • 1 Feb 2020 • Yu-Siang Huang, Yi-Hsuan Yang

In contrast with this general approach, this paper shows that Transformers can do even better for music modeling, when we improve the way a musical score is converted into the data fed to a Transformer model.

Music Modeling

Automatic Melody Harmonization with Triad Chords: A Comparative Study

no code implementations • 8 Jan 2020 • Yin-Cheng Yeh, Wen-Yi Hsiao, Satoru Fukayama, Tetsuro Kitahara, Benjamin Genchel, Hao-Min Liu, Hao-Wen Dong, Yi-An Chen, Terence Leong, Yi-Hsuan Yang

Several prior works have proposed various methods for the task of automatic melody harmonization, in which a model aims to generate a sequence of chords to serve as the harmonic accompaniment of a given multiple-bar melody sequence.
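One of the simplest harmonization baselines, hinted at by the "Template Matching" tag below, scores each candidate triad by its pitch-class overlap with the melody notes in a bar. A toy sketch follows; the triad set and the overlap rule are illustrative, not any model compared in the paper:

```python
# Hypothetical template-matching baseline: score each candidate triad by how
# many of the bar's melody pitch classes it contains, and pick the best.
TRIADS = {"C": {0, 4, 7}, "F": {5, 9, 0}, "G": {7, 11, 2}, "Am": {9, 0, 4}}

def harmonize_bar(melody_midi):
    pcs = {m % 12 for m in melody_midi}  # pitch classes sounded in the bar
    return max(TRIADS, key=lambda name: len(TRIADS[name] & pcs))

print(harmonize_bar([60, 64, 67, 72]))  # C-E-G-C melody -> "C"
```

Real systems score whole chord sequences (e.g., with learned models and transition costs) rather than each bar in isolation.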

Template Matching

Score and Lyrics-Free Singing Voice Generation

1 code implementation • 26 Dec 2019 • Jen-Yu Liu, Yu-Hua Chen, Yin-Cheng Yeh, Yi-Hsuan Yang

Generative models for singing voice have been mostly concerned with the task of "singing voice synthesis," i.e., to produce singing voice waveforms given musical scores and text lyrics.

Audio Generation, Singing Voice Synthesis

Dilated Convolution with Dilated GRU for Music Source Separation

no code implementations • 4 Jun 2019 • Jen-Yu Liu, Yi-Hsuan Yang

To reach information at remote locations, we propose to combine dilated convolution with a modified version of gated recurrent units (GRUs), called the "Dilated GRU," to form a block.
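The dilated-convolution half of such a block can be illustrated with a toy causal 1-D version, where the receptive field grows with the dilation rate at no extra parameter cost. The function and kernel below are illustrative, not the paper's architecture:

```python
import numpy as np

def dilated_conv1d(x, w, dilation):
    """Causal 1-D dilated convolution: y[t] = sum_k w[k] * x[t - k*dilation].
    Larger dilation rates let each output see further back in time, which is
    how such blocks reach information at remote locations in the signal."""
    y = np.zeros(len(x))
    for t in range(len(x)):
        for k in range(len(w)):
            idx = t - k * dilation
            if idx >= 0:  # causal: ignore taps before the signal starts
                y[t] += w[k] * x[idx]
    return y

x = np.arange(8, dtype=float)
print(dilated_conv1d(x, np.array([1.0, 1.0]), dilation=2).tolist())
# [0.0, 1.0, 2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
```

With dilation 2 each output sums the current sample and the one two steps back, matching the printed result.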

Music Source Separation

Musical Composition Style Transfer via Disentangled Timbre Representations

1 code implementation • 30 May 2019 • Yun-Ning Hung, I-Tung Chiang, Yi-An Chen, Yi-Hsuan Yang

We investigate disentanglement techniques, such as adversarial training, to separate latent factors that are related to the musical content (pitch) of different parts of the piece from those related to the instrumentation (timbre) of the parts per short-time segment.

Audio and Speech Processing, Sound

Collaborative Similarity Embedding for Recommender Systems

2 code implementations • 17 Feb 2019 • Chih-Ming Chen, Chuan-Ju Wang, Ming-Feng Tsai, Yi-Hsuan Yang

We present collaborative similarity embedding (CSE), a unified framework that exploits comprehensive collaborative relations available in a user-item bipartite graph for representation learning and recommendation.

Graph Learning, Recommendation Systems +1

A Minimal Template for Interactive Web-based Demonstrations of Musical Machine Learning

1 code implementation • 11 Feb 2019 • Vibert Thio, Hao-Min Liu, Yin-Cheng Yeh, Yi-Hsuan Yang

New machine learning algorithms are being developed to solve problems in different areas, including music.

Human-Computer Interaction (ACM classes D.2.2; H.5.2; H.5.5)

On Output Activation Functions for Adversarial Losses: A Theoretical Analysis via Variational Divergence Minimization and An Empirical Study on MNIST Classification

1 code implementation • 25 Jan 2019 • Hao-Wen Dong, Yi-Hsuan Yang

2) How do different combinations of output activation functions and regularization approaches perform empirically against one another?

Learning to match transient sound events using attentional similarity for few-shot sound recognition

1 code implementation • 4 Dec 2018 • Szu-Yu Chou, Kai-Hsiang Cheng, Jyh-Shing Roger Jang, Yi-Hsuan Yang

In this paper, we introduce a novel attentional similarity module for the problem of few-shot sound recognition.

Sound Audio and Speech Processing

PerformanceNet: Score-to-Audio Music Generation with Multi-Band Convolutional Residual Network

1 code implementation • 11 Nov 2018 • Bryan Wang, Yi-Hsuan Yang

To build such an AI performer, we propose in this paper a deep convolutional model that learns in an end-to-end manner the score-to-audio mapping between a symbolic representation of music, the piano roll, and an audio representation of music, the spectrogram.

Sound, Multimedia, Audio and Speech Processing

A Streamlined Encoder/Decoder Architecture for Melody Extraction

1 code implementation • 30 Oct 2018 • Tsung-Han Hsieh, Li Su, Yi-Hsuan Yang

Our experiments on both vocal melody extraction and general melody extraction validate the effectiveness of the proposed model.

Decoder, Melody Extraction

Training Generative Adversarial Networks with Binary Neurons by End-to-end Backpropagation

1 code implementation • 10 Oct 2018 • Hao-Wen Dong, Yi-Hsuan Yang

We propose the BinaryGAN, a novel generative adversarial network (GAN) that uses binary neurons at the output layer of the generator.
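What a binary output neuron computes in the forward pass can be sketched as deterministic or stochastic binarization of a sigmoid. The sketch below is illustrative (names and shapes are assumptions); the straight-through estimator used in training is only noted in a comment, since plain numpy has no autograd:

```python
import numpy as np

def binary_neuron(logits, stochastic=False, rng=None):
    """Forward pass of a binary output neuron: threshold the sigmoid at 0.5
    (deterministic) or sample a Bernoulli (stochastic). In training, the
    straight-through estimator backpropagates through the hard step as if
    it were the identity -- omitted here, as numpy has no autograd."""
    p = 1.0 / (1.0 + np.exp(-logits))
    if stochastic:
        rng = rng or np.random.default_rng()
        return (rng.random(p.shape) < p).astype(float)
    return (p >= 0.5).astype(float)

print(binary_neuron(np.array([-2.0, 0.1, 3.0])).tolist())  # [0.0, 1.0, 1.0]
```

The hard 0/1 output is what makes end-to-end backpropagation non-trivial and motivates the estimator studied in the paper.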

Generative Adversarial Network

Lead Sheet Generation and Arrangement by Conditional Generative Adversarial Network

2 code implementations • 30 Jul 2018 • Hao-Min Liu, Yi-Hsuan Yang

A new recurrent convolutional generative model for the task is proposed, along with three new symbolic-domain harmonic features to facilitate learning from unpaired lead sheets and MIDIs.

Generative Adversarial Network, Music Generation

Weakly-supervised Visual Instrument-playing Action Detection in Videos

1 code implementation • 5 May 2018 • Jen-Yu Liu, Yi-Hsuan Yang, Shyh-Kang Jeng

Instrument playing is among the most common scenes in music-related videos, which nowadays represent one of the largest sources of online video.

Action Detection

Convolutional Generative Adversarial Networks with Binary Neurons for Polyphonic Music Generation

3 code implementations • 25 Apr 2018 • Hao-Wen Dong, Yi-Hsuan Yang

Experimental results show that using binary neurons instead of hard thresholding (HT) or Bernoulli sampling (BS) indeed leads to better results in a number of objective measures.

Music Generation

Pop Music Highlighter: Marking the Emotion Keypoints

1 code implementation • 28 Feb 2018 • Yu-Siang Huang, Szu-Yu Chou, Yi-Hsuan Yang

In a previous work, we introduced an attention-based convolutional recurrent neural network that uses music emotion classification as a surrogate task for music highlight extraction, for Pop songs.

Emotion Classification

Complex and Quaternionic Principal Component Pursuit and Its Application to Audio Separation

no code implementations • 9 Jan 2018 • Tak-Shing T. Chan, Yi-Hsuan Yang

Thus, in this letter, we extend principal component pursuit to the complex and quaternionic cases to account for the missing phase information.

Informed Group-Sparse Representation for Singing Voice Separation

no code implementations • 9 Jan 2018 • Tak-Shing T. Chan, Yi-Hsuan Yang

Singing voice separation attempts to separate the vocal and instrumental parts of a music recording, which is a fundamental problem in music information retrieval.

Information Retrieval, Music Information Retrieval +1

Polar $n$-Complex and $n$-Bicomplex Singular Value Decomposition and Principal Component Pursuit

no code implementations • 9 Jan 2018 • Tak-Shing T. Chan, Yi-Hsuan Yang

Informed by recent work on tensor singular value decomposition and circulant algebra matrices, this paper presents a new theoretical bridge that unifies the hypercomplex and tensor-based approaches to singular value decomposition and robust principal component analysis.

Vertex-Context Sampling for Weighted Network Embedding

no code implementations • 1 Nov 2017 • Chih-Ming Chen, Yi-Hsuan Yang, Yi-An Chen, Ming-Feng Tsai

Many existing methods adopt a uniform sampling method to reduce learning complexity, but when the network is non-uniform (i.e., a weighted network) such uniform sampling incurs information loss.

Information Retrieval, Multi-Label Classification +3

Hit Song Prediction for Pop Music by Siamese CNN with Ranking Loss

2 code implementations • 30 Oct 2017 • Lang-Chi Yu, Yi-Hsuan Yang, Yun-Ning Hung, Yi-An Chen

A model for hit song prediction can be used in the pop music industry to identify emerging trends and potential artists or songs before they are marketed to the public.
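The ranking loss named in the title can be illustrated with a margin formulation: given scores for a pair of songs whose relative popularity is known, the model is penalized unless the more popular song out-scores the other by a margin. A hedged numpy sketch (the function name and margin are illustrative, not the paper's exact loss):

```python
import numpy as np

def pairwise_ranking_loss(score_hi, score_lo, margin=1.0):
    """Margin ranking loss for a Siamese scorer: the song known to be more
    popular should out-score the other by at least `margin`."""
    return float(np.maximum(0.0, margin - (score_hi - score_lo)))

print(pairwise_ranking_loss(2.5, 0.5))  # 0.0  (gap of 2.0 beats the margin)
print(pairwise_ranking_loss(0.5, 0.3))  # 0.8  (gap of 0.2 falls short)
```

In a Siamese setup the two scores come from the same CNN applied to the two songs' audio features, so the loss shapes a single shared scorer.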


MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic Music Generation and Accompaniment

8 code implementations • 19 Sep 2017 • Hao-Wen Dong, Wen-Yi Hsiao, Li-Chia Yang, Yi-Hsuan Yang

The three models, which differ in the underlying assumptions and accordingly the network architectures, are referred to as the jamming model, the composer model and the hybrid model.

Music Generation

Generating Music Medleys via Playing Music Puzzle Games

no code implementations • 13 Sep 2017 • Yu-Siang Huang, Szu-Yu Chou, Yi-Hsuan Yang

Generating music medleys is about finding an optimal permutation of a given set of music clips.

Self-Supervised Learning

Revisiting the problem of audio-based hit song prediction using convolutional neural networks

no code implementations • 5 Apr 2017 • Li-Chia Yang, Szu-Yu Chou, Jen-Yu Liu, Yi-Hsuan Yang, Yi-An Chen

Being able to predict whether a song can be a hit has important applications in the music industry.

MidiNet: A Convolutional Generative Adversarial Network for Symbolic-domain Music Generation

4 code implementations • 31 Mar 2017 • Li-Chia Yang, Szu-Yu Chou, Yi-Hsuan Yang

We conduct a user study to compare the eight-bar melodies generated by MidiNet and by Google's MelodyRNN models, each time using the same priming melody.

Generative Adversarial Network, Music Generation

Applying Topological Persistence in Convolutional Neural Network for Music Audio Signals

no code implementations • 26 Aug 2016 • Jen-Yu Liu, Shyh-Kang Jeng, Yi-Hsuan Yang

Recent years have witnessed an increased interest in the application of persistent homology, a topological tool for data analysis, to machine learning problems.

Multi-Label Classification, Music Tagging

Neural Network Based Next-Song Recommendation

no code implementations • 24 Jun 2016 • Kai-Chun Hsu, Szu-Yu Chou, Yi-Hsuan Yang, Tai-Shih Chi

The utilization of sequential patterns has boosted performance on several kinds of recommendation tasks.

Recommendation Systems
