3 code implementations • 25 Apr 2018 • Hao-Wen Dong, Yi-Hsuan Yang
Experimental results show that using binary neurons instead of HT or BS indeed leads to better results in a number of objective measures.
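A minimal sketch of the two test-time binarization baselines that binary neurons are compared against, assuming HT and BS denote hard thresholding and Bernoulli sampling of the generator's real-valued predictions (binary neurons instead perform the binarization inside the network):

```python
import random

def hard_threshold(p, tau=0.5):
    """HT: deterministic binarization of a real-valued prediction."""
    return 1 if p >= tau else 0

def bernoulli_sample(p, rng=random):
    """BS: stochastic binarization, treating p as a probability."""
    return 1 if rng.random() < p else 0

probs = [0.1, 0.6, 0.9, 0.4]
binarized = [hard_threshold(p) for p in probs]  # → [0, 1, 1, 0]
```

HT is deterministic; BS treats each prediction as an independent coin flip, so repeated runs can produce different piano-rolls from the same real-valued output.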
8 code implementations • 19 Sep 2017 • Hao-Wen Dong, Wen-Yi Hsiao, Li-Chia Yang, Yi-Hsuan Yang
The three models, which differ in the underlying assumptions and accordingly the network architectures, are referred to as the jamming model, the composer model and the hybrid model.
7 code implementations • 1 Feb 2020 • Yu-Siang Huang, Yi-Hsuan Yang
In contrast with this general approach, this paper shows that Transformers can do even better for music modeling when we improve the way a musical score is converted into the data fed to a Transformer model.
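The "way a musical score is converted" refers to the event-token representation fed to the Transformer. A hypothetical sketch of such a conversion (token names and fields here are illustrative, not the paper's exact vocabulary):

```python
def notes_to_tokens(notes, positions_per_bar=16):
    """Convert sorted (bar, position, pitch, duration) notes into event tokens.

    Emits a "Bar" marker at each new bar, then position/pitch/duration
    tokens per note — a beat-aware layout rather than raw time deltas.
    """
    tokens, current_bar = [], None
    for bar, pos, pitch, dur in notes:
        if bar != current_bar:
            tokens.append("Bar")
            current_bar = bar
        tokens.append(f"Position_{pos}/{positions_per_bar}")
        tokens.append(f"Pitch_{pitch}")
        tokens.append(f"Duration_{dur}")
    return tokens

toks = notes_to_tokens([(0, 0, 60, 4), (0, 8, 64, 4), (1, 0, 67, 8)])
```

Making bar and beat positions explicit in the token stream gives the model a metrical grid for free, instead of forcing it to infer the meter from raw timing.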
4 code implementations • 7 Jan 2021 • Wen-Yi Hsiao, Jen-Yu Liu, Yin-Cheng Yeh, Yi-Hsuan Yang
In this paper, we present a conceptually different approach that explicitly takes into account the type of the tokens, such as note types and metric types.
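One way to picture "taking the token types into account": rather than a flat stream where position, pitch, and duration tokens are interleaved and untyped, the tokens describing one musical event are grouped into a typed tuple. A minimal sketch with hypothetical field names:

```python
from typing import NamedTuple

class NoteEvent(NamedTuple):
    """One grouped event; each field can get its own embedding and output head."""
    position: int
    pitch: int
    duration: int

def group_events(tokens):
    """Fold a flat, type-interleaved token stream into typed event tuples."""
    events = []
    for i in range(0, len(tokens), 3):
        pos, pitch, dur = (int(t.split("_")[1]) for t in tokens[i:i + 3])
        events.append(NoteEvent(pos, pitch, dur))
    return events

flat = ["Position_0", "Pitch_60", "Duration_4",
        "Position_8", "Pitch_64", "Duration_2"]
events = group_events(flat)
```

Grouping also shortens the sequence the model must attend over, since one step now covers what several flat tokens did.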
2 code implementations • 17 Feb 2019 • Chih-Ming Chen, Chuan-Ju Wang, Ming-Feng Tsai, Yi-Hsuan Yang
We present collaborative similarity embedding (CSE), a unified framework that exploits comprehensive collaborative relations available in a user-item bipartite graph for representation learning and recommendation.
Ranked #1 on Recommendation Systems on MovieLens-Latest
4 code implementations • 31 Mar 2017 • Li-Chia Yang, Szu-Yu Chou, Yi-Hsuan Yang
We conduct a user study to compare the eight-bar melodies generated by MidiNet and by Google's MelodyRNN models, each time using the same priming melody.
1 code implementation • 10 May 2021 • Shih-Lun Wu, Yi-Hsuan Yang
Transformers and variational autoencoders (VAEs) have been extensively employed for symbolic (e.g., MIDI) domain music generation.
1 code implementation • 12 Jul 2021 • Yi-Hui Chou, I-Chun Chen, Chin-Jui Chang, Joann Ching, Yi-Hsuan Yang
This paper presents an attempt to employ the masked language modeling approach of BERT to pre-train a 12-layer Transformer model over 4,166 pieces of polyphonic piano MIDI files for tackling a number of symbolic-domain discriminative music understanding tasks.
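The masked-language-modeling pre-training objective can be sketched in a few lines: hide a fraction of the tokens and train the model to recover them. The sketch below shows only the masking step, with a made-up token vocabulary (BERT's full recipe also leaves some selected tokens unchanged or swaps in random ones, which is omitted here):

```python
import random

def mask_tokens(tokens, mask_token="[MASK]", ratio=0.15, rng=None):
    """Replace ~ratio of the positions with a mask token (BERT-style)."""
    rng = rng or random.Random(0)  # seeded for reproducibility
    out = list(tokens)
    n = max(1, int(len(out) * ratio))
    for i in rng.sample(range(len(out)), n):
        out[i] = mask_token
    return out

seq = [f"Pitch_{p}" for p in range(60, 80)]  # 20 toy MIDI-pitch tokens
masked = mask_tokens(seq)
```

The model is then trained to predict the original tokens at the masked positions, which needs no labels and is why pre-training can use any pile of MIDI files.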
1 code implementation • 7 Nov 2021 • Yi-Jen Shih, Shih-Lun Wu, Frank Zalkow, Meinard Müller, Yi-Hsuan Yang
To condition the generation process of such a model with a user-specified sequence, a popular approach is to take that conditioning sequence as a priming sequence and ask a Transformer decoder to generate a continuation.
Music Generation · Representation Learning · Sound · Multimedia · Audio and Speech Processing
2 code implementations • 4 Aug 2020 • Shih-Lun Wu, Yi-Hsuan Yang
This paper presents the Jazz Transformer, a generative model that utilizes a neural sequence model called the Transformer-XL for modeling lead sheets of Jazz music.
1 code implementation • 30 Jul 2021 • Pedro Sarmento, Adarsh Kumar, CJ Carr, Zack Zukowski, Mathieu Barthet, Yi-Hsuan Yang
In this work, we present DadaGP, a new symbolic music dataset comprising 26,181 song scores in the GuitarPro format covering 739 musical genres, along with an accompanying tokenized format well-suited for generative sequence models such as the Transformer.
1 code implementation • 11 Nov 2018 • Bryan Wang, Yi-Hsuan Yang
To build such an AI performer, we propose in this paper a deep convolutional model that learns in an end-to-end manner the score-to-audio mapping between a symbolic representation of music called the piano roll and an audio representation of music called the spectrogram.
Sound · Multimedia · Audio and Speech Processing
1 code implementation • 28 Feb 2018 • Yu-Siang Huang, Szu-Yu Chou, Yi-Hsuan Yang
In a previous work, we introduced an attention-based convolutional recurrent neural network that uses music emotion classification as a surrogate task for music highlight extraction, for Pop songs.
1 code implementation • 18 May 2020 • Jen-Yu Liu, Yu-Hua Chen, Yin-Cheng Yeh, Yi-Hsuan Yang
Audio examples, as well as the code for implementing our model, will be publicly available online upon paper publication.
2 code implementations • 30 Jul 2018 • Hao-Min Liu, Yi-Hsuan Yang
A new recurrent convolutional generative model for the task is proposed, along with three new symbolic-domain harmonic features to facilitate learning from unpaired lead sheets and MIDIs.
1 code implementation • 30 Oct 2018 • Tsung-Han Hsieh, Li Su, Yi-Hsuan Yang
Our experiments on both vocal melody extraction and general melody extraction validate the effectiveness of the proposed model.
1 code implementation • 18 May 2021 • Antoine Liutkus, Ondřej Cífka, Shih-Lun Wu, Umut Şimşekli, Yi-Hsuan Yang, Gaël Richard
Recent advances in Transformer models allow for unprecedented sequence lengths, due to linear space and time complexity.
1 code implementation • 16 Feb 2020 • Jayneel Parekh, Preeti Rao, Yi-Hsuan Yang
In this paper our goal is to convert a set of spoken lines into sung ones.
1 code implementation • 25 Jan 2019 • Hao-Wen Dong, Yi-Hsuan Yang
2) How do different combinations of output activation functions and regularization approaches perform empirically against one another?
2 code implementations • 6 Jul 2018 • Cheng-Wei Wu, Jen-Yu Liu, Yi-Hsuan Yang, Jyh-Shing R. Jang
Can we make a famous rap singer like Eminem sing whatever song we like?
1 code implementation • 11 Aug 2021 • Chin-Jui Chang, Chun-Yi Lee, Yi-Hsuan Yang
This paper proposes a new self-attention-based model for music score infilling, i.e., to generate a polyphonic music sequence that fills in the gap between given past and future contexts.
1 code implementation • 24 Aug 2020 • Taejun Kim, Minsuk Choi, Evan Sacks, Yi-Hsuan Yang, Juhan Nam
A DJ mix is a sequence of music tracks concatenated seamlessly, typically rendered for audiences in a live setting by a DJ on stage.
1 code implementation • 4 Dec 2018 • Szu-Yu Chou, Kai-Hsiang Cheng, Jyh-Shing Roger Jang, Yi-Hsuan Yang
In this paper, we introduce a novel attentional similarity module for the problem of few-shot sound recognition.
Sound · Audio and Speech Processing
1 code implementation • 5 Aug 2020 • Bo-Yu Chen, Jordan B. L. Smith, Yi-Hsuan Yang
Music producers who use loops may have access to thousands in loop libraries, but finding ones that are compatible is a time-consuming process; we hope to reduce this burden with automation.
1 code implementation • 12 Oct 2022 • Yueh-Kao Wu, Ching-Yu Chiu, Yi-Hsuan Yang
Instead of generating the drum track directly as waveforms, we use a separate VQ-VAE to encode the mel-spectrogram of a drum track into another set of discrete codes, and train the Transformer to predict the sequence of drum-related discrete codes.
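The "discrete codes" come from the vector-quantization step of the VQ-VAE: each encoder output vector is replaced by the index of its nearest codebook entry. A toy sketch of that lookup (the codebook values are made up):

```python
def quantize(vec, codebook):
    """Return the index of the codebook entry nearest to vec (squared L2)."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda k: dist2(vec, codebook[k]))

# Toy 2-D codebook; a real model learns these entries jointly with the encoder.
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
codes = [quantize(v, codebook) for v in [(0.1, 0.1), (0.9, 0.2), (0.2, 0.8)]]
```

The resulting index sequence is what the Transformer models, turning spectrogram prediction into a discrete sequence-modeling problem.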
1 code implementation • 26 Dec 2019 • Jen-Yu Liu, Yu-Hua Chen, Yin-Cheng Yeh, Yi-Hsuan Yang
Generative models for singing voice have been mostly concerned with the task of "singing voice synthesis," i.e., to produce singing voice waveforms given musical scores and text lyrics.
1 code implementation • 10 Oct 2018 • Hao-Wen Dong, Yi-Hsuan Yang
We propose the BinaryGAN, a novel generative adversarial network (GAN) that uses binary neurons at the output layer of the generator.
1 code implementation • 30 May 2019 • Yun-Ning Hung, I-Tung Chiang, Yi-An Chen, Yi-Hsuan Yang
We investigate disentanglement techniques such as adversarial training to separate latent factors that are related to the musical content (pitch) of different parts of the piece, and that are related to the instrumentation (timbre) of the parts per short-time segment.
Audio and Speech Processing · Sound
1 code implementation • 17 Sep 2022 • Shih-Lun Wu, Yi-Hsuan Yang
Even with strong sequence models like Transformers, generating expressive piano performances with long-range musical structures remains challenging.
1 code implementation • 11 Feb 2019 • Vibert Thio, Hao-Min Liu, Yin-Cheng Yeh, Yi-Hsuan Yang
New machine learning algorithms are being developed to solve problems in different areas, including music.
Human-Computer Interaction · D.2.2; H.5.2; H.5.5
1 code implementation • 26 Aug 2020 • Antonio Ramires, Frederic Font, Dmitry Bogdanov, Jordan B. L. Smith, Yi-Hsuan Yang, Joann Ching, Bo-Yu Chen, Yueh-Kao Wu, Hsu Wei-Han, Xavier Serra
We present the Freesound Loop Dataset (FSLD), a new large-scale dataset of music loops annotated by experts.
Audio and Speech Processing · Sound
2 code implementations • 17 Oct 2021 • Wei-Han Hsu, Bo-Yu Chen, Yi-Hsuan Yang
Along with the evolution of music technology, a large number of styles, or "subgenres," of Electronic Dance Music (EDM) have emerged in recent years.
1 code implementation • 17 Feb 2020 • Tsung-Han Hsieh, Kai-Hsiang Cheng, Zhe-Cheng Fan, Yu-Ching Yang, Yi-Hsuan Yang
A singer identification model may learn to extract non-vocal related features from the instrumental part of the songs, if a singer only sings in certain musical contexts (e.g., genres).
1 code implementation • 16 Jun 2021 • Ching-Yu Chiu, Joann Ching, Wen-Yi Hsiao, Yu-Hua Chen, Alvin Wen-Yu Su, Yi-Hsuan Yang
Due to advances in deep learning, the performance of automatic beat and downbeat tracking in musical audio signals has seen great improvement in recent years.
1 code implementation • 1 Nov 2021 • Joann Ching, Yi-Hsuan Yang
Recent years have witnessed a growing interest in research related to the detection of piano pedals from audio signals in the music information retrieval community.
1 code implementation • 6 Aug 2020 • Ching-Yu Chiu, Wen-Yi Hsiao, Yin-Cheng Yeh, Yi-Hsuan Yang, Alvin Wen-Yu Su
Blind music source separation has been a popular and active subject of research in both the music information retrieval and signal processing communities.
1 code implementation • 5 May 2018 • Jen-Yu Liu, Yi-Hsuan Yang, Shyh-Kang Jeng
Instrument playing is among the most common scenes in music-related videos, which nowadays represent one of the largest sources of online video.
1 code implementation • 16 Jun 2021 • Ching-Yu Chiu, Alvin Wen-Yu Su, Yi-Hsuan Yang
This paper presents a novel system architecture that integrates blind source separation with joint beat and downbeat tracking in musical audio signals.
1 code implementation • 20 Aug 2023 • Ching-Yu Chiu, Meinard Müller, Matthew E. P. Davies, Alvin Wen-Yu Su, Yi-Hsuan Yang
To model the periodicity of beats, state-of-the-art beat tracking systems use "post-processing trackers" (PPTs) that rely on several empirically determined global assumptions for tempo transition, which work well for music with a steady tempo.
1 code implementation • 6 Oct 2022 • Chih-Pin Tan, Alvin W. Y. Su, Yi-Hsuan Yang
This paper proposes a novel Transformer-based model for music score infilling, to generate a music passage that fills in the gap between given past and future contexts.
2 code implementations • 30 Oct 2017 • Lang-Chi Yu, Yi-Hsuan Yang, Yun-Ning Hung, Yi-An Chen
A model for hit song prediction can be used in the pop music industry to identify emerging trends and potential artists or songs before they are marketed to the public.
1 code implementation • 13 Oct 2022 • Ching-Yu Chiu, Meinard Müller, Matthew E. P. Davies, Alvin Wen-Yu Su, Yi-Hsuan Yang
For expressive music, the tempo may change over time, posing challenges to tracking the beats by an automatic model.
1 code implementation • 26 Aug 2019 • Hsiao-Tzu Hung, Chung-Yang Wang, Yi-Hsuan Yang, Hsin-Min Wang
In this paper, we tackle the problem of transfer learning for automatic Jazz generation.
no code implementations • 24 May 2018 • Zhe-Cheng Fan, Tak-Shing T. Chan, Yi-Hsuan Yang, Jyh-Shing R. Jang
Vector-valued neural learning has emerged as a promising direction in deep learning recently.
no code implementations • 9 Jan 2018 • Tak-Shing T. Chan, Yi-Hsuan Yang
Singing voice separation attempts to separate the vocal and instrumental parts of a music recording, which is a fundamental problem in music information retrieval.
no code implementations • 9 Jan 2018 • Tak-Shing T. Chan, Yi-Hsuan Yang
Informed by recent work on tensor singular value decomposition and circulant algebra matrices, this paper presents a new theoretical bridge that unifies the hypercomplex and tensor-based approaches to singular value decomposition and robust principal component analysis.
no code implementations • 9 Jan 2018 • Tak-Shing T. Chan, Yi-Hsuan Yang
Thus, in this letter, we extend principal component pursuit to the complex and quaternionic cases to account for the missing phase information.
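For reference, real-valued principal component pursuit decomposes an observed matrix $M$ into a low-rank part $L$ and a sparse part $S$:

```latex
\min_{L,\,S} \; \|L\|_{*} + \lambda \|S\|_{1}
\quad \text{subject to} \quad L + S = M,
```

where $\|\cdot\|_{*}$ is the nuclear norm and $\|\cdot\|_{1}$ the entrywise $\ell_1$ norm. The extension described above keeps this objective but lets the entries of $M$, $L$, and $S$ be complex or quaternionic, so that phase information is retained in the decomposition.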
no code implementations • 13 Sep 2017 • Yu-Siang Huang, Szu-Yu Chou, Yi-Hsuan Yang
Generating music medleys is about finding an optimal permutation of a given set of music clips.
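As a toy illustration of ordering a set of clips, a greedy nearest-neighbor pass is one simple (generally suboptimal) way to build a permutation from pairwise similarities; the similarity function below is hypothetical, and this is not claimed to be the paper's actual method:

```python
def greedy_medley(clips, sim):
    """Order clips greedily: always append the clip most similar to the last."""
    order = [0]  # arbitrarily start from the first clip
    remaining = set(range(1, len(clips)))
    while remaining:
        last = order[-1]
        nxt = max(remaining, key=lambda j: sim(clips[last], clips[j]))
        order.append(nxt)
        remaining.remove(nxt)
    return order

# Toy "clips" as scalars; similarity = closeness in value (hypothetical).
clips = [0.0, 0.9, 0.1, 0.8]
order = greedy_medley(clips, sim=lambda a, b: -abs(a - b))
```

Finding the globally optimal permutation is a traveling-salesman-style problem, which is why heuristics or learned scoring models are used in practice.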
no code implementations • 5 Apr 2017 • Li-Chia Yang, Szu-Yu Chou, Jen-Yu Liu, Yi-Hsuan Yang, Yi-An Chen
Being able to predict whether a song can be a hit has important applications in the music industry.
no code implementations • 26 Aug 2016 • Jen-Yu Liu, Shyh-Kang Jeng, Yi-Hsuan Yang
Recent years have witnessed an increased interest in the application of persistent homology, a topological tool for data analysis, to machine learning problems.
no code implementations • 24 Jun 2016 • Kai-Chun Hsu, Szu-Yu Chou, Yi-Hsuan Yang, Tai-Shih Chi
The utilization of sequential patterns has boosted performance on several kinds of recommendation tasks.
no code implementations • 1 Nov 2017 • Chih-Ming Chen, Yi-Hsuan Yang, Yi-An Chen, Ming-Feng Tsai
Many existing methods adopt a uniform sampling method to reduce learning complexity, but when the network is non-uniform (i.e., a weighted network), such uniform sampling incurs information loss.
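A weight-proportional alternative to uniform edge sampling can be sketched with Python's standard library (the graph and weights below are made up for illustration):

```python
import random

def sample_edges(edges, weights, k, rng=None):
    """Draw k edges with probability proportional to their weights."""
    rng = rng or random.Random(0)  # seeded for reproducibility
    return rng.choices(edges, weights=weights, k=k)

# Toy user-item graph: ("u1", "i1") carries 5x the weight of the others.
edges = [("u1", "i1"), ("u1", "i2"), ("u2", "i1")]
draws = sample_edges(edges, weights=[5.0, 1.0, 1.0], k=1000)
```

Sampling in proportion to edge weight lets strongly connected pairs appear in training batches more often, which is exactly the signal a uniform sampler discards.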
no code implementations • 5 Jul 2018 • Jen-Yu Liu, Yi-Hsuan Yang
In this work, we propose a denoising Auto-encoder with Recurrent skip Connections (ARC).
no code implementations • 4 Jun 2019 • Jen-Yu Liu, Yi-Hsuan Yang
To reach information at remote locations, we propose to combine dilated convolution with a modified version of gated recurrent units (GRU), called the 'Dilated GRU', to form a block.
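The defining change in a dilated recurrence is that step t reads the state from step t-d instead of t-1, so information skips ahead by d steps per update. A deliberately simplified sketch with a toy update rule (not the actual GRU equations):

```python
def dilated_recurrence(xs, d, update):
    """Run a recurrence where state t depends on state t-d (zero before t=d)."""
    states = [0.0] * len(xs)
    for t, x in enumerate(xs):
        prev = states[t - d] if t >= d else 0.0
        states[t] = update(prev, x)
    return states

# With d=2 and an additive toy update, the sequence splits into two
# interleaved chains: even and odd positions accumulate independently.
out = dilated_recurrence([1, 1, 1, 1, 1, 1], d=2, update=lambda h, x: h + x)
```

Stacking such recurrences with growing dilations gives an exponentially growing receptive field, the same motivation as dilated convolutions.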
no code implementations • 8 Jan 2020 • Yin-Cheng Yeh, Wen-Yi Hsiao, Satoru Fukayama, Tetsuro Kitahara, Benjamin Genchel, Hao-Min Liu, Hao-Wen Dong, Yi-An Chen, Terence Leong, Yi-Hsuan Yang
Several prior works have proposed various methods for the task of automatic melody harmonization, in which a model aims to generate a sequence of chords to serve as the harmonic accompaniment of a given multiple-bar melody sequence.
no code implementations • 20 Feb 2020 • Jianyu Fan, Yi-Hsuan Yang, Kui Dong, Philippe Pasquier
In this study, we examine whether we can analyze and compare Western and Chinese classical music based on soundscape models.
no code implementations • 28 May 2020 • Da-Yi Wu, Yi-Hsuan Yang
Specifically, given a speech input, and optionally the F0 contour of the target singing, the proposed model generates as the output a singing signal with a progressive-growing encoder/decoder architecture and boundary equilibrium GAN loss functions.
no code implementations • 4 Aug 2020 • Yu-Hua Chen, Yu-Hsiang Huang, Wen-Yi Hsiao, Yi-Hsuan Yang
Deep learning algorithms are increasingly developed for learning to compose music in the form of MIDI files.
Sound · Audio and Speech Processing
no code implementations • 8 Oct 2021 • Chien-Feng Liao, Jen-Yu Liu, Yi-Hsuan Yang
In this paper, we propose a novel neural network model called KaraSinger for a less-studied singing voice synthesis (SVS) task named score-free SVS, in which the prosody and melody are spontaneously decided by the machine.
no code implementations • 13 Oct 2021 • Bo-Yu Chen, Wei-Han Hsu, Wei-Hsiang Liao, Marco A. Martínez Ramírez, Yuki Mitsufuji, Yi-Hsuan Yang
A central task of a Disc Jockey (DJ) is to create a mixset of music with seamless transitions between adjacent tracks.
no code implementations • 11 Nov 2021 • Chih-Pin Tan, Chin-Jui Chang, Alvin W. Y. Su, Yi-Hsuan Yang
In this paper, we investigate using the variable-length infilling (VLI) model, which is originally proposed to infill missing segments, to "prolong" existing musical segments at musical boundaries.