3 code implementations • 25 Apr 2018 • Hao-Wen Dong, Yi-Hsuan Yang
Experimental results show that using binary neurons instead of HT or BS indeed leads to better results in a number of objective measures.
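A minimal sketch of the two test-time binarization baselines that binary neurons are compared against, assuming HT and BS denote hard thresholding and Bernoulli sampling of the generator's real-valued predictions (binary neurons instead perform the binarization inside the network):

```python
import random

def hard_threshold(p, tau=0.5):
    """HT: deterministic binarization of a real-valued prediction."""
    return 1 if p >= tau else 0

def bernoulli_sample(p, rng=random):
    """BS: stochastic binarization, treating p as a probability."""
    return 1 if rng.random() < p else 0

probs = [0.1, 0.6, 0.9, 0.4]
binarized = [hard_threshold(p) for p in probs]  # → [0, 1, 1, 0]
```

HT is deterministic; BS treats each prediction as an independent coin flip, so repeated runs can produce different piano-rolls from the same real-valued output.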
8 code implementations • 19 Sep 2017 • Hao-Wen Dong, Wen-Yi Hsiao, Li-Chia Yang, Yi-Hsuan Yang
The three models, which differ in the underlying assumptions and accordingly the network architectures, are referred to as the jamming model, the composer model and the hybrid model.
7 code implementations • 1 Feb 2020 • Yu-Siang Huang, Yi-Hsuan Yang
In contrast with this general approach, this paper shows that Transformers can do even better for music modeling when we improve the way a musical score is converted into the data fed to a Transformer model.
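The "way a musical score is converted" refers to the event-token representation fed to the Transformer. A hypothetical sketch of such a conversion (token names and fields here are illustrative, not the paper's exact vocabulary):

```python
def notes_to_tokens(notes, positions_per_bar=16):
    """Convert sorted (bar, position, pitch, duration) notes into event tokens.

    Emits a "Bar" marker at each new bar, then position/pitch/duration
    tokens per note — a beat-aware layout rather than raw time deltas.
    """
    tokens, current_bar = [], None
    for bar, pos, pitch, dur in notes:
        if bar != current_bar:
            tokens.append("Bar")
            current_bar = bar
        tokens.append(f"Position_{pos}/{positions_per_bar}")
        tokens.append(f"Pitch_{pitch}")
        tokens.append(f"Duration_{dur}")
    return tokens

toks = notes_to_tokens([(0, 0, 60, 4), (0, 8, 64, 4), (1, 0, 67, 8)])
```

Making bar and beat positions explicit in the token stream gives the model a metrical grid for free, instead of forcing it to infer the meter from raw timing.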
4 code implementations • 7 Jan 2021 • Wen-Yi Hsiao, Jen-Yu Liu, Yin-Cheng Yeh, Yi-Hsuan Yang
In this paper, we present a conceptually different approach that explicitly takes into account the type of the tokens, such as note types and metric types.
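One way to picture "taking the token types into account": rather than a flat stream where position, pitch, and duration tokens are interleaved and untyped, the tokens describing one musical event are grouped into a typed tuple. A minimal sketch with hypothetical field names:

```python
from typing import NamedTuple

class NoteEvent(NamedTuple):
    """One grouped event; each field can get its own embedding and output head."""
    position: int
    pitch: int
    duration: int

def group_events(tokens):
    """Fold a flat, type-interleaved token stream into typed event tuples."""
    events = []
    for i in range(0, len(tokens), 3):
        pos, pitch, dur = (int(t.split("_")[1]) for t in tokens[i:i + 3])
        events.append(NoteEvent(pos, pitch, dur))
    return events

flat = ["Position_0", "Pitch_60", "Duration_4",
        "Position_8", "Pitch_64", "Duration_2"]
events = group_events(flat)
```

Grouping also shortens the sequence the model must attend over, since one step now covers what several flat tokens did.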
2 code implementations • 17 Feb 2019 • Chih-Ming Chen, Chuan-Ju Wang, Ming-Feng Tsai, Yi-Hsuan Yang
We present collaborative similarity embedding (CSE), a unified framework that exploits comprehensive collaborative relations available in a user-item bipartite graph for representation learning and recommendation.
Ranked #1 on Recommendation Systems on MovieLens-Latest
4 code implementations • 31 Mar 2017 • Li-Chia Yang, Szu-Yu Chou, Yi-Hsuan Yang
We conduct a user study to compare the eight-bar melodies generated by MidiNet and by Google's MelodyRNN models, each time using the same priming melody.
1 code implementation • 10 May 2021 • Shih-Lun Wu, Yi-Hsuan Yang
Transformers and variational autoencoders (VAEs) have been extensively employed for symbolic (e.g., MIDI) domain music generation.
1 code implementation • 12 Jul 2021 • Yi-Hui Chou, I-Chun Chen, Chin-Jui Chang, Joann Ching, Yi-Hsuan Yang
This paper presents an attempt to employ the masked language modeling approach of BERT to pre-train a 12-layer Transformer model over 4,166 pieces of polyphonic piano MIDI files for tackling a number of symbolic-domain discriminative music understanding tasks.
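The masked-language-modeling pre-training objective can be sketched in a few lines: hide a fraction of the tokens and train the model to recover them. The sketch below shows only the masking step, with a made-up token vocabulary (BERT's full recipe also leaves some selected tokens unchanged or swaps in random ones, which is omitted here):

```python
import random

def mask_tokens(tokens, mask_token="[MASK]", ratio=0.15, rng=None):
    """Replace ~ratio of the positions with a mask token (BERT-style)."""
    rng = rng or random.Random(0)  # seeded for reproducibility
    out = list(tokens)
    n = max(1, int(len(out) * ratio))
    for i in rng.sample(range(len(out)), n):
        out[i] = mask_token
    return out

seq = [f"Pitch_{p}" for p in range(60, 80)]  # 20 toy MIDI-pitch tokens
masked = mask_tokens(seq)
```

The model is then trained to predict the original tokens at the masked positions, which needs no labels and is why pre-training can use any pile of MIDI files.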
1 code implementation • 7 Nov 2021 • Yi-Jen Shih, Shih-Lun Wu, Frank Zalkow, Meinard Müller, Yi-Hsuan Yang
To condition the generation process of such a model with a user-specified sequence, a popular approach is to take that conditioning sequence as a priming sequence and ask a Transformer decoder to generate a continuation.
Music Generation · Representation Learning · Sound · Multimedia · Audio and Speech Processing
2 code implementations • 4 Aug 2020 • Shih-Lun Wu, Yi-Hsuan Yang
This paper presents the Jazz Transformer, a generative model that utilizes a neural sequence model called the Transformer-XL for modeling lead sheets of Jazz music.
1 code implementation • 30 Jul 2021 • Pedro Sarmento, Adarsh Kumar, CJ Carr, Zack Zukowski, Mathieu Barthet, Yi-Hsuan Yang
In this work, we present DadaGP, a new symbolic music dataset comprising 26,181 song scores in the GuitarPro format covering 739 musical genres, along with an accompanying tokenized format well-suited for generative sequence models such as the Transformer.
1 code implementation • 11 Nov 2018 • Bryan Wang, Yi-Hsuan Yang
To build such an AI performer, we propose in this paper a deep convolutional model that learns in an end-to-end manner the score-to-audio mapping between a symbolic representation of music called the piano roll and an audio representation of music called the spectrogram.
Sound · Multimedia · Audio and Speech Processing
1 code implementation • 28 Feb 2018 • Yu-Siang Huang, Szu-Yu Chou, Yi-Hsuan Yang
In a previous work, we introduced an attention-based convolutional recurrent neural network that uses music emotion classification as a surrogate task for music highlight extraction, for Pop songs.
1 code implementation • 18 May 2020 • Jen-Yu Liu, Yu-Hua Chen, Yin-Cheng Yeh, Yi-Hsuan Yang
Audio examples, as well as the code for implementing our model, will be publicly available online upon paper publication.
2 code implementations • 30 Jul 2018 • Hao-Min Liu, Yi-Hsuan Yang
A new recurrent convolutional generative model for the task is proposed, along with three new symbolic-domain harmonic features to facilitate learning from unpaired lead sheets and MIDIs.
1 code implementation • 30 Oct 2018 • Tsung-Han Hsieh, Li Su, Yi-Hsuan Yang
Our experiments on both vocal melody extraction and general melody extraction validate the effectiveness of the proposed model.
1 code implementation • 18 May 2021 • Antoine Liutkus, Ondřej Cífka, Shih-Lun Wu, Umut Şimşekli, Yi-Hsuan Yang, Gaël Richard
Recent advances in Transformer models allow for unprecedented sequence lengths, due to linear space and time complexity.
1 code implementation • 16 Feb 2020 • Jayneel Parekh, Preeti Rao, Yi-Hsuan Yang
In this paper our goal is to convert a set of spoken lines into sung ones.
1 code implementation • 25 Jan 2019 • Hao-Wen Dong, Yi-Hsuan Yang
2) How do different combinations of output activation functions and regularization approaches perform empirically against one another?
2 code implementations • 6 Jul 2018 • Cheng-Wei Wu, Jen-Yu Liu, Yi-Hsuan Yang, Jyh-Shing R. Jang
Can we make a famous rap singer like Eminem sing whatever song we like?
1 code implementation • 11 Aug 2021 • Chin-Jui Chang, Chun-Yi Lee, Yi-Hsuan Yang
This paper proposes a new self-attention-based model for music score infilling, i.e., to generate a polyphonic music sequence that fills in the gap between given past and future contexts.
1 code implementation • 24 Aug 2020 • Taejun Kim, Minsuk Choi, Evan Sacks, Yi-Hsuan Yang, Juhan Nam
A DJ mix is a sequence of music tracks concatenated seamlessly, typically rendered for audiences in a live setting by a DJ on stage.
1 code implementation • 4 Dec 2018 • Szu-Yu Chou, Kai-Hsiang Cheng, Jyh-Shing Roger Jang, Yi-Hsuan Yang
In this paper, we introduce a novel attentional similarity module for the problem of few-shot sound recognition.
Sound · Audio and Speech Processing
1 code implementation • 5 Aug 2020 • Bo-Yu Chen, Jordan B. L. Smith, Yi-Hsuan Yang
Music producers who use loops may have access to thousands in loop libraries, but finding ones that are compatible is a time-consuming process; we hope to reduce this burden with automation.
1 code implementation • 12 Oct 2022 • Yueh-Kao Wu, Ching-Yu Chiu, Yi-Hsuan Yang
Instead of generating the drum track directly as waveforms, we use a separate VQ-VAE to encode the mel-spectrogram of a drum track into another set of discrete codes, and train the Transformer to predict the sequence of drum-related discrete codes.
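The "discrete codes" come from the vector-quantization step of the VQ-VAE: each encoder output vector is replaced by the index of its nearest codebook entry. A toy sketch of that lookup (the codebook values are made up):

```python
def quantize(vec, codebook):
    """Return the index of the codebook entry nearest to vec (squared L2)."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda k: dist2(vec, codebook[k]))

# Toy 2-D codebook; a real model learns these entries jointly with the encoder.
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
codes = [quantize(v, codebook) for v in [(0.1, 0.1), (0.9, 0.2), (0.2, 0.8)]]
```

The resulting index sequence is what the Transformer models, turning spectrogram prediction into a discrete sequence-modeling problem.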
1 code implementation • 26 Dec 2019 • Jen-Yu Liu, Yu-Hua Chen, Yin-Cheng Yeh, Yi-Hsuan Yang
Generative models for singing voice have been mostly concerned with the task of "singing voice synthesis," i.e., to produce singing voice waveforms given musical scores and text lyrics.
1 code implementation • 10 Oct 2018 • Hao-Wen Dong, Yi-Hsuan Yang
We propose the BinaryGAN, a novel generative adversarial network (GAN) that uses binary neurons at the output layer of the generator.
1 code implementation • 30 May 2019 • Yun-Ning Hung, I-Tung Chiang, Yi-An Chen, Yi-Hsuan Yang
We investigate disentanglement techniques such as adversarial training to separate latent factors that are related to the musical content (pitch) of different parts of the piece, and that are related to the instrumentation (timbre) of the parts per short-time segment.
Audio and Speech Processing · Sound
1 code implementation • 17 Sep 2022 • Shih-Lun Wu, Yi-Hsuan Yang
Even with strong sequence models like Transformers, generating expressive piano performances with long-range musical structures remains challenging.
1 code implementation • 11 Feb 2019 • Vibert Thio, Hao-Min Liu, Yin-Cheng Yeh, Yi-Hsuan Yang
New machine learning algorithms are being developed to solve problems in different areas, including music.
Human-Computer Interaction · D.2.2; H.5.2; H.5.5
1 code implementation • 26 Aug 2020 • Antonio Ramires, Frederic Font, Dmitry Bogdanov, Jordan B. L. Smith, Yi-Hsuan Yang, Joann Ching, Bo-Yu Chen, Yueh-Kao Wu, Hsu Wei-Han, Xavier Serra
We present the Freesound Loop Dataset (FSLD), a new large-scale dataset of music loops annotated by experts.
Audio and Speech Processing · Sound
2 code implementations • 17 Oct 2021 • Wei-Han Hsu, Bo-Yu Chen, Yi-Hsuan Yang
Along with the evolution of music technology, a large number of styles, or "subgenres," of Electronic Dance Music (EDM) have emerged in recent years.
1 code implementation • 17 Feb 2020 • Tsung-Han Hsieh, Kai-Hsiang Cheng, Zhe-Cheng Fan, Yu-Ching Yang, Yi-Hsuan Yang
A singer identification model may learn to extract non-vocal related features from the instrumental part of the songs, if a singer only sings in certain musical contexts (e.g., genres).
1 code implementation • 16 Jun 2021 • Ching-Yu Chiu, Joann Ching, Wen-Yi Hsiao, Yu-Hua Chen, Alvin Wen-Yu Su, Yi-Hsuan Yang
Due to advances in deep learning, the performance of automatic beat and downbeat tracking in musical audio signals has seen great improvement in recent years.
1 code implementation • 1 Nov 2021 • Joann Ching, Yi-Hsuan Yang
Recent years have witnessed a growing interest in research related to the detection of piano pedals from audio signals in the music information retrieval community.
1 code implementation • 6 Aug 2020 • Ching-Yu Chiu, Wen-Yi Hsiao, Yin-Cheng Yeh, Yi-Hsuan Yang, Alvin Wen-Yu Su
Blind music source separation has been a popular and active subject of research in both the music information retrieval and signal processing communities.
1 code implementation • 5 May 2018 • Jen-Yu Liu, Yi-Hsuan Yang, Shyh-Kang Jeng
Instrument playing is among the most common scenes in music-related videos, which nowadays represent one of the largest sources of online video.
1 code implementation • 16 Jun 2021 • Ching-Yu Chiu, Alvin Wen-Yu Su, Yi-Hsuan Yang
This paper presents a novel system architecture that integrates blind source separation with joint beat and downbeat tracking in musical audio signals.
1 code implementation • 20 Aug 2023 • Ching-Yu Chiu, Meinard Müller, Matthew E. P. Davies, Alvin Wen-Yu Su, Yi-Hsuan Yang
To model the periodicity of beats, state-of-the-art beat tracking systems use "post-processing trackers" (PPTs) that rely on several empirically determined global assumptions for tempo transition, which work well for music with a steady tempo.
1 code implementation • 6 Oct 2022 • Chih-Pin Tan, Alvin W. Y. Su, Yi-Hsuan Yang
This paper proposes a novel Transformer-based model for music score infilling, to generate a music passage that fills in the gap between given past and future contexts.
2 code implementations • 30 Oct 2017 • Lang-Chi Yu, Yi-Hsuan Yang, Yun-Ning Hung, Yi-An Chen
A model for hit song prediction can be used in the pop music industry to identify emerging trends and potential artists or songs before they are marketed to the public.
1 code implementation • 13 Oct 2022 • Ching-Yu Chiu, Meinard Müller, Matthew E. P. Davies, Alvin Wen-Yu Su, Yi-Hsuan Yang
For expressive music, the tempo may change over time, posing challenges to tracking the beats by an automatic model.
1 code implementation • 26 Aug 2019 • Hsiao-Tzu Hung, Chung-Yang Wang, Yi-Hsuan Yang, Hsin-Min Wang
In this paper, we tackle the problem of transfer learning for automatic Jazz generation.
no code implementations • 24 May 2018 • Zhe-Cheng Fan, Tak-Shing T. Chan, Yi-Hsuan Yang, Jyh-Shing R. Jang
Vector-valued neural learning has emerged as a promising direction in deep learning recently.
no code implementations • 9 Jan 2018 • Tak-Shing T. Chan, Yi-Hsuan Yang
Singing voice separation attempts to separate the vocal and instrumental parts of a music recording, which is a fundamental problem in music information retrieval.
no code implementations • 9 Jan 2018 • Tak-Shing T. Chan, Yi-Hsuan Yang
Informed by recent work on tensor singular value decomposition and circulant algebra matrices, this paper presents a new theoretical bridge that unifies the hypercomplex and tensor-based approaches to singular value decomposition and robust principal component analysis.
no code implementations • 9 Jan 2018 • Tak-Shing T. Chan, Yi-Hsuan Yang
Thus, in this letter, we extend principal component pursuit to the complex and quaternionic cases to account for the missing phase information.
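For reference, real-valued principal component pursuit decomposes an observed matrix $M$ into a low-rank part $L$ and a sparse part $S$:

```latex
\min_{L,\,S} \; \|L\|_{*} + \lambda \|S\|_{1}
\quad \text{subject to} \quad L + S = M,
```

where $\|\cdot\|_{*}$ is the nuclear norm and $\|\cdot\|_{1}$ the entrywise $\ell_1$ norm. The extension described above keeps this objective but lets the entries of $M$, $L$, and $S$ be complex or quaternionic, so that phase information is retained in the decomposition.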
no code implementations • 13 Sep 2017 • Yu-Siang Huang, Szu-Yu Chou, Yi-Hsuan Yang
Generating music medleys is about finding an optimal permutation of a given set of music clips.
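As a toy illustration of ordering a set of clips, a greedy nearest-neighbor pass is one simple (generally suboptimal) way to build a permutation from pairwise similarities; the similarity function below is hypothetical, and this is not claimed to be the paper's actual method:

```python
def greedy_medley(clips, sim):
    """Order clips greedily: always append the clip most similar to the last."""
    order = [0]  # arbitrarily start from the first clip
    remaining = set(range(1, len(clips)))
    while remaining:
        last = order[-1]
        nxt = max(remaining, key=lambda j: sim(clips[last], clips[j]))
        order.append(nxt)
        remaining.remove(nxt)
    return order

# Toy "clips" as scalars; similarity = closeness in value (hypothetical).
clips = [0.0, 0.9, 0.1, 0.8]
order = greedy_medley(clips, sim=lambda a, b: -abs(a - b))
```

Finding the globally optimal permutation is a traveling-salesman-style problem, which is why heuristics or learned scoring models are used in practice.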
no code implementations • 5 Apr 2017 • Li-Chia Yang, Szu-Yu Chou, Jen-Yu Liu, Yi-Hsuan Yang, Yi-An Chen
Being able to predict whether a song can be a hit has important applications in the music industry.
no code implementations • 26 Aug 2016 • Jen-Yu Liu, Shyh-Kang Jeng, Yi-Hsuan Yang
Recent years have witnessed an increased interest in the application of persistent homology, a topological tool for data analysis, to machine learning problems.
no code implementations • 24 Jun 2016 • Kai-Chun Hsu, Szu-Yu Chou, Yi-Hsuan Yang, Tai-Shih Chi
The utilization of sequential patterns has boosted performance on several kinds of recommendation tasks.
no code implementations • 1 Nov 2017 • Chih-Ming Chen, Yi-Hsuan Yang, Yi-An Chen, Ming-Feng Tsai
Many existing methods adopt a uniform sampling method to reduce learning complexity, but when the network is non-uniform (i.e., a weighted network), such uniform sampling incurs information loss.
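A weight-proportional alternative to uniform edge sampling can be sketched with Python's standard library (the graph and weights below are made up for illustration):

```python
import random

def sample_edges(edges, weights, k, rng=None):
    """Draw k edges with probability proportional to their weights."""
    rng = rng or random.Random(0)  # seeded for reproducibility
    return rng.choices(edges, weights=weights, k=k)

# Toy user-item graph: ("u1", "i1") carries 5x the weight of the others.
edges = [("u1", "i1"), ("u1", "i2"), ("u2", "i1")]
draws = sample_edges(edges, weights=[5.0, 1.0, 1.0], k=1000)
```

Sampling in proportion to edge weight lets strongly connected pairs appear in training batches more often, which is exactly the signal a uniform sampler discards.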
no code implementations • 5 Jul 2018 • Jen-Yu Liu, Yi-Hsuan Yang
In this work, we propose a denoising Auto-encoder with Recurrent skip Connections (ARC).
no code implementations • 4 Jun 2019 • Jen-Yu Liu, Yi-Hsuan Yang
To reach information at remote locations, we propose to combine dilated convolution with a modified version of gated recurrent units (GRU), called the 'Dilated GRU', to form a block.
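The defining change in a dilated recurrence is that step t reads the state from step t-d instead of t-1, so information skips ahead by d steps per update. A deliberately simplified sketch with a toy update rule (not the actual GRU equations):

```python
def dilated_recurrence(xs, d, update):
    """Run a recurrence where state t depends on state t-d (zero before t=d)."""
    states = [0.0] * len(xs)
    for t, x in enumerate(xs):
        prev = states[t - d] if t >= d else 0.0
        states[t] = update(prev, x)
    return states

# With d=2 and an additive toy update, the sequence splits into two
# interleaved chains: even and odd positions accumulate independently.
out = dilated_recurrence([1, 1, 1, 1, 1, 1], d=2, update=lambda h, x: h + x)
```

Stacking such recurrences with growing dilations gives an exponentially growing receptive field, the same motivation as dilated convolutions.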
no code implementations • 8 Jan 2020 • Yin-Cheng Yeh, Wen-Yi Hsiao, Satoru Fukayama, Tetsuro Kitahara, Benjamin Genchel, Hao-Min Liu, Hao-Wen Dong, Yi-An Chen, Terence Leong, Yi-Hsuan Yang
Several prior works have proposed various methods for the task of automatic melody harmonization, in which a model aims to generate a sequence of chords to serve as the harmonic accompaniment of a given multiple-bar melody sequence.
no code implementations • 20 Feb 2020 • Jianyu Fan, Yi-Hsuan Yang, Kui Dong, Philippe Pasquier
In this study, we examine whether we can analyze and compare Western and Chinese classical music based on soundscape models.
no code implementations • 28 May 2020 • Da-Yi Wu, Yi-Hsuan Yang
Specifically, given a speech input, and optionally the F0 contour of the target singing, the proposed model generates as the output a singing signal with a progressive-growing encoder/decoder architecture and boundary equilibrium GAN loss functions.
no code implementations • 4 Aug 2020 • Yu-Hua Chen, Yu-Hsiang Huang, Wen-Yi Hsiao, Yi-Hsuan Yang
Deep learning algorithms are increasingly developed for learning to compose music in the form of MIDI files.
Sound · Audio and Speech Processing
no code implementations • 8 Oct 2021 • Chien-Feng Liao, Jen-Yu Liu, Yi-Hsuan Yang
In this paper, we propose a novel neural network model called KaraSinger for a less-studied singing voice synthesis (SVS) task named score-free SVS, in which the prosody and melody are spontaneously decided by the machine.
no code implementations • 13 Oct 2021 • Bo-Yu Chen, Wei-Han Hsu, Wei-Hsiang Liao, Marco A. Martínez Ramírez, Yuki Mitsufuji, Yi-Hsuan Yang
A central task of a Disc Jockey (DJ) is to create a mixset of music with seamless transitions between adjacent tracks.
no code implementations • 11 Nov 2021 • Chih-Pin Tan, Chin-Jui Chang, Alvin W. Y. Su, Yi-Hsuan Yang
In this paper, we investigate using the variable-length infilling (VLI) model, which is originally proposed to infill missing segments, to "prolong" existing musical segments at musical boundaries.