Search Results for author: Wei-Hsiang Liao

Found 16 papers, 7 papers with code

Instruct-MusicGen: Unlocking Text-to-Music Editing for Music Language Models via Instruction Tuning

1 code implementation28 May 2024 Yixiao Zhang, Yukara Ikemiya, Woosung Choi, Naoki Murata, Marco A. Martínez-Ramírez, Liwei Lin, Gus Xia, Wei-Hsiang Liao, Yuki Mitsufuji, Simon Dixon

Recent advances in text-to-music editing, which employ text queries to modify music (e. g.\ by changing its style or adjusting instrumental components), present unique challenges and opportunities for AI-assisted music creation.

MR-MT3: Memory Retaining Multi-Track Music Transcription to Mitigate Instrument Leakage

1 code implementation15 Mar 2024 Hao Hao Tan, Kin Wai Cheuk, Taemin Cho, Wei-Hsiang Liao, Yuki Mitsufuji

This paper presents enhancements to the MT3 model, a state-of-the-art (SOTA) token-based multi-instrument automatic music transcription (AMT) model.

Music Transcription

MusicMagus: Zero-Shot Text-to-Music Editing via Diffusion Models

1 code implementation9 Feb 2024 Yixiao Zhang, Yukara Ikemiya, Gus Xia, Naoki Murata, Marco A. Martínez-Ramírez, Wei-Hsiang Liao, Yuki Mitsufuji, Simon Dixon

This paper introduces a novel approach to the editing of music generated by such models, enabling the modification of specific attributes, such as genre, mood and instrument, while maintaining other aspects unchanged.

Music Generation Text-to-Music Generation

Manifold Preserving Guided Diffusion

no code implementations28 Nov 2023 Yutong He, Naoki Murata, Chieh-Hsin Lai, Yuhta Takida, Toshimitsu Uesaka, Dongjun Kim, Wei-Hsiang Liao, Yuki Mitsufuji, J. Zico Kolter, Ruslan Salakhutdinov, Stefano Ermon

Despite the recent advancements, conditional image generation still faces challenges of cost, generalizability, and the need for task-specific training.

Conditional Image Generation

Consistency Trajectory Models: Learning Probability Flow ODE Trajectory of Diffusion

1 code implementation1 Oct 2023 Dongjun Kim, Chieh-Hsin Lai, Wei-Hsiang Liao, Naoki Murata, Yuhta Takida, Toshimitsu Uesaka, Yutong He, Yuki Mitsufuji, Stefano Ermon

Consistency Models (CM) (Song et al., 2023) accelerate score-based diffusion model sampling at the cost of sample quality but lack a natural way to trade-off quality for speed.

Denoising Image Generation

Timbre-Trap: A Low-Resource Framework for Instrument-Agnostic Music Transcription

no code implementations27 Sep 2023 Frank Cwitkowitz, Kin Wai Cheuk, Woosung Choi, Marco A. Martínez-Ramírez, Keisuke Toyama, Wei-Hsiang Liao, Yuki Mitsufuji

Several works have explored multi-instrument transcription as a means to bolster the performance of models on low-resource tasks, but these methods face the same data availability issues.

Music Transcription

Automatic Piano Transcription with Hierarchical Frequency-Time Transformer

1 code implementation10 Jul 2023 Keisuke Toyama, Taketo Akama, Yukara Ikemiya, Yuhta Takida, Wei-Hsiang Liao, Yuki Mitsufuji

This is especially helpful when determining the precise onset and offset for each note in the polyphonic piano content.

Decoder Music Transcription

Music Mixing Style Transfer: A Contrastive Learning Approach to Disentangle Audio Effects

1 code implementation4 Nov 2022 Junghyun Koo, Marco A. Martínez-Ramírez, Wei-Hsiang Liao, Stefan Uhlich, Kyogu Lee, Yuki Mitsufuji

We propose an end-to-end music mixing style transfer system that converts the mixing style of an input multitrack to that of a reference song.

Contrastive Learning Disentanglement +2

Automatic music mixing with deep learning and out-of-domain data

1 code implementation24 Aug 2022 Marco A. Martínez-Ramírez, Wei-Hsiang Liao, Giorgio Fabbro, Stefan Uhlich, Chihiro Nagashima, Yuki Mitsufuji

Music mixing traditionally involves recording instruments in the form of clean, individual tracks and blending them into a final mixture using audio effects and expert knowledge (e. g., a mixing engineer).

Preventing Oversmoothing in VAE via Generalized Variance Parameterization

no code implementations17 Feb 2021 Yuhta Takida, Wei-Hsiang Liao, Chieh-Hsin Lai, Toshimitsu Uesaka, Shusuke Takahashi, Yuki Mitsufuji

Variational autoencoders (VAEs) often suffer from posterior collapse, which is a phenomenon in which the learned latent space becomes uninformative.

Decoder

AR-ELBO: Preventing Posterior Collapse Induced by Oversmoothing in Gaussian VAE

no code implementations1 Jan 2021 Yuhta Takida, Wei-Hsiang Liao, Toshimitsu Uesaka, Shusuke Takahashi, Yuki Mitsufuji

Variational autoencoders (VAEs) often suffer from posterior collapse, which is a phenomenon that the learned latent space becomes uninformative.

Cannot find the paper you are looking for? You can Submit a new open access paper.