1 code implementation • 16 Nov 2023 • Ilaria Manco, Benno Weck, Seungheon Doh, Minz Won, Yixiao Zhang, Dmitry Bogdanov, Yusong Wu, Ke Chen, Philip Tovstogan, Emmanouil Benetos, Elio Quinton, György Fazekas, Juhan Nam
We introduce the Song Describer dataset (SDD), a new crowdsourced corpus of high-quality audio-caption pairs, designed for the evaluation of music-and-language models.
1 code implementation • 6 Nov 2023 • Minz Won, Yun-Ning Hung, Duc Le
This paper investigates foundation models tailored for music informatics, a domain currently challenged by the scarcity of labeled data and generalization issues.
no code implementations • 2 Oct 2023 • Yun-Ning Hung, Ju-Chiang Wang, Minz Won, Duc Le
To our knowledge, this is the first attempt to study the effects of scaling up both model and training data for a variety of MIR tasks.
no code implementations • 19 Mar 2023 • Seungheon Doh, Minz Won, Keunwoo Choi, Juhan Nam
We introduce a framework that recommends music based on the emotions of speech.
no code implementations • 1 Feb 2023 • Kin Wai Cheuk, Keunwoo Choi, Qiuqiang Kong, Bochen Li, Minz Won, Ju-Chiang Wang, Yun-Ning Hung, Dorien Herremans
Jointist consists of an instrument recognition module that conditions the other two modules: a transcription module that outputs instrument-specific piano rolls, and a source separation module that utilizes instrument information and transcription results.
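As a rough illustration of this conditioning scheme, here is a minimal sketch (not the released Jointist code; the module layout, feature sizes, and the gating-style conditioning are assumptions):

```python
# Minimal sketch of an instrument-conditioned pipeline (assumed layout,
# not the authors' implementation). One module predicts which instruments
# are present; its output conditions the transcription and separation heads.
import torch
import torch.nn as nn

N_INSTRUMENTS, N_PITCHES, N_MELS = 8, 88, 128

class InstrumentRecognizer(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(N_MELS, 256), nn.ReLU(),
                                 nn.Linear(256, N_INSTRUMENTS))
    def forward(self, mel):                               # mel: (batch, time, N_MELS)
        return torch.sigmoid(self.net(mel.mean(dim=1)))   # (batch, N_INSTRUMENTS)

class ConditionedHead(nn.Module):
    """A head whose features are gated by the instrument probabilities."""
    def __init__(self, out_dim):
        super().__init__()
        self.cond = nn.Linear(N_INSTRUMENTS, 256)          # project the condition
        self.body = nn.Linear(N_MELS, 256)
        self.out = nn.Linear(256, out_dim)
    def forward(self, mel, inst_probs):
        gate = torch.sigmoid(self.cond(inst_probs)).unsqueeze(1)
        h = torch.relu(self.body(mel)) * gate              # condition the features
        return self.out(h)                                 # (batch, time, out_dim)

recognizer = InstrumentRecognizer()
transcriber = ConditionedHead(out_dim=N_PITCHES)   # instrument-specific piano roll
separator = ConditionedHead(out_dim=N_MELS)        # source-separation mask

mel = torch.randn(2, 400, N_MELS)                  # dummy mel-spectrogram batch
inst = recognizer(mel)
piano_roll = torch.sigmoid(transcriber(mel, inst))
sep_mask = torch.sigmoid(separator(mel, inst))
```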
3 code implementations • 26 Nov 2022 • Seungheon Doh, Minz Won, Keunwoo Choi, Juhan Nam
This paper introduces effective design choices for text-to-music retrieval systems.
no code implementations • 22 Jun 2022 • Kin Wai Cheuk, Keunwoo Choi, Qiuqiang Kong, Bochen Li, Minz Won, Amy Hung, Ju-Chiang Wang, Dorien Herremans
However, its novelty necessitates a new perspective on how to evaluate such a model.
Ranked #4 on Music Transcription on Slakh2100
1 code implementation • 26 Nov 2021 • Minz Won, Justin Salamon, Nicholas J. Bryan, Gautham J. Mysore, Xavier Serra
Content creators often use music to enhance their stories, as it can be a powerful tool to convey emotion.
1 code implementation • 23 Nov 2021 • Minz Won, Janne Spijkervet, Keunwoo Choi
The target audience for this web book is researchers and practitioners who are interested in state-of-the-art music classification research and building real-world applications.
1 code implementation • 30 Oct 2020 • Minz Won, Sergio Oramas, Oriol Nieto, Fabien Gouyon, Xavier Serra
In this paper, we investigate three ideas to successfully introduce multimodal metric learning for tag-based music retrieval: elaborate triplet sampling, acoustic and cultural music information, and domain-specific word embeddings.
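To make the metric-learning idea concrete, here is a minimal triplet-loss sketch; the toy encoders, embedding sizes, and random batch stand in for the paper's elaborate triplet sampling and domain-specific word embeddings, which are not reproduced here:

```python
# Sketch of tag-to-audio metric learning with a triplet loss: pull the audio
# that matches a tag toward the tag embedding, push non-matching audio away.
import torch
import torch.nn as nn
import torch.nn.functional as F

EMB = 128
audio_encoder = nn.Sequential(nn.Linear(96, 256), nn.ReLU(), nn.Linear(256, EMB))
tag_encoder = nn.Sequential(nn.Linear(300, EMB))   # e.g. on top of word vectors

def triplet_loss(anchor_tag, pos_audio, neg_audio, margin=0.4):
    a = F.normalize(tag_encoder(anchor_tag), dim=-1)
    p = F.normalize(audio_encoder(pos_audio), dim=-1)
    n = F.normalize(audio_encoder(neg_audio), dim=-1)
    pos_dist = 1 - (a * p).sum(-1)                 # cosine distance to positive
    neg_dist = 1 - (a * n).sum(-1)                 # cosine distance to negative
    return F.relu(pos_dist - neg_dist + margin).mean()

# Dummy batch: tag word vectors plus positive / negative audio features.
loss = triplet_loss(torch.randn(16, 300), torch.randn(16, 96), torch.randn(16, 96))
loss.backward()
```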
1 code implementation • 22 Oct 2020 • Filip Korzeniowski, Oriol Nieto, Matthew McCallum, Minz Won, Sergio Oramas, Erik Schmidt
The mood of a song is a highly relevant feature for exploration and recommendation in large collections of music.
7 code implementations • 1 Jun 2020 • Minz Won, Andres Ferraro, Dmitry Bogdanov, Xavier Serra
Recent advances in deep learning have accelerated the development of content-based automatic music tagging systems.
Ranked #1 on Music Auto-Tagging on MagnaTagATune (clean)
Tasks: Music Auto-Tagging, Audio and Speech Processing, Sound
no code implementations • 11 Nov 2019 • Minz Won, Sanghyuk Chun, Xavier Serra
Recently, we proposed a self-attention-based music tagging model.
Tasks: Sound, Audio and Speech Processing
2 code implementations • 12 Jun 2019 • Minz Won, Sanghyuk Chun, Xavier Serra
In addition, we demonstrate the interpretability of the proposed architecture with a heat map visualization; a toy sketch of this idea follows below.
Tasks: Sound, Audio and Speech Processing
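The heat-map idea mentioned above could be probed along these lines (a hedged sketch with a single toy self-attention layer standing in for the tagging model, not the released code):

```python
# Sketch: visualize a self-attention map over frame-level features as a heat map.
import torch
import torch.nn as nn
import matplotlib.pyplot as plt

attn = nn.MultiheadAttention(embed_dim=64, num_heads=4, batch_first=True)
frames = torch.randn(1, 200, 64)          # dummy frame-level features

with torch.no_grad():
    # attention weights averaged over heads: shape (1, 200, 200)
    _, weights = attn(frames, frames, frames, need_weights=True)

plt.imshow(weights[0].numpy(), aspect="auto", origin="lower", cmap="magma")
plt.xlabel("key frame"); plt.ylabel("query frame")
plt.title("Self-attention heat map (toy example)")
plt.savefig("attention_heatmap.png")
```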
1 code implementation • 5 May 2018 • Jaehun Kim, Minz Won, Xavier Serra, Cynthia C. S. Liem
The automated recognition of music genres from audio information is a challenging problem, as genre labels are subjective and noisy.