no code implementations • 11 May 2023 • Kun Su, Judith Yue Li, Qingqing Huang, Dima Kuzmin, Joonseok Lee, Chris Donahue, Fei Sha, Aren Jansen, Yu Wang, Mauro Verzetti, Timo I. Denk
Video-to-music generation demands both a temporally localized high-quality listening experience and globally aligned video-acoustic signatures.
no code implementations • 8 May 2023 • Naveen Ram, Dima Kuzmin, Ellie Ka In Chio, Moustafa Farid Alzantot, Santiago Ontanon, Ambarish Jash, Judith Yue Li
In this paper, we analyze the performance of a multitask end-to-end transformer model on the task of conversational recommendation, which aims to provide recommendations based on a user's explicit preferences expressed in dialogue.
no code implementations • 9 Jan 2023 • Judith Yue Li, Aren Jansen, Qingqing Huang, Joonseok Lee, Ravi Ganti, Dima Kuzmin
Multimodal learning can benefit from the representation power of pretrained Large Language Models (LLMs).
1 code implementation • 26 Aug 2022 • Qingqing Huang, Aren Jansen, Joonseok Lee, Ravi Ganti, Judith Yue Li, Daniel P. W. Ellis
Music tagging and content-based retrieval systems have traditionally been constructed using pre-defined ontologies covering a rigid set of music attributes or text queries.