no code implementations • 20 Jul 2023 • Timo I. Denk, Yu Takagi, Takuya Matsuyama, Andrea Agostinelli, Tomoya Nakai, Christian Frank, Shinji Nishimoto
The process of reconstructing experiences from human brain activity offers a unique lens into how the brain interprets and represents the world.
no code implementations • 11 May 2023 • Kun Su, Judith Yue Li, Qingqing Huang, Dima Kuzmin, Joonseok Lee, Chris Donahue, Fei Sha, Aren Jansen, Yu Wang, Mauro Verzetti, Timo I. Denk
Video-to-music generation demands both a temporally localized high-quality listening experience and globally aligned video-acoustic signatures.
no code implementations • 8 Feb 2023 • Qingqing Huang, Daniel S. Park, Tao Wang, Timo I. Denk, Andy Ly, Nanxin Chen, Zhengdong Zhang, Zhishuai Zhang, Jiahui Yu, Christian Frank, Jesse Engel, Quoc V. Le, William Chan, Zhifeng Chen, Wei Han
We introduce Noise2Music, where a series of diffusion models is trained to generate high-quality 30-second music clips from text prompts.
Ranked #2 on Text-to-Music Generation on MusicCaps
3 code implementations • 26 Jan 2023 • Andrea Agostinelli, Timo I. Denk, Zalán Borsos, Jesse Engel, Mauro Verzetti, Antoine Caillon, Qingqing Huang, Aren Jansen, Adam Roberts, Marco Tagliasacchi, Matt Sharifi, Neil Zeghidour, Christian Frank
We introduce MusicLM, a model generating high-fidelity music from text descriptions such as "a calming violin melody backed by a distorted guitar riff".
Ranked #8 on Text-to-Music Generation on MusicCaps
no code implementations • COLING (TextGraphs) 2020 • Timo I. Denk, Ana Peleteiro Ramallo
BERT is a popular language model whose main pre-training task is to fill in the blank, i.e., predicting a word that was masked out of a sentence, based on the remaining words.
2 code implementations • NeurIPS Workshop Document_Intelligen 2019 • Timo I. Denk, Christian Reisswig
For understanding generic documents, information like font sizes, column layout, and generally the positioning of words may carry semantic information that is crucial for solving a downstream document intelligence task.