no code implementations • 14 Jun 2023 • John Thickstun, David Hall, Chris Donahue, Percy Liang
We achieve this by interleaving sequences of events and controls, such that controls appear following stopping times in the event sequence.
no code implementations • 11 May 2023 • Kun Su, Judith Yue Li, Qingqing Huang, Dima Kuzmin, Joonseok Lee, Chris Donahue, Fei Sha, Aren Jansen, Yu Wang, Mauro Verzetti, Timo I. Denk
Generating high quality music that complements the visual content of a video is a challenging task.
no code implementations • 30 Jan 2023 • Chris Donahue, Antoine Caillon, Adam Roberts, Ethan Manilow, Philippe Esling, Andrea Agostinelli, Mauro Verzetti, Ian Simon, Olivier Pietquin, Neil Zeghidour, Jesse Engel
We present SingSong, a system that generates instrumental music to accompany input vocals, potentially offering musicians and non-musicians alike an intuitive new way to create music featuring their own voice.
1 code implementation • 4 Dec 2022 • Chris Donahue, John Thickstun, Percy Liang
The combination of generative pre-training and a new dataset for this task results in 77% stronger performance on melody transcription relative to the strongest available baseline.
5 code implementations • 20 Feb 2022 • Karan Goel, Albert Gu, Chris Donahue, Christopher Ré
SaShiMi yields state-of-the-art performance for unconditional waveform generation in the autoregressive setting.
3 code implementations • 16 Aug 2021 • Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, Russ Altman, Simran Arora, Sydney von Arx, Michael S. Bernstein, Jeannette Bohg, Antoine Bosselut, Emma Brunskill, Erik Brynjolfsson, Shyamal Buch, Dallas Card, Rodrigo Castellon, Niladri Chatterji, Annie Chen, Kathleen Creel, Jared Quincy Davis, Dora Demszky, Chris Donahue, Moussa Doumbouya, Esin Durmus, Stefano Ermon, John Etchemendy, Kawin Ethayarajh, Li Fei-Fei, Chelsea Finn, Trevor Gale, Lauren Gillespie, Karan Goel, Noah Goodman, Shelby Grossman, Neel Guha, Tatsunori Hashimoto, Peter Henderson, John Hewitt, Daniel E. Ho, Jenny Hong, Kyle Hsu, Jing Huang, Thomas Icard, Saahil Jain, Dan Jurafsky, Pratyusha Kalluri, Siddharth Karamcheti, Geoff Keeling, Fereshte Khani, Omar Khattab, Pang Wei Koh, Mark Krass, Ranjay Krishna, Rohith Kuditipudi, Ananya Kumar, Faisal Ladhak, Mina Lee, Tony Lee, Jure Leskovec, Isabelle Levent, Xiang Lisa Li, Xuechen Li, Tengyu Ma, Ali Malik, Christopher D. Manning, Suvir Mirchandani, Eric Mitchell, Zanele Munyikwa, Suraj Nair, Avanika Narayan, Deepak Narayanan, Ben Newman, Allen Nie, Juan Carlos Niebles, Hamed Nilforoshan, Julian Nyarko, Giray Ogut, Laurel Orr, Isabel Papadimitriou, Joon Sung Park, Chris Piech, Eva Portelance, Christopher Potts, Aditi Raghunathan, Rob Reich, Hongyu Ren, Frieda Rong, Yusuf Roohani, Camilo Ruiz, Jack Ryan, Christopher Ré, Dorsa Sadigh, Shiori Sagawa, Keshav Santhanam, Andy Shih, Krishnan Srinivasan, Alex Tamkin, Rohan Taori, Armin W. Thomas, Florian Tramèr, Rose E. Wang, William Wang, Bohan Wu, Jiajun Wu, Yuhuai Wu, Sang Michael Xie, Michihiro Yasunaga, Jiaxuan You, Matei Zaharia, Michael Zhang, Tianyi Zhang, Xikun Zhang, Yuhui Zhang, Lucia Zheng, Kaitlyn Zhou, Percy Liang
AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks.
1 code implementation • 13 Jul 2021 • Hao-Wen Dong, Chris Donahue, Taylor Berg-Kirkpatrick, Julian McAuley
In this paper, we aim to further extend this idea and examine the feasibility of automatic instrumentation -- dynamically assigning instruments to notes in solo music during performance.
1 code implementation • 12 Jul 2021 • Rodrigo Castellon, Chris Donahue, Percy Liang
Relative to representations from conventional MIR models which are pre-trained on tagging, we find that using representations from Jukebox as input features yields 30% stronger performance on average across four MIR tasks: tagging, genre classification, emotion recognition, and key detection.
Ranked #1 on Emotion Recognition on Emomusic
1 code implementation • NAACL 2021 • Mina Lee, Chris Donahue, Robin Jia, Alexander Iyabor, Percy Liang
We release a new benchmark for lexical substitution, the task of finding appropriate substitutes for a target word in a context.
3 code implementations • ACL 2020 • Chris Donahue, Mina Lee, Percy Liang
We show that this approach, which we call infilling by language modeling, can enable LMs to infill entire sentences effectively on three different domains: short stories, scientific abstracts, and lyrics.
1 code implementation • 10 Jul 2019 • Chris Donahue, Huanru Henry Mao, Yiting Ethan Li, Garrison W. Cottrell, Julian McAuley
We are interested in the task of generating multi-instrumental music scores.
1 code implementation • 16 Apr 2019 • Paarth Neekhara, Chris Donahue, Miller Puckette, Shlomo Dubnov, Julian McAuley
Recent approaches in text-to-speech (TTS) synthesis employ neural network strategies to vocode perceptually-informed spectrogram representations directly into listenable waveforms.
6 code implementations • ICLR 2019 • Jesse Engel, Kumar Krishna Agrawal, Shuo Chen, Ishaan Gulrajani, Chris Donahue, Adam Roberts
Efficient audio synthesis is an inherently difficult machine learning task, as human perception is sensitive to both global structure and fine-scale waveform coherence.
no code implementations • 11 Oct 2018 • Chris Donahue, Ian Simon, Sander Dieleman
We present Piano Genie, an intelligent controller which allows non-musicians to improvise on the piano.
2 code implementations • 12 Jun 2018 • Chris Donahue, Huanru Henry Mao, Julian McAuley
Existing research on music generation focuses on composition, but often ignores the expressive performance characteristics required for plausible renditions of resultant pieces.
20 code implementations • ICLR 2019 • Chris Donahue, Julian McAuley, Miller Puckette
Audio signals are sampled at high temporal resolutions, and learning to synthesize audio requires capturing structure across a range of timescales.
no code implementations • 15 Nov 2017 • Chris Donahue, Bo Li, Rohit Prabhavalkar
We investigate the effectiveness of generative adversarial networks (GANs) for speech enhancement, in the context of improving noise robustness of automatic speech recognition (ASR) systems.
Automatic Speech Recognition
1 code implementation • ICLR 2018 • Chris Donahue, Zachary C. Lipton, Akshay Balsubramani, Julian McAuley
Corresponding samples from the real dataset consist of two distinct photographs of the same subject.
1 code implementation • ICML 2017 • Chris Donahue, Zachary C. Lipton, Julian McAuley
For the step placement task, we combine recurrent and convolutional neural networks to ingest spectrograms of low-level audio features to predict steps, conditioned on chart difficulty.