1 code implementation • 29 Nov 2024 • Julian D Parker, Anton Smirnov, Jordi Pons, CJ Carr, Zack Zukowski, Zach Evans, Xubo Liu
The tokenization of speech with neural audio codec models is a vital part of modern AI pipelines for the generation or understanding of speech, alone or in a multimodal context.
1 code implementation • 19 Jul 2024 • Zach Evans, Julian D. Parker, CJ Carr, Zack Zukowski, Josiah Taylor, Jordi Pons
Open generative models are vitally important for the community, allowing for fine-tunes and serving as baselines when presenting new models.
Ranked #3 on Audio Generation on AudioCaps
1 code implementation • 16 Apr 2024 • Zach Evans, Julian D. Parker, CJ Carr, Zack Zukowski, Josiah Taylor, Jordi Pons
Audio-based generative models for music have seen great strides recently, but so far have not managed to produce full-length music tracks with coherent musical structure from text prompts.
Ranked #7 on Audio Generation on AudioCaps
2 code implementations • 7 Feb 2024 • Zach Evans, CJ Carr, Josiah Taylor, Scott H. Hawley, Jordi Pons
Generating long-form 44.1kHz stereo audio from text prompts can be computationally demanding.
Ranked #1 on Text-to-Music Generation on MusicCaps (KL_passt metric)
no code implementations • 11 Jul 2023 • Jackson Loth, Pedro Sarmento, CJ Carr, Zack Zukowski, Mathieu Barthet
Recent work in the field of symbolic music generation has shown value in using a tokenization based on the GuitarPro format, a symbolic representation supporting guitar expressive attributes, as an input and output representation.
no code implementations • 10 Feb 2023 • Pedro Sarmento, Adarsh Kumar, Yu-Hua Chen, CJ Carr, Zack Zukowski, Mathieu Barthet
We trained a BERT model for downstream genre classification and used it to assess the results obtained with the genre-CTRL model.
1 code implementation • 30 Jul 2021 • Pedro Sarmento, Adarsh Kumar, CJ Carr, Zack Zukowski, Mathieu Barthet, Yi-Hsuan Yang
In this work, we present DadaGP, a new symbolic music dataset comprising 26,181 song scores in the GuitarPro format covering 739 musical genres, along with an accompanying tokenized format well-suited for generative sequence models such as the Transformer.
no code implementations • 16 Nov 2018 • CJ Carr, Zack Zukowski
This early example of neural synthesis is a proof-of-concept for how machine learning can drive new types of music software.
Sound • Audio and Speech Processing