We propose to reframe the standard greedy autoregressive decoding used in machine translation as a parallel formulation that leverages Jacobi and Gauss-Seidel fixed-point iteration methods for fast inference.
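As a rough illustration of the Jacobi-style variant of this idea, the sketch below repeatedly applies the greedy update to every target position in parallel until the draft stops changing (the fixed point coincides with sequential greedy decoding). The `greedy_next` toy rule is a hypothetical stand-in for a real translation model, and details such as draft initialization and stopping criteria are assumptions, not the paper's exact procedure.

```python
# Minimal sketch of Jacobi-style parallel greedy decoding (toy model, assumed details).
import numpy as np


def greedy_next(src: np.ndarray, prefix: np.ndarray, vocab: int) -> int:
    # Hypothetical stand-in for argmax_y p(y | prefix, src); a real system
    # would run a Transformer decoder over the whole draft in parallel here.
    return (int(src.sum()) + int(prefix.sum()) + len(prefix)) % vocab


def jacobi_greedy_decode(src: np.ndarray, tgt_len: int, vocab: int) -> np.ndarray:
    # Start from an arbitrary draft and repeat the parallel update
    # y_i <- argmax p(y_i | y_{<i}, x) for all positions i at once.
    # Because each position only depends on earlier ones, the iteration
    # reaches its fixed point in at most tgt_len sweeps.
    draft = np.zeros(tgt_len, dtype=np.int64)
    for _ in range(tgt_len):
        new_draft = np.array(
            [greedy_next(src, draft[:i], vocab) for i in range(tgt_len)],
            dtype=np.int64,
        )
        if np.array_equal(new_draft, draft):  # fixed point: decoding is done
            break
        draft = new_draft
    return draft


if __name__ == "__main__":
    src = np.array([3, 1, 4, 1, 5])
    print(jacobi_greedy_decode(src, tgt_len=6, vocab=32))
```

A Gauss-Seidel-style variant would instead update blocks of positions sequentially, reusing the freshly updated values within a sweep, which typically needs fewer sweeps at the cost of less parallelism.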
In this work, we define a diffusion-based generative model capable of both music synthesis and source separation by learning the score of the joint probability density of sources sharing a context.
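As a rough sketch of what that means (notation assumed here, not taken verbatim from the paper: $x_1, \dots, x_N$ are the sources sharing a context and $y$ their mixture), the model estimates the joint score and reuses it for both tasks:

```latex
% Assumed notation: x_1,...,x_N sources, p_t the density at noise level t.
s_\theta(x_1, \dots, x_N, t) \;\approx\; \nabla_{x_1, \dots, x_N} \log p_t(x_1, \dots, x_N)
```

Sampling from this joint prior and summing the sources, $y = \sum_n x_n$, yields synthesis, while sampling from the posterior $p(x_1, \dots, x_N \mid \sum_n x_n = y)$ for an observed mixture $y$ yields source separation.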
Autoregressive models have achieved impressive results over a wide range of domains in terms of generation quality and downstream task performance.
Moreover, one of the causes of biodiversity loss is sound pollution; in data obtained from regions with loud anthropic noise, it is hard to manually separate artificial noise from fish sounds.
State-of-the-art audio source separation models rely on supervised data-driven approaches, which can be expensive in terms of labeling resources.