no code implementations • 9 Dec 2022 • Darius Petermann, Gordon Wichern, Aswin Subramanian, Jonathan Le Roux
We introduce a framework for audio source separation using embeddings on a hyperbolic manifold that compactly represent the hierarchical relationship between sound sources and time-frequency features.
no code implementations • 2 Nov 2022 • Zexu Pan, Gordon Wichern, François G. Germain, Aswin Subramanian, Jonathan Le Roux
Speaker diarization is well studied for constrained audio but little explored for challenging in-the-wild videos, which have more speakers, shorter utterances, and inconsistent on-screen speakers.
no code implementations • 7 Apr 2022 • Efthymios Tzinis, Gordon Wichern, Aswin Subramanian, Paris Smaragdis, Jonathan Le Roux
We introduce a new paradigm for single-channel target source separation where the sources of interest can be distinguished using non-mutually exclusive concepts (e.g., loudness, gender, language, spatial location, etc.).