Search Results for author: Darius Petermann

Found 8 papers, 1 paper with code

Hyperbolic Distance-Based Speech Separation

no code implementations • 7 Jan 2024 • Darius Petermann, Minje Kim

In this work, we explore the task of hierarchical distance-based speech separation defined on a hyperbolic manifold.

Speech Separation

Tackling the Cocktail Fork Problem for Separation and Transcription of Real-World Soundtracks

no code implementations • 14 Dec 2022 • Darius Petermann, Gordon Wichern, Aswin Shanmugam Subramanian, Zhong-Qiu Wang, Jonathan Le Roux

In this paper, we focus on the cocktail fork problem, which takes a three-pronged approach to source separation by separating an audio mixture such as a movie soundtrack or podcast into the three broad categories of speech, music, and sound effects (SFX - understood to include ambient noise and natural sound events).

Action Detection, Activity Detection (+4 more)

Hyperbolic Audio Source Separation

no code implementations • 9 Dec 2022 • Darius Petermann, Gordon Wichern, Aswin Subramanian, Jonathan Le Roux

We introduce a framework for audio source separation using embeddings on a hyperbolic manifold that compactly represent the hierarchical relationship between sound sources and time-frequency features.

Audio Source Separation

SpaIn-Net: Spatially-Informed Stereophonic Music Source Separation

no code implementations • 15 Feb 2022 • Darius Petermann, Minje Kim

With recent advances in data-driven approaches using deep neural networks, music source separation has been formulated as an instrument-specific supervised problem.

Disentanglement, Music Source Separation

The Cocktail Fork Problem: Three-Stem Audio Separation for Real-World Soundtracks

3 code implementations • 19 Oct 2021 • Darius Petermann, Gordon Wichern, Zhong-Qiu Wang, Jonathan Le Roux

The cocktail party problem aims at isolating any source of interest within a complex acoustic scene, and has long inspired audio source separation research.

Audio Source Separation

HARP-Net: Hyper-Autoencoded Reconstruction Propagation for Scalable Neural Audio Coding

no code implementations • 22 Jul 2021 • Darius Petermann, SeungKwon Beack, Minje Kim

The assumption is that, in a mirrored autoencoder topology, a decoder layer reconstructs the intermediate feature representation of its corresponding encoder layer.

Quantization

Deep Learning Based Source Separation Applied To Choir Ensembles

no code implementations • 17 Aug 2020 • Darius Petermann, Pritish Chandna, Helena Cuesta, Jordi Bonada, Emilia Gomez

However, most research has focused on the typical case of separating vocal, percussion, and bass sources from a mixture, each of which has a distinct spectral structure.
