Search Results for author: Efthymios Tzinis

Found 22 papers, 13 papers with code

Latent Iterative Refinement for Modular Source Separation

no code implementations22 Nov 2022 Dimitrios Bralios, Efthymios Tzinis, Gordon Wichern, Paris Smaragdis, Jonathan Le Roux

During inference, we can dynamically adjust how many processing blocks and iterations of a specific block an input signal needs using a gating module.
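The dynamic block/iteration mechanism can be illustrated with a small hypothetical sketch; the block functions, gating score, and threshold below are illustrative assumptions, not the paper's architecture:

```python
import numpy as np

def gated_refinement(x, blocks, gate, max_iters=4, threshold=0.5):
    # Run each processing block repeatedly; a gating score decides when
    # further iterations of the current block are unnecessary.
    for block in blocks:
        for _ in range(max_iters):
            x = block(x)
            if gate(x) < threshold:  # gate says "good enough": next block
                break
    return x

# Toy setup: blocks damp the signal, the gate measures residual magnitude.
blocks = [lambda v: 0.5 * v, lambda v: 0.9 * v]
gate = lambda v: float(np.mean(np.abs(v)))
y = gated_refinement(np.ones(4), blocks, gate)
```

The point of the sketch is only the control flow: compute cost adapts per input because the gate can cut iterations short.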

Optimal Condition Training for Target Source Separation

1 code implementation11 Nov 2022 Efthymios Tzinis, Gordon Wichern, Paris Smaragdis, Jonathan Le Roux

Recent research has shown remarkable performance in leveraging multiple extraneous conditional and non-mutually exclusive semantic concepts for sound source separation, allowing the flexibility to extract a given target source based on multiple different queries.

AudioScopeV2: Audio-Visual Attention Architectures for Calibrated Open-Domain On-Screen Sound Separation

no code implementations20 Jul 2022 Efthymios Tzinis, Scott Wisdom, Tal Remez, John R. Hershey

We identify several limitations of previous work on audio-visual on-screen sound separation, including the coarse resolution of spatio-temporal attention, poor convergence of the audio separation model, limited variety in training and evaluation data, and failure to account for the trade-off between preservation of on-screen sounds and suppression of off-screen sounds.


Heterogeneous Target Speech Separation

no code implementations7 Apr 2022 Efthymios Tzinis, Gordon Wichern, Aswin Subramanian, Paris Smaragdis, Jonathan Le Roux

We introduce a new paradigm for single-channel target source separation where the sources of interest can be distinguished using non-mutually exclusive concepts (e.g., loudness, gender, language, spatial location, etc.).

Speech Separation

RemixIT: Continual self-training of speech enhancement models via bootstrapped remixing

1 code implementation17 Feb 2022 Efthymios Tzinis, Yossi Adi, Vamsi Krishna Ithapu, Buye Xu, Paris Smaragdis, Anurag Kumar

RemixIT is based on a continuous self-training scheme in which a teacher model, pre-trained on out-of-domain data, infers estimated pseudo-target signals for in-domain mixtures.
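The bootstrapped-remixing step can be sketched as follows; the `toy_teacher` split is a placeholder assumption, standing in for the pre-trained separation network:

```python
import numpy as np

rng = np.random.default_rng(0)

def remixit_batch(mixtures, teacher):
    # The teacher splits each in-domain mixture into pseudo speech/noise;
    # the noise estimates are shuffled across the batch, and new training
    # mixtures are formed from the permuted pseudo-sources.
    speech_est, noise_est = teacher(mixtures)
    perm = rng.permutation(len(mixtures))
    new_mixtures = speech_est + noise_est[perm]
    return new_mixtures, speech_est  # (student input, pseudo-target)

# Toy teacher: pretend a fixed fraction of each signal is speech.
def toy_teacher(m):
    return m * 0.7, m * 0.3

mix = rng.standard_normal((8, 16))
x_student, y_student = remixit_batch(mix, toy_teacher)
```

A student trained on `(x_student, y_student)` pairs never sees ground-truth in-domain sources, which is the point of the scheme.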

Speech Enhancement Unsupervised Domain Adaptation

Continual self-training with bootstrapped remixing for speech enhancement

1 code implementation19 Oct 2021 Efthymios Tzinis, Yossi Adi, Vamsi K. Ithapu, Buye Xu, Anurag Kumar

Specifically, a separation teacher model is pre-trained on an out-of-domain dataset and is used to infer estimated target signals for a batch of in-domain mixtures.

Speech Enhancement Unsupervised Domain Adaptation

Improving On-Screen Sound Separation for Open-Domain Videos with Audio-Visual Self-Attention

no code implementations17 Jun 2021 Efthymios Tzinis, Scott Wisdom, Tal Remez, John R. Hershey

We introduce a state-of-the-art audio-visual on-screen sound separation system which is capable of learning to separate sounds and associate them with on-screen objects by looking at in-the-wild videos.

Unsupervised Pre-training

Separate but Together: Unsupervised Federated Learning for Speech Enhancement from Non-IID Data

1 code implementation11 May 2021 Efthymios Tzinis, Jonah Casebeer, Zhepei Wang, Paris Smaragdis

We propose FEDENHANCE, an unsupervised federated learning (FL) approach for speech enhancement and separation with non-IID distributed data across multiple clients.
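As a rough illustration of the federated setting, here is a generic FedAvg-style aggregation step; size-weighted averaging is a standard FL baseline and an assumption here, not necessarily FEDENHANCE's exact update rule:

```python
import numpy as np

def federated_average(client_weights, client_sizes):
    # Server-side aggregation: average client model weights in proportion
    # to each client's amount of local (possibly non-IID) data.
    sizes = np.asarray(client_sizes, dtype=float)
    coeffs = sizes / sizes.sum()
    return sum(c * w for c, w in zip(coeffs, client_weights))

# Two toy clients with 1 and 3 local examples respectively.
w = federated_average([np.ones(3), np.zeros(3)], [1, 3])
```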

Federated Learning Speech Enhancement +1

Unsupervised low-rank representations for speech emotion recognition

no code implementations14 Apr 2021 Georgios Paraskevopoulos, Efthymios Tzinis, Nikolaos Ellinas, Theodoros Giannakopoulos, Alexandros Potamianos

We examine the use of linear and non-linear dimensionality reduction algorithms for extracting low-rank feature representations for speech emotion recognition.
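A linear instance of such low-rank feature extraction is PCA via the SVD; this is a generic sketch, not the paper's specific pipeline:

```python
import numpy as np

def low_rank_features(X, k):
    # Project mean-centered features onto the top-k principal directions,
    # yielding a k-dimensional low-rank representation per example.
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:k].T

X = np.random.default_rng(2).standard_normal((20, 6))
Z = low_rank_features(X, 2)  # 20 examples reduced from 6 to 2 dims
```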

Dimensionality Reduction General Classification +1

Compute and memory efficient universal sound source separation

2 code implementations3 Mar 2021 Efthymios Tzinis, Zhepei Wang, Xilin Jiang, Paris Smaragdis

Recent progress in audio source separation led by deep learning has enabled many neural network models to provide robust solutions to this fundamental estimation problem.

Audio Source Separation Efficient Neural Network +1

Into the Wild with AudioScope: Unsupervised Audio-Visual Separation of On-Screen Sounds

no code implementations ICLR 2021 Efthymios Tzinis, Scott Wisdom, Aren Jansen, Shawn Hershey, Tal Remez, Daniel P. W. Ellis, John R. Hershey

For evaluation and semi-supervised experiments, we collected human labels for the presence of on-screen and off-screen sounds on a small subset of clips.

Scene Understanding

Unified Gradient Reweighting for Model Biasing with Applications to Source Separation

1 code implementation25 Oct 2020 Efthymios Tzinis, Dimitrios Bralios, Paris Smaragdis

In this paper, we propose a simple, unified gradient reweighting scheme, with a lightweight modification to bias the learning process of a model and steer it towards a certain distribution of results.
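The general idea of reweighting per-example gradients can be sketched as below; the weight vector and the normalization are illustrative assumptions:

```python
import numpy as np

def reweighted_gradient(per_example_grads, weights):
    # Scale each example's gradient by a user-chosen priority weight before
    # averaging, steering the model toward the emphasized subset.
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalize so the overall step size is unchanged
    return np.tensordot(w, per_example_grads, axes=1)

grads = np.stack([np.ones(3), -np.ones(3)])  # two examples, opposite pulls
g = reweighted_gradient(grads, [3.0, 1.0])   # bias learning toward example 0
```

With uniform weights this reduces to the ordinary mean gradient, so biasing is a strict generalization of standard training.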

Audio Source Separation

Unsupervised Sound Separation Using Mixture Invariant Training

no code implementations NeurIPS 2020 Scott Wisdom, Efthymios Tzinis, Hakan Erdogan, Ron J. Weiss, Kevin Wilson, John R. Hershey

In such supervised approaches, a model is trained to predict the component sources from synthetic mixtures created by adding up isolated ground-truth sources.
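The mixture invariant alternative can be sketched with a brute-force search over assignments of estimated sources to the two reference mixtures; MSE stands in here for the paper's signal-level loss:

```python
import itertools
import numpy as np

def mixit_loss(est_sources, mix1, mix2):
    # Each estimated source is assigned to one of the two reference
    # mixtures; the assignment with the lowest reconstruction error
    # is the one used for training.
    n = len(est_sources)
    best = np.inf
    for assign in itertools.product([0, 1], repeat=n):
        s1 = sum(est_sources[i] for i in range(n) if assign[i] == 0)
        s2 = sum(est_sources[i] for i in range(n) if assign[i] == 1)
        err = np.mean((s1 - mix1) ** 2) + np.mean((s2 - mix2) ** 2)
        best = min(best, err)
    return best

a, b = np.ones(4), 2 * np.ones(4)
# Perfect estimates in scrambled order still yield zero loss.
loss = mixit_loss([b, a], a, b)
```

Because only mixtures are needed as references, no isolated ground-truth sources are required.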

Speech Enhancement Speech Separation +1

Improving Universal Sound Separation Using Sound Classification

no code implementations18 Nov 2019 Efthymios Tzinis, Scott Wisdom, John R. Hershey, Aren Jansen, Daniel P. W. Ellis

Deep learning approaches have recently achieved impressive performance on both audio source separation and sound classification.

Audio Source Separation Classification +2

Two-Step Sound Source Separation: Training on Learned Latent Targets

2 code implementations22 Oct 2019 Efthymios Tzinis, Shrikant Venkataramani, Zhepei Wang, Cem Subakan, Paris Smaragdis

In the first step we learn a transform (and its inverse) to a latent space where masking-based separation performance using oracles is optimal.
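The oracle-masking idea behind the first step can be sketched in a latent space; the identity transform and the magnitude-ratio mask below are simplifying assumptions:

```python
import numpy as np

def oracle_mask_separation(mix_latent, src_latents):
    # With a learned analysis/synthesis transform, separation reduces to
    # masking in the latent space; the oracle mask is each source's latent
    # magnitude divided by the total magnitude.
    total = sum(np.abs(s) for s in src_latents) + 1e-8
    return [mix_latent * (np.abs(s) / total) for s in src_latents]

# Toy latent transform: identity (a real model learns it jointly).
s1, s2 = np.array([1.0, 0.0, 2.0]), np.array([0.0, 3.0, 2.0])
est1, est2 = oracle_mask_separation(s1 + s2, [s1, s2])
```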

Speech Separation Vocal Bursts Valence Prediction

Continual Learning of New Sound Classes using Generative Replay

no code implementations3 Jun 2019 Zhepei Wang, Cem Subakan, Efthymios Tzinis, Paris Smaragdis, Laurent Charlin

We show that, by incrementally refining a classifier with generative replay, a generator whose size is only 4% of all previous training data matches the performance obtained by refining the classifier while keeping 20% of all previous training data.
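Generative replay can be sketched as assembling a training set from the new task's data plus samples drawn from a generator of previous classes; the toy generator below is a placeholder assumption:

```python
import numpy as np

rng = np.random.default_rng(1)

def replay_training_set(new_data, new_labels, generator, n_replay):
    # Instead of storing past data, sample pseudo-examples (and labels)
    # from a generator of previous classes and mix them with new data.
    old_x, old_y = generator(n_replay)
    x = np.concatenate([new_data, old_x])
    y = np.concatenate([new_labels, old_y])
    return x, y

# Toy "generator": noise around the old class mean (hypothetical).
toy_gen = lambda n: (rng.normal(0.0, 0.1, size=(n, 2)), np.zeros(n, dtype=int))
x, y = replay_training_set(np.ones((5, 2)), np.ones(5, dtype=int), toy_gen, 3)
```

The classifier is then refined on `(x, y)`, so memory cost depends on the generator's size rather than on the amount of past data.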

Continual Learning Sound Classification

Bootstrapped Coordinate Search for Multidimensional Scaling

1 code implementation4 Feb 2019 Efthymios Tzinis

The backbone of the CSMDS framework is a probability matrix that encodes how likely each coordinate is to be selected for evaluation.
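Probability-guided coordinate evaluation can be sketched as a randomized coordinate search; the step size, acceptance rule, and fixed sampling probabilities are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)

def prob_coordinate_search(f, x, probs, step=0.5, iters=50):
    # At each iteration, sample a coordinate according to `probs` and try
    # moving it by +/- step, keeping the move only if it lowers f.
    probs = np.asarray(probs, dtype=float)
    probs = probs / probs.sum()
    for _ in range(iters):
        i = rng.choice(len(x), p=probs)
        for d in (step, -step):
            cand = x.copy()
            cand[i] += d
            if f(cand) < f(x):
                x = cand
                break
    return x

x = prob_coordinate_search(lambda v: np.sum(v ** 2),
                           np.array([2.0, -1.5]), [0.5, 0.5])
```

In the bootstrapped variant, those sampling probabilities would themselves be updated from past evaluations rather than held fixed.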

Integrating Recurrence Dynamics for Speech Emotion Recognition

1 code implementation9 Nov 2018 Efthymios Tzinis, Georgios Paraskevopoulos, Christos Baziotis, Alexandros Potamianos

We investigate the performance of features that can capture nonlinear recurrence dynamics embedded in the speech signal for the task of Speech Emotion Recognition (SER).
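A minimal recurrence-plot computation, the object from which such recurrence features are typically derived (a real RQA pipeline first delay-embeds the signal in phase space; the threshold here is an illustrative assumption):

```python
import numpy as np

def recurrence_matrix(x, eps):
    # R[i, j] = 1 when samples i and j are within eps of each other;
    # recurrence features are statistics computed over this matrix.
    d = np.abs(x[:, None] - x[None, :])
    return (d < eps).astype(int)

R = recurrence_matrix(np.array([0.0, 0.1, 1.0]), 0.2)
```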

Emotion Recognition in Conversation Speech Emotion Recognition

Pattern Search Multidimensional Scaling

1 code implementation1 Jun 2018 Georgios Paraskevopoulos, Efthymios Tzinis, Emmanouil-Vasileios Vlatakis-Gkaragkounis, Alexandros Potamianos

We present a novel view of nonlinear manifold learning using derivative-free optimization techniques.
