Search Results for author: Slim Essid

Found 25 papers, 13 papers with code

Opinions in Interactions : New Annotations of the SEMAINE Database

no code implementations LREC 2022 Valentin Barriere, Slim Essid, Chloé Clavel

In this paper, we present the process we used in order to collect new annotations of opinions over the multimodal corpus SEMAINE composed of dyadic interactions.

Online speaker diarization of meetings guided by speech separation

1 code implementation 30 Jan 2024 Elio Gruttadauria, Mathieu Fontaine, Slim Essid

The results show that our system improves the state-of-the-art on the AMI headset mix, using no oracle information and under full evaluation (no collar and including overlapped speech).

Action Detection, Activity Detection +3

Collaborating Foundation Models for Domain Generalized Semantic Segmentation

1 code implementation 15 Dec 2023 Yasser Benigmim, Subhankar Roy, Slim Essid, Vicky Kalogeiton, Stéphane Lathuilière

Domain Generalized Semantic Segmentation (DGSS) deals with training a model on a labeled source domain with the aim of generalizing to unseen domains during inference.

Domain Generalization, Segmentation +1

Speech Self-Supervised Representations Benchmarking: a Case for Larger Probing Heads

no code implementations 28 Aug 2023 Salah Zaiem, Youcef Kemiche, Titouan Parcollet, Slim Essid, Mirco Ravanelli

Self-supervised learning (SSL) leverages large datasets of unlabeled speech to reach impressive performance with reduced amounts of annotated data.

Benchmarking, Self-Supervised Learning

SAMbA: Speech enhancement with Asynchronous ad-hoc Microphone Arrays

no code implementations 31 Jul 2023 Nicolas Furnon, Romain Serizel, Slim Essid, Irina Illina

Speech enhancement in ad-hoc microphone arrays is often hindered by the asynchronization of the devices composing the microphone array.

Speech Enhancement

Speech Self-Supervised Representation Benchmarking: Are We Doing it Right?

1 code implementation 1 Jun 2023 Salah Zaiem, Youcef Kemiche, Titouan Parcollet, Slim Essid, Mirco Ravanelli

Self-supervised learning (SSL) has recently allowed leveraging large datasets of unlabeled speech signals to reach impressive performance on speech tasks using only small amounts of annotated data.

Benchmarking, Self-Supervised Learning

Automatic Data Augmentation for Domain Adapted Fine-Tuning of Self-Supervised Speech Representations

no code implementations 1 Jun 2023 Salah Zaiem, Titouan Parcollet, Slim Essid

Self-Supervised Learning (SSL) has allowed leveraging large amounts of unlabeled speech data to improve the performance of speech recognition models even with small annotated datasets.

Data Augmentation, Domain Adaptation +3

One-shot Unsupervised Domain Adaptation with Personalized Diffusion Models

1 code implementation 31 Mar 2023 Yasser Benigmim, Subhankar Roy, Slim Essid, Vicky Kalogeiton, Stéphane Lathuilière

Departing from the common notion of transferring only the target "texture" information, we leverage text-to-image diffusion models (e.g., Stable Diffusion) to generate a synthetic target dataset with photo-realistic images that not only faithfully depict the style of the target domain, but are also characterized by novel scenes in diverse contexts.

Data Augmentation, One-shot Unsupervised Domain Adaptation +2

Automatic Data Augmentation Selection and Parametrization in Contrastive Self-Supervised Speech Representation Learning

1 code implementation 8 Apr 2022 Salah Zaiem, Titouan Parcollet, Slim Essid

Thus, this work introduces a conditional independence-based method which allows for automatically selecting a suitable distribution on the choice of augmentations and their parametrization from a set of predefined ones, for contrastive self-supervised pre-training.

Contrastive Learning, Data Augmentation +1

Pretext Tasks selection for multitask self-supervised speech representation learning

1 code implementation 1 Jul 2021 Salah Zaiem, Titouan Parcollet, Slim Essid, Abdel Heba

Through solving pretext tasks, self-supervised learning leverages unlabeled data to extract useful latent representations replacing traditional input features in the downstream task.

Automatic Speech Recognition, Automatic Speech Recognition (ASR) +5

Attention-based distributed speech enhancement for unconstrained microphone arrays with varying number of nodes

1 code implementation 15 Jun 2021 Nicolas Furnon, Romain Serizel, Slim Essid, Irina Illina

Speech enhancement promises higher efficiency in ad-hoc microphone arrays than in constrained microphone arrays thanks to the wide spatial coverage of the devices in the acoustic scene.

Speech Enhancement

Conditional independence for pretext task selection in Self-supervised speech representation learning

1 code implementation 15 Apr 2021 Salah Zaiem, Titouan Parcollet, Slim Essid

Through solving pretext tasks, self-supervised learning (SSL) leverages unlabeled data to extract useful latent representations replacing traditional input features in the downstream task.

Automatic Speech Recognition, Automatic Speech Recognition (ASR) +5

DNN-based mask estimation for distributed speech enhancement in spatially unconstrained microphone arrays

1 code implementation 3 Nov 2020 Nicolas Furnon, Romain Serizel, Irina Illina, Slim Essid

Deep neural network (DNN)-based speech enhancement algorithms in microphone arrays have now proven to be efficient solutions to speech understanding and speech recognition in noisy environments.

Noise Estimation, Speech Enhancement +2

Distributed speech separation in spatially unconstrained microphone arrays

1 code implementation 2 Nov 2020 Nicolas Furnon, Romain Serizel, Irina Illina, Slim Essid

We propose a distributed algorithm that can process spatial information in a spatially unconstrained microphone array.

Speech Separation

On-the-fly Detection of User Engagement Decrease in Spontaneous Human-Robot Interaction, International Journal of Social Robotics, 2019

no code implementations 20 Apr 2020 Atef Ben Youssef, Giovanna Varni, Slim Essid, Chloé Clavel

In this paper, we consider the detection of a decrease of engagement by users spontaneously interacting with a socially assistive robot in a public space.

Human-Computer Interaction, Robotics

DNN-Based Distributed Multichannel Mask Estimation for Speech Enhancement in Microphone Arrays

no code implementations 13 Feb 2020 Nicolas Furnon, Romain Serizel, Irina Illina, Slim Essid

Multichannel processing is widely used for speech enhancement, but several limitations appear when trying to deploy these solutions in the real world.

Speech Enhancement

From the Token to the Review: A Hierarchical Multimodal approach to Opinion Mining

no code implementations IJCNLP 2019 Alexandre Garcia, Pierre Colombo, Slim Essid, Florence d'Alché-Buc, Chloé Clavel

The task of predicting fine-grained user opinion based on spontaneous spoken language is a key problem arising in the development of Computational Agents as well as in the development of social-network-based opinion miners.

Opinion Mining

A multimodal movie review corpus for fine-grained opinion mining

1 code implementation 26 Feb 2019 Alexandre Garcia, Slim Essid, Florence d'Alché-Buc, Chloé Clavel

We introduce specific categories in order to make the annotation of opinions easier for movie reviews.

Opinion Mining

Opinion Dynamics Modeling for Movie Review Transcripts Classification with Hidden Conditional Random Fields

no code implementations 20 Jun 2018 Valentin Barriere, Chloé Clavel, Slim Essid

This model allows us to capture the dynamics of the reviewer's opinion in the transcripts of long unsegmented audio reviews that are analyzed by our system.

General Classification

Weakly Supervised Representation Learning for Unsynchronized Audio-Visual Events

no code implementations 19 Apr 2018 Sanjeel Parekh, Slim Essid, Alexey Ozerov, Ngoc Q. K. Duong, Patrick Pérez, Gaël Richard

Audio-visual representation learning is an important task from the perspective of designing machines with the ability to understand complex events.

Multiple Instance Learning, Representation Learning
