Search Results for author: Slim Essid

Found 25 papers, 13 papers with code

Opinions in Interactions : New Annotations of the SEMAINE Database

no code implementations • LREC 2022 • Valentin Barriere, Slim Essid, Chloé Clavel

In this paper, we present the process we used in order to collect new annotations of opinions over the multimodal corpus SEMAINE composed of dyadic interactions.

Online speaker diarization of meetings guided by speech separation

1 code implementation • 30 Jan 2024 • Elio Gruttadauria, Mathieu Fontaine, Slim Essid

The results show that our system improves the state-of-the-art on the AMI headset mix, using no oracle information and under full evaluation (no collar and including overlapped speech).

Action Detection Activity Detection +3

On the choice of the optimal temporal support for audio classification with Pre-trained embeddings

no code implementations • 21 Dec 2023 • Aurian Quelennec, Michel Olvera, Geoffroy Peeters, Slim Essid

Choosing the best one for a set of tasks is the subject of many recent publications.

Audio Classification

Collaborating Foundation Models for Domain Generalized Semantic Segmentation

1 code implementation • 15 Dec 2023 • Yasser Benigmim, Subhankar Roy, Slim Essid, Vicky Kalogeiton, Stéphane Lathuilière

Domain Generalized Semantic Segmentation (DGSS) deals with training a model on a labeled source domain with the aim of generalizing to unseen domains during inference.

Ranked #3 on Domain Generalization on GTA-to-Avg(Cityscapes,BDD,Mapillary)

Domain Generalization Segmentation +1

Resilient Multiple Choice Learning: A learned scoring scheme with application to audio scene analysis

1 code implementation • NeurIPS 2023 • Victor Letzelter, Mathieu Fontaine, Mickaël Chen, Patrick Pérez, Slim Essid, Gaël Richard

Multiple Choice Learning is a simple framework to tackle multimodal density estimation, using the Winner-Takes-All (WTA) loss for a set of hypotheses.

Density Estimation Multiple-choice +1

Speech Self-Supervised Representations Benchmarking: a Case for Larger Probing Heads

no code implementations • 28 Aug 2023 • Salah Zaiem, Youcef Kemiche, Titouan Parcollet, Slim Essid, Mirco Ravanelli

Self-supervised learning (SSL) leverages large datasets of unlabeled speech to reach impressive performance with reduced amounts of annotated data.

Benchmarking Self-Supervised Learning

SAMbA: Speech enhancement with Asynchronous ad-hoc Microphone Arrays

no code implementations • 31 Jul 2023 • Nicolas Furnon, Romain Serizel, Slim Essid, Irina Illina

Speech enhancement in ad-hoc microphone arrays is often hindered by the asynchronization of the devices composing the microphone array.

Speech Enhancement

Speech Self-Supervised Representation Benchmarking: Are We Doing it Right?

1 code implementation • 1 Jun 2023 • Salah Zaiem, Youcef Kemiche, Titouan Parcollet, Slim Essid, Mirco Ravanelli

Self-supervised learning (SSL) has recently allowed leveraging large datasets of unlabeled speech signals to reach impressive performance on speech tasks using only small amounts of annotated data.

Benchmarking Self-Supervised Learning

Automatic Data Augmentation for Domain Adapted Fine-Tuning of Self-Supervised Speech Representations

no code implementations • 1 Jun 2023 • Salah Zaiem, Titouan Parcollet, Slim Essid

Self-Supervised Learning (SSL) has allowed leveraging large amounts of unlabeled speech data to improve the performance of speech recognition models even with small annotated datasets.

Data Augmentation Domain Adaptation +3

One-shot Unsupervised Domain Adaptation with Personalized Diffusion Models

1 code implementation • 31 Mar 2023 • Yasser Benigmim, Subhankar Roy, Slim Essid, Vicky Kalogeiton, Stéphane Lathuilière

Departing from the common notion of transferring only the target "texture" information, we leverage text-to-image diffusion models (e.g., Stable Diffusion) to generate a synthetic target dataset with photo-realistic images that not only faithfully depict the style of the target domain, but are also characterized by novel scenes in diverse contexts.

Ranked #1 on One-shot Unsupervised Domain Adaptation on SYNTHIA-to-Cityscapes

Data Augmentation One-shot Unsupervised Domain Adaptation +2

Fine-tuning Strategies for Faster Inference using Speech Self-Supervised Models: A Comparative Study

1 code implementation • 12 Mar 2023 • Salah Zaiem, Robin Algayres, Titouan Parcollet, Slim Essid, Mirco Ravanelli

Self-supervised learning (SSL) has allowed substantial progress in Automatic Speech Recognition (ASR) performance in low-resource settings.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Automatic Data Augmentation Selection and Parametrization in Contrastive Self-Supervised Speech Representation Learning

1 code implementation • 8 Apr 2022 • Salah Zaiem, Titouan Parcollet, Slim Essid

Thus, this work introduces a conditional independence-based method which allows for automatically selecting a suitable distribution on the choice of augmentations and their parametrization from a set of predefined ones, for contrastive self-supervised pre-training.

Contrastive Learning Data Augmentation +1

Pretext Tasks selection for multitask self-supervised speech representation learning

1 code implementation • 1 Jul 2021 • Salah Zaiem, Titouan Parcollet, Slim Essid, Abdel Heba

Through solving pretext tasks, self-supervised learning leverages unlabeled data to extract useful latent representations replacing traditional input features in the downstream task.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +5

Attention-based distributed speech enhancement for unconstrained microphone arrays with varying number of nodes

1 code implementation • 15 Jun 2021 • Nicolas Furnon, Romain Serizel, Slim Essid, Irina Illina

Speech enhancement promises higher efficiency in ad-hoc microphone arrays than in constrained microphone arrays thanks to the wide spatial coverage of the devices in the acoustic scene.

Speech Enhancement

Conditional independence for pretext task selection in Self-supervised speech representation learning

1 code implementation • 15 Apr 2021 • Salah Zaiem, Titouan Parcollet, Slim Essid

Through solving pretext tasks, self-supervised learning (SSL) leverages unlabeled data to extract useful latent representations replacing traditional input features in the downstream task.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +5

DNN-based mask estimation for distributed speech enhancement in spatially unconstrained microphone arrays

1 code implementation • 3 Nov 2020 • Nicolas Furnon, Romain Serizel, Irina Illina, Slim Essid

Deep neural network (DNN)-based speech enhancement algorithms in microphone arrays have now proven to be efficient solutions to speech understanding and speech recognition in noisy environments.

Noise Estimation Speech Enhancement +2

Distributed speech separation in spatially unconstrained microphone arrays

1 code implementation • 2 Nov 2020 • Nicolas Furnon, Romain Serizel, Irina Illina, Slim Essid

We propose a distributed algorithm that can process spatial information in a spatially unconstrained microphone array.

Speech Separation

On-the-fly Detection of User Engagement Decrease in Spontaneous Human-Robot Interaction, International Journal of Social Robotics, 2019

no code implementations • 20 Apr 2020 • Atef Ben Youssef, Giovanna Varni, Slim Essid, Chloé Clavel

In this paper, we consider the detection of a decrease of engagement by users spontaneously interacting with a socially assistive robot in a public space.

Human-Computer Interaction Robotics

DNN-Based Distributed Multichannel Mask Estimation for Speech Enhancement in Microphone Arrays

no code implementations • 13 Feb 2020 • Nicolas Furnon, Romain Serizel, Irina Illina, Slim Essid

Multichannel processing is widely used for speech enhancement, but several limitations appear when trying to deploy these solutions in the real world.

Speech Enhancement

From the Token to the Review: A Hierarchical Multimodal approach to Opinion Mining

no code implementations • IJCNLP 2019 • Alexandre Garcia, Pierre Colombo, Slim Essid, Florence d'Alché-Buc, Chloé Clavel

The task of predicting fine-grained user opinion based on spontaneous spoken language is a key problem arising in the development of Computational Agents as well as in the development of social-network-based opinion miners.

Opinion Mining

A multimodal movie review corpus for fine-grained opinion mining

1 code implementation • 26 Feb 2019 • Alexandre Garcia, Slim Essid, Florence d'Alché-Buc, Chloé Clavel

We introduce specific categories in order to make the annotation of opinions easier for movie reviews.

Opinion Mining

Identify, locate and separate: Audio-visual object extraction in large video collections using weak supervision

no code implementations • 9 Nov 2018 • Sanjeel Parekh, Alexey Ozerov, Slim Essid, Ngoc Duong, Patrick Pérez, Gaël Richard

We tackle the problem of audiovisual scene analysis for weakly-labeled data.

General Classification Multiple Instance Learning +3

Opinion Dynamics Modeling for Movie Review Transcripts Classification with Hidden Conditional Random Fields

no code implementations • 20 Jun 2018 • Valentin Barriere, Chloé Clavel, Slim Essid

This model allows us to capture the dynamics of the reviewer's opinion in the transcripts of long unsegmented audio reviews that are analyzed by our system.

General Classification

Weakly Supervised Representation Learning for Unsynchronized Audio-Visual Events

no code implementations • 19 Apr 2018 • Sanjeel Parekh, Slim Essid, Alexey Ozerov, Ngoc Q. K. Duong, Patrick Pérez, Gaël Richard

Audio-visual representation learning is an important task from the perspective of designing machines with the ability to understand complex events.

Multiple Instance Learning Representation Learning

Structured Output Learning with Abstention: Application to Accurate Opinion Prediction

no code implementations • ICML 2018 • Alexandre Garcia, Slim Essid, Chloé Clavel, Florence d'Alché-Buc

Motivated by Supervised Opinion Analysis, we propose a novel framework devoted to Structured Output Learning with Abstention (SOLA).

Opinion Mining Sentence
