Search Results for author: Sahar Ghannay

Found 24 papers, 7 papers with code

Evaluating the carbon footprint of NLP methods: a survey and analysis of existing tools

no code implementations • EMNLP (sustainlp) 2021 • Nesrine Bannour, Sahar Ghannay, Aurélie Névéol, Anne-Laure Ligozat

Modern Natural Language Processing (NLP) makes intensive use of deep learning methods because of the accuracy they offer for a variety of applications.

Named Entity Recognition +1

The Spoken Language Understanding MEDIA Benchmark Dataset in the Era of Deep Learning: data updates, training and evaluation tools

no code implementations • LREC 2022 • Gaëlle Laperrière, Valentin Pelloin, Antoine Caubrière, Salima Mdhaffar, Nathalie Camelin, Sahar Ghannay, Bassam Jabaian, Yannick Estève

In this paper, we focus on the French MEDIA SLU dataset, distributed since 2005 and used as a benchmark dataset for a large number of research works.

Intent Detection Spoken Language Understanding

Impact Analysis of the Use of Speech and Language Models Pretrained by Self-Supersivion for Spoken Language Understanding

no code implementations • LREC 2022 • Salima Mdhaffar, Valentin Pelloin, Antoine Caubrière, Gaëlle Laperrière, Sahar Ghannay, Bassam Jabaian, Nathalie Camelin, Yannick Estève

Pretrained models through self-supervised learning have been recently introduced for both acoustic and language modeling.

Language Modelling Self-Supervised Learning +3

Analyzing BERT Cross-lingual Transfer Capabilities in Continual Sequence Labeling

1 code implementation • MMMPIE (COLING) 2022 • Juan Manuel Coria, Mathilde Veron, Sahar Ghannay, Guillaume Bernard, Hervé Bredin, Olivier Galibert, Sophie Rosset

Knowledge transfer between neural language models is a widely used technique that has proven to improve performance in a multitude of natural language tasks, in particular with the recent rise of large pre-trained language models like BERT.

Continual Learning Cross-Lingual Transfer +6

Small Language Models are Good Too: An Empirical Study of Zero-Shot Classification

no code implementations • 17 Apr 2024 • Pierre Lepagnol, Thomas Gerald, Sahar Ghannay, Christophe Servan, Sophie Rosset

This study is part of the debate on the efficiency of large versus small language models for text classification by prompting. We assess the performance of small language models in zero-shot text classification, challenging the prevailing dominance of large models. Across 15 datasets, our investigation benchmarks language models from 77M to 40B parameters using different architectures and scoring functions.

Text Classification +2

New Semantic Task for the French Spoken Language Understanding MEDIA Benchmark

1 code implementation • 28 Mar 2024 • Nadège Alavoine, Gaëlle Laperrière, Christophe Servan, Sahar Ghannay, Sophie Rosset

A combination of multiple datasets, including the MEDIA dataset, was suggested for training this joint model.

Intent Classification +4

mALBERT: Is a Compact Multilingual BERT Model Still Worth It?

no code implementations • 27 Mar 2024 • Christophe Servan, Sahar Ghannay, Sophie Rosset

Within the current trend of Pretrained Language Models (PLMs), more and more criticism is emerging about the ethical and ecological impact of such models.

Language Modelling Question Answering

Semantic enrichment towards efficient speech representations

no code implementations • 3 Jul 2023 • Gaëlle Laperrière, Ha Nguyen, Sahar Ghannay, Bassam Jabaian, Yannick Estève

Over the past few years, self-supervised learned speech representations have emerged as fruitful replacements for conventional surface representations when solving Spoken Language Understanding (SLU) tasks.

Spoken Language Understanding

Benchmarking Transformers-based models on French Spoken Language Understanding tasks

no code implementations • 19 Jul 2022 • Oralie Cattan, Sahar Ghannay, Christophe Servan, Sophie Rosset

In this paper, we propose a unified benchmark focused on evaluating model quality and ecological impact on two well-known French spoken language understanding tasks.

Benchmarking Spoken Language Understanding

Overlap-aware low-latency online speaker diarization based on end-to-end local segmentation

1 code implementation • 14 Sep 2021 • Juan M. Coria, Hervé Bredin, Sahar Ghannay, Sophie Rosset

We propose to address online speaker diarization as a combination of incremental clustering and local diarization applied to a rolling buffer updated every 500ms.

Clustering Segmentation +2

Where are we in semantic concept extraction for Spoken Language Understanding?

no code implementations • 24 Jun 2021 • Sahar Ghannay, Antoine Caubrière, Salima Mdhaffar, Gaëlle Laperrière, Bassam Jabaian, Yannick Estève

More recent works on self-supervised training with unlabeled data open new perspectives in terms of performance for automatic speech recognition and natural language processing.

Automatic Speech Recognition (ASR) +7

Neural Networks approaches focused on French Spoken Language Understanding: application to the MEDIA Evaluation Task

1 code implementation • COLING 2020 • Sahar Ghannay, Christophe Servan, Sophie Rosset

In this paper, we present a study on a French Spoken Language Understanding (SLU) task: the MEDIA task.

Spoken Language Understanding Word Embeddings

LIMSI_UPV at SemEval-2020 Task 9: Recurrent Convolutional Neural Network for Code-mixed Sentiment Analysis

no code implementations • SEMEVAL 2020 • Somnath Banerjee, Sahar Ghannay, Sophie Rosset, Anne Vilnat, Paolo Rosso

This paper describes the participation of the LIMSI_UPV team in SemEval-2020 Task 9: Sentiment Analysis for Code-Mixed Social Media Text.

Sentiment Analysis

LIMSI_UPV at SemEval-2020 Task 9: Recurrent Convolutional Neural Network for Code-mixed Sentiment Analysis

1 code implementation • 30 Aug 2020 • Somnath Banerjee, Sahar Ghannay, Sophie Rosset, Anne Vilnat, Paolo Rosso

This paper describes the participation of the LIMSI_UPV team in SemEval-2020 Task 9: Sentiment Analysis for Code-Mixed Social Media Text.

Sentiment Analysis

A Metric Learning Approach to Misogyny Categorization

no code implementations • WS 2020 • Juan Manuel Coria, Sahar Ghannay, Sophie Rosset, Hervé Bredin

The task of automatic misogyny identification and categorization has not received as much attention as other natural language tasks have, even though it is crucial for identifying hate speech in social Internet interactions.

Metric Learning Sentence +2

A Comparison of Metric Learning Loss Functions for End-To-End Speaker Verification

1 code implementation • 31 Mar 2020 • Juan M. Coria, Hervé Bredin, Sahar Ghannay, Sophie Rosset

Despite the growing popularity of metric learning approaches, very little work has attempted to perform a fair comparison of these techniques for speaker verification.

Metric Learning Speaker Verification

End-to-end named entity extraction from speech

no code implementations • 30 May 2018 • Sahar Ghannay, Antoine Caubrière, Yannick Estève, Antoine Laurent, Emmanuel Morin

Until now, NER from speech has been performed through a pipeline that first runs automatic speech recognition (ASR) on the audio and then runs NER on the ASR outputs.

Automatic Speech Recognition (ASR) +5

TED-LIUM 3: twice as much data and corpus repartition for experiments on speaker adaptation

3 code implementations • 12 May 2018 • François Hernandez, Vincent Nguyen, Sahar Ghannay, Natalia Tomashenko, Yannick Estève

We present the recent development on Automatic Speech Recognition (ASR) systems in comparison with the two previous releases of the TED-LIUM Corpus from 2012 and 2014.

Automatic Speech Recognition (ASR) +1

Simulating ASR errors for training SLU systems

no code implementations • LREC 2018 • Edwin Simonnet, Sahar Ghannay, Nathalie Camelin, Yannick Estève

Automatic Speech Recognition (ASR) Data Augmentation +3

ASR error management for improving spoken language understanding

no code implementations • 26 May 2017 • Edwin Simonnet, Sahar Ghannay, Nathalie Camelin, Yannick Estève, Renato de Mori

This paper addresses the problem of automatic speech recognition (ASR) error detection and its use for improving spoken language understanding (SLU) systems.

Automatic Speech Recognition (ASR) +4

Evaluation of acoustic word embeddings

no code implementations • WS 2016 • Sahar Ghannay, Yannick Estève, Nathalie Camelin, Paul Deleglise

Speech Recognition Word Embeddings

Utilisation des représentations continues des mots et des paramètres prosodiques pour la détection d'erreurs dans les transcriptions automatiques de la parole (Combining continuous word representation and prosodic features for ASR error detection)

no code implementations • JEPTALNRECITAL 2016 • Sahar Ghannay, Yannick Estève, Nathalie Camelin, Camille Dutrey, Fabian Santiago, Martine Adda-Decker

In this article, we propose to study their use in a neural architecture for the task of detecting errors in automatic speech transcriptions.

Word Embedding Evaluation and Combination

no code implementations • LREC 2016 • Sahar Ghannay, Benoit Favre, Yannick Estève, Nathalie Camelin

Different approaches have been introduced to calculate word embeddings through neural networks.

Word Embeddings

Using Hypothesis Selection Based Features for Confusion Network MT System Combination

no code implementations • WS 2014 • Sahar Ghannay, Loïc Barrault

Language Modelling
