Search Results for author: Hervé Bredin

Found 15 papers, 12 papers with code

TristouNet: Triplet Loss for Speaker Turn Embedding

6 code implementations · 14 Sep 2016 · Hervé Bredin

TristouNet is a neural network architecture based on Long Short-Term Memory recurrent networks, designed to project speech sequences into a fixed-dimensional Euclidean space.

Change Detection
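A minimal sketch of the triplet-loss setup behind TristouNet, assuming PyTorch; the bidirectional LSTM, pooling, layer sizes, and margin below are illustrative placeholders, not the paper's exact configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SequenceEmbedder(nn.Module):
    """Project a variable-length feature sequence to a unit-norm embedding."""
    def __init__(self, n_features=35, hidden=16, dim=16):
        super().__init__()
        # bidirectional LSTM over the acoustic feature sequence
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True, bidirectional=True)
        self.linear = nn.Linear(2 * hidden, dim)

    def forward(self, x):                       # x: (batch, frames, n_features)
        out, _ = self.lstm(x)
        pooled = out.mean(dim=1)                # temporal average pooling
        return F.normalize(self.linear(pooled), p=2, dim=1)  # Euclidean embedding

embedder = SequenceEmbedder()
triplet_loss = nn.TripletMarginLoss(margin=0.2, p=2)

# anchor and positive share a speaker, negative comes from another speaker
anchor, positive, negative = (torch.randn(8, 100, 35) for _ in range(3))
loss = triplet_loss(embedder(anchor), embedder(positive), embedder(negative))
loss.backward()
```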

LSTM based Similarity Measurement with Spectral Clustering for Speaker Diarization

1 code implementation · 23 Jul 2019 · Qingjian Lin, Ruiqing Yin, Ming Li, Hervé Bredin, Claude Barras

An increasing number of neural network approaches have achieved considerable improvements on submodules of speaker diarization systems, including speaker change detection and segment-wise speaker embedding extraction.

Change Detection · Clustering · +2
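A hedged sketch of the clustering stage: once a segment-by-segment similarity matrix has been scored (by a pairwise LSTM scorer in the paper), spectral clustering groups segments into speakers. scikit-learn is assumed here, and the affinity values are random placeholders.

```python
import numpy as np
from sklearn.cluster import SpectralClustering

n_segments = 20
rng = np.random.default_rng(0)
similarity = rng.random((n_segments, n_segments))   # stand-in for LSTM-scored similarities
similarity = (similarity + similarity.T) / 2        # symmetrize
np.fill_diagonal(similarity, 1.0)

clustering = SpectralClustering(
    n_clusters=3,                # assumed number of speakers
    affinity="precomputed",      # consume the similarity matrix directly
    assign_labels="kmeans",
    random_state=0,
)
speaker_labels = clustering.fit_predict(similarity)  # one speaker label per segment
print(speaker_labels)
```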

End-to-end Domain-Adversarial Voice Activity Detection

1 code implementation · 23 Oct 2019 · Marvin Lavechin, Marie-Philippe Gill, Ruben Bousbib, Hervé Bredin, Leibny Paola Garcia-Perera

In the in-domain scenario, where the training and test sets cover the exact same domains, we show that the domain-adversarial approach does not degrade the performance of the proposed end-to-end model.

Audio and Speech Processing · I.2.7
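A sketch of the gradient-reversal trick commonly used for domain-adversarial training, assuming PyTorch; the voice-activity and domain heads below are toy linear layers and random labels, not the paper's architecture or data.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.clone()

    @staticmethod
    def backward(ctx, grad_output):
        # reverse (and scale) the gradient flowing back into the shared encoder
        return -ctx.lambd * grad_output, None

features = torch.randn(4, 64, requires_grad=True)   # stand-in for shared encoder output
vad_head = nn.Linear(64, 2)                          # speech / non-speech
domain_head = nn.Linear(64, 3)                       # e.g. three recording domains

vad_loss = nn.functional.cross_entropy(vad_head(features), torch.randint(2, (4,)))
domain_logits = domain_head(GradReverse.apply(features, 1.0))
domain_loss = nn.functional.cross_entropy(domain_logits, torch.randint(3, (4,)))

# minimizing both losses pushes the encoder toward domain-invariant features
(vad_loss + domain_loss).backward()
```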

A Comparison of Metric Learning Loss Functions for End-To-End Speaker Verification

1 code implementation · 31 Mar 2020 · Juan M. Coria, Hervé Bredin, Sahar Ghannay, Sophie Rosset

Despite the growing popularity of metric learning approaches, very little work has attempted to perform a fair comparison of these techniques for speaker verification.

Metric Learning · Speaker Verification

An open-source voice type classifier for child-centered daylong recordings

1 code implementation · 26 May 2020 · Marvin Lavechin, Ruben Bousbib, Hervé Bredin, Emmanuel Dupoux, Alejandrina Cristia

Spontaneous conversations in real-world settings such as those found in child-centered recordings have been shown to be amongst the most challenging audio files to process.

Language Acquisition · Vocal Bursts Type Prediction

End-to-end speaker segmentation for overlap-aware resegmentation

2 code implementations · 8 Apr 2021 · Hervé Bredin, Antoine Laurent

Experiments on multiple speaker diarization datasets show that our model can be used with great success for both voice activity detection and overlapped speech detection.

Action Detection · Activity Detection · +5
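A hedged sketch of how voice activity detection and overlapped speech detection can be read off frame-level per-speaker activation scores, the kind of output an end-to-end segmentation model produces; NumPy only, with random scores and an illustrative 0.5 threshold.

```python
import numpy as np

rng = np.random.default_rng(0)
scores = rng.random((500, 3))            # (frames, speakers) activation scores
active = scores > 0.5                    # per-speaker speech / non-speech decision

voice_activity = active.any(axis=1)      # at least one speaker is active
overlapped = active.sum(axis=1) >= 2     # two or more simultaneous speakers
print(voice_activity.mean(), overlapped.mean())
```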

Overlap-aware low-latency online speaker diarization based on end-to-end local segmentation

1 code implementation · 14 Sep 2021 · Juan M. Coria, Hervé Bredin, Sahar Ghannay, Sophie Rosset

We propose to address online speaker diarization as a combination of incremental clustering and local diarization applied to a rolling buffer updated every 500ms.

Clustering · Segmentation · +2
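A toy sketch of the rolling-buffer idea: keep the last few seconds of audio, shift it by a fixed 500 ms step, and re-run local diarization on each buffer state. NumPy is assumed, the stream is fake audio, and `local_diarization` is a placeholder for the local end-to-end segmentation model, not the paper's code.

```python
import numpy as np

SAMPLE_RATE = 16_000
BUFFER_SECONDS = 5.0
STEP_SECONDS = 0.5                                   # 500 ms update, as in the paper
STEP_SAMPLES = int(STEP_SECONDS * SAMPLE_RATE)

def local_diarization(audio: np.ndarray) -> np.ndarray:
    """Placeholder for the local segmentation model: random speaker labels."""
    return np.random.randint(0, 3, size=len(audio) // 160)   # one label per 10 ms frame

buffer = np.zeros(int(BUFFER_SECONDS * SAMPLE_RATE), dtype=np.float32)
stream = np.random.randn(10 * SAMPLE_RATE).astype(np.float32)  # fake incoming audio

for start in range(0, len(stream) - STEP_SAMPLES, STEP_SAMPLES):
    chunk = stream[start:start + STEP_SAMPLES]
    buffer = np.concatenate([buffer[STEP_SAMPLES:], chunk])    # rolling buffer update
    local_labels = local_diarization(buffer)
    # incremental clustering would map these local labels to global speakers here
```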

BabySLM: language-acquisition-friendly benchmark of self-supervised spoken language models

1 code implementation · 2 Jun 2023 · Marvin Lavechin, Yaya Sy, Hadrien Titeux, María Andrea Cruz Blandón, Okko Räsänen, Hervé Bredin, Emmanuel Dupoux, Alejandrina Cristia

Self-supervised techniques for learning speech representations have been shown to develop linguistic competence from exposure to speech without the need for human labels.

Benchmarking · Language Acquisition

Powerset multi-class cross entropy loss for neural speaker diarization

1 code implementation · 19 Oct 2023 · Alexis Plaquet, Hervé Bredin

Since its introduction in 2019, the whole end-to-end neural diarization (EEND) line of work has been addressing speaker diarization as a frame-wise multi-label classification problem with permutation-invariant training.

Multi-class Classification · Multi-Label Classification · +2
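A hedged sketch of the powerset idea: frame-wise multi-label speaker activity is re-encoded as a single class per frame, one class per allowed subset of active speakers, so an ordinary cross-entropy loss applies. PyTorch is assumed, and the three-speaker / at-most-two-overlapping configuration and example frames are illustrative only.

```python
from itertools import combinations
import torch
import torch.nn.functional as F

speakers, max_overlap = 3, 2
# enumerate every allowed subset of active speakers, including silence
subsets = [frozenset(c) for k in range(max_overlap + 1)
           for c in combinations(range(speakers), k)]
class_of = {s: i for i, s in enumerate(subsets)}       # subset -> powerset class index

# example multi-label frame annotations: columns are speakers, 1 = active
multilabel = torch.tensor([[0, 0, 0],
                           [1, 0, 0],
                           [1, 1, 0]])
targets = torch.tensor([class_of[frozenset(torch.nonzero(row).flatten().tolist())]
                        for row in multilabel])

logits = torch.randn(len(targets), len(subsets), requires_grad=True)  # model output
loss = F.cross_entropy(logits, targets)                # standard multi-class loss
loss.backward()
```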

Analyzing BERT Cross-lingual Transfer Capabilities in Continual Sequence Labeling

1 code implementation · MMMPIE (COLING) 2022 · Juan Manuel Coria, Mathilde Veron, Sahar Ghannay, Guillaume Bernard, Hervé Bredin, Olivier Galibert, Sophie Rosset

Knowledge transfer between neural language models is a widely used technique that has proven to improve performance in a multitude of natural language tasks, in particular with the recent rise of large pre-trained language models like BERT.

Continual Learning · Cross-Lingual Transfer · +6
