speaker-diarization
81 papers with code • 0 benchmarks • 0 datasets
Libraries
Use these libraries to find speaker-diarization models and implementations
Most implemented papers
AVA-AVD: Audio-Visual Speaker Diarization in the Wild
Audio-visual speaker diarization aims at detecting "who spoke when" using both auditory and visual signals.
Speaker Diarization with LSTM
For many years, i-vector based audio embedding techniques were the dominant approach for speaker verification and speaker diarization applications.
pyannote.audio: neural building blocks for speaker diarization
We introduce pyannote.audio, an open-source toolkit written in Python for speaker diarization.
Speech Recognition and Multi-Speaker Diarization of Long Conversations
Speech recognition (ASR) and speaker diarization (SD) models have traditionally been trained separately to produce rich conversation transcripts with speaker labels.
End-to-End Speaker Diarization for an Unknown Number of Speakers with Encoder-Decoder Based Attractors
End-to-end speaker diarization for an unknown number of speakers is addressed in this paper.
The Third DIHARD Diarization Challenge
DIHARD III was the third in a series of speaker diarization challenges intended to improve the robustness of diarization systems to variability in recording equipment, noise conditions, and conversational domain.
Speech Emotion Diarization: Which Emotion Appears When?
Speech Emotion Recognition (SER) typically relies on utterance-level solutions.
End-to-End Neural Speaker Diarization with Self-attention
Our method was even better than that of the state-of-the-art x-vector clustering-based method.
VoxLingua107: a Dataset for Spoken Language Recognition
Speech activity detection and speaker diarization are used to extract segments from the videos that contain speech.
A Comprehensive Evaluation of Incremental Speech Recognition and Diarization for Conversational AI
Automatic Speech Recognition (ASR) systems are increasingly powerful and more accurate, but also more numerous with several options existing currently as a service (e.g. Google, IBM, and Microsoft).