Spoken language identification

11 papers with code • 12 benchmarks • 3 datasets

Identify the language being spoken from an audio input only.

Benchmarks

These leaderboards track progress in spoken language identification.

Dataset	Best Model
LRE07	CNN-LDE
YouTube News dataset (No Noise)	CRNN
YouTube News dataset (White Noise)	CRNN
VoxForge European	2D ConvNet(MixUp=YES)
Untranscribed mixed-speech dataset	SVM
VoxForge Commonwealth	2D ConvNet(MixUp=YES)
IndicTTS	CRNN
VoxForge	LEAF
KALAKA-3	Model on the automatically filtered (cleaned) data
VOXLINGUA107	Noisy
YouTube News dataset (Crackling Noise)	Inception-v3 CRNN
YouTube News dataset (Background Music)	Inception-v3 CRNN

Most implemented papers

VoxLingua107: a Dataset for Spoken Language Recognition

alumae/torch-xvectors-wav • 25 Nov 2020

Speech activity detection and speaker diarization are used to extract segments from the videos that contain speech.
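
As a rough illustration of the speech activity detection step mentioned above, the sketch below marks frames as speech when their mean energy exceeds a fixed threshold and merges consecutive speech frames into segments. This is an assumption-laden toy, not the method used to build VoxLingua107: the function names `frame_energies` and `speech_segments`, the frame length, the threshold, and the minimum run length are all hypothetical, and speaker diarization is not covered.

```python
def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def speech_segments(samples, frame_len=160, threshold=0.01, min_frames=3):
    """Return (start_sample, end_sample) spans judged to contain speech.

    Assumption: a plain energy threshold stands in for a real speech
    activity detector; all default values are illustrative only.
    """
    active = [e > threshold for e in frame_energies(samples, frame_len)]
    segments, start = [], None
    # The trailing False is a sentinel that closes a run reaching the end.
    for idx, is_speech in enumerate(active + [False]):
        if is_speech and start is None:
            start = idx  # a new run of speech frames begins
        elif not is_speech and start is not None:
            if idx - start >= min_frames:  # drop very short bursts
                segments.append((start * frame_len, idx * frame_len))
            start = None
    return segments


# Toy signal: silence, a loud stretch, then silence again.
signal = [0.0] * 480 + [0.5, -0.5] * 400 + [0.0] * 320
print(speech_segments(signal))  # [(480, 1280)]
```

A production pipeline would typically replace the fixed threshold with a trained detector, since a constant energy cutoff cannot tell speech from music or loud noise.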

Automatic Dialect Detection in Arabic Broadcast Speech

Qatar-Computing-Research-Institute/dialectID • 23 Sep 2015

We used these features in a binary classifier to discriminate between Modern Standard Arabic (MSA) and Dialectal Arabic, with an accuracy of 100%.

Language Identification Using Deep Convolutional Recurrent Neural Networks

HPI-DeepLearning/crnn-lid • 16 Aug 2017

Language Identification (LID) systems are used to classify the spoken language from a given audio sample and are typically the first step for many spoken language processing tasks, such as Automatic Speech Recognition (ASR) systems.

Cross-Domain Adaptation of Spoken Language Identification for Related Languages: The Curious Case of Slavic Languages

uds-lsv/da-lang-id • 2 Aug 2020

State-of-the-art spoken language identification (LID) systems, which are based on end-to-end deep neural networks, have shown remarkable success not only in discriminating between distant languages but also between closely-related languages or even different spoken varieties of the same language.

Triplet Entropy Loss: Improving The Generalisation of Short Speech Language Identification Systems

ruanvdmerwe/triplet-entropy-loss • 3 Dec 2020

Even though the models trained using Triplet Entropy Loss showed a better understanding of the languages and higher accuracies, it appears as though the models still memorise word patterns present in the spectrograms rather than learning the finer nuances of a language.

BERT-LID: Leveraging BERT to Improve Spoken Language Identification

thusatlab/bert-lid • 1 Mar 2022

Language identification has a profound impact on the multilingual interoperability of an intelligent speech system.

EfficientLEAF: A Faster LEarnable Audio Frontend of Questionable Use

cpjku/efficientleaf • 12 Jul 2022

In audio classification, differentiable auditory filterbanks with few parameters cover the middle ground between hard-coded spectrograms and raw audio.

Distilled Non-Semantic Speech Embeddings with Binary Neural Networks for Low-Resource Devices

HarlinLee/brillsson • 12 Jul 2022

This work introduces BRILLsson, a novel binary neural network-based representation learning model for a broad range of non-semantic speech tasks.

Improving Spoken Language Identification with Map-Mix

skit-ai/map-mix • 16 Feb 2023

The pre-trained multi-lingual XLSR model generalizes well for language identification after fine-tuning on unseen languages.

Spoken Language Identification System for English-Mandarin Code-Switching Child-Directed Speech

shashikg/lid-code-switching • 1 Jun 2023

This work improves the Spoken Language Identification (LangId) system for a challenge on developing robust language identification systems that remain reliable for non-standard, accented (Singaporean accent), spontaneous code-switched, and child-directed speech collected via Zoom.
