Spoken language identification

11 papers with code • 12 benchmarks • 3 datasets

Identify the language being spoken from an audio input only.

Most implemented papers

VoxLingua107: a Dataset for Spoken Language Recognition

alumae/torch-xvectors-wav 25 Nov 2020

Speech activity detection and speaker diarization are used to extract segments from the videos that contain speech.
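As a rough illustration of the segment-extraction step mentioned in this excerpt, the sketch below marks speech spans with a simple frame-energy threshold. It is not the VoxLingua107 pipeline: the function names, the 20 ms frame size, the threshold, and the minimum run length are all assumptions chosen for the example, and the input is a synthetic tone rather than real speech. Speaker diarization is not covered.

```python
import math


def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    n_frames = len(samples) // frame_len
    return [
        sum(x * x for x in samples[i * frame_len:(i + 1) * frame_len]) / frame_len
        for i in range(n_frames)
    ]


def speech_segments(samples, sample_rate, frame_ms=20, threshold=0.01, min_frames=3):
    """Return (start_sec, end_sec) spans whose frame energy exceeds the threshold.

    All parameter defaults are illustrative assumptions, not tuned values.
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    energies = frame_energies(samples, frame_len)
    segments = []
    start = None
    # A trailing 0.0 sentinel closes any run that reaches the end of the signal.
    for i, energy in enumerate(energies + [0.0]):
        if energy > threshold and start is None:
            start = i
        elif energy <= threshold and start is not None:
            if i - start >= min_frames:
                segments.append(
                    (start * frame_len / sample_rate, i * frame_len / sample_rate)
                )
            start = None
    return segments


# Synthetic input: 1 s of silence, 1 s of a 220 Hz tone, 1 s of silence.
sr = 8000
silence = [0.0] * sr
tone = [0.5 * math.sin(2 * math.pi * 220 * t / sr) for t in range(sr)]
signal = silence + tone + silence

print(speech_segments(signal, sr))  # prints [(1.0, 2.0)]
```

An energy threshold is only a stand-in here. Detectors used on real web video typically rely on trained models, because background music and noise defeat a fixed threshold.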

Automatic Dialect Detection in Arabic Broadcast Speech

Qatar-Computing-Research-Institute/dialectID 23 Sep 2015

We used these features in a binary classifier to discriminate between Modern Standard Arabic (MSA) and Dialectal Arabic, with an accuracy of 100%.

Language Identification Using Deep Convolutional Recurrent Neural Networks

HPI-DeepLearning/crnn-lid 16 Aug 2017

Language Identification (LID) systems are used to classify the spoken language from a given audio sample and are typically the first step for many spoken language processing tasks, such as Automatic Speech Recognition (ASR) systems.

Cross-Domain Adaptation of Spoken Language Identification for Related Languages: The Curious Case of Slavic Languages

uds-lsv/da-lang-id 2 Aug 2020

State-of-the-art spoken language identification (LID) systems, which are based on end-to-end deep neural networks, have shown remarkable success not only in discriminating between distant languages but also between closely-related languages or even different spoken varieties of the same language.

Triplet Entropy Loss: Improving The Generalisation of Short Speech Language Identification Systems

ruanvdmerwe/triplet-entropy-loss 3 Dec 2020

Even though the models trained using Triplet Entropy Loss showed a better understanding of the languages and higher accuracies, it appears as though the models still memorise word patterns present in the spectrograms rather than learning the finer nuances of a language.

BERT-LID: Leveraging BERT to Improve Spoken Language Identification

thusatlab/bert-lid 1 Mar 2022

Language identification has a profound impact on the multilingual interoperability of an intelligent speech system.

EfficientLEAF: A Faster LEarnable Audio Frontend of Questionable Use

cpjku/efficientleaf 12 Jul 2022

In audio classification, differentiable auditory filterbanks with few parameters cover the middle ground between hard-coded spectrograms and raw audio.

Distilled Non-Semantic Speech Embeddings with Binary Neural Networks for Low-Resource Devices

HarlinLee/brillsson 12 Jul 2022

This work introduces BRILLsson, a novel binary neural network-based representation learning model for a broad range of non-semantic speech tasks.

Improving Spoken Language Identification with Map-Mix

skit-ai/map-mix 16 Feb 2023

The pre-trained multi-lingual XLSR model generalizes well for language identification after fine-tuning on unseen languages.

Spoken Language Identification System for English-Mandarin Code-Switching Child-Directed Speech

shashikg/lid-code-switching 1 Jun 2023

This work focuses on improving the Spoken Language Identification (LangId) system for a challenge on developing robust language identification systems that are reliable for non-standard, accented (Singaporean accent), spontaneous code-switched, and child-directed speech collected via Zoom.