Anlirika: An LSTM–CNN Flow Twister for Spoken Language Identification

NAACL (SIGTYP) 2021 · Andreas Scherbakov, Liam Whittle, Ritesh Kumar, Siddharth Singh, Matthew Coleman, Ekaterina Vylomova

The paper presents Anlirika’s submission to the SIGTYP 2021 Shared Task on Robust Spoken Language Identification. The task aims at building a robust system that generalizes well across different domains and speakers. The training data is limited to a single domain, with predominantly a single speaker per language, while the validation and test samples are drawn from diverse datasets and multiple speakers. We experiment with a neural system comprising a combination of dense, convolutional, and recurrent layers designed to generalize better and to obtain speaker-invariant representations. We demonstrate that the task in its constrained form (without using external data or augmenting the training set with samples from the validation set) is still challenging. Our best system, trained on data augmented with validation samples, achieves 29.9% accuracy on the test data.
