no code implementations • NAACL (SIGTYP) 2021 • Roman Bedyakin, Nikolay Mikhaylovskiy
This memo describes the NTR-TSU submission for the SIGTYP 2021 Shared Task on predicting language IDs from speech.
no code implementations • 8 Feb 2022 • Eduard Zubchuk, Dmitry Menshikov, Nikolay Mikhaylovskiy
Kiosks are a popular self-service option in many fast-food restaurants: they save time for visitors and labor for the fast-food chains.
no code implementations • 31 May 2021 • Roman Bedyakin, Nikolay Mikhaylovskiy
In this memo, we show that a convolutional neural network with a Self-Attentive Pooling layer achieves promising results on the language identification task in a low-resource setting, and we set a new state of the art (SOTA) on the Low Resource ASR challenge dataset.
no code implementations • 24 Apr 2021 • Roman Bedyakin, Nikolay Mikhaylovskiy
This memo describes the NTR-TSU submission for the SIGTYP 2021 Shared Task on predicting language IDs from speech.
1 code implementation • 30 Mar 2021 • Rostislav Kolobov, Olga Okhapkina, Olga Omelchishina, Andrey Platunov, Roman Bedyakin, Vyacheslav Moshkin, Dmitry Menshikov, Nikolay Mikhaylovskiy
The performance of automated speech recognition (ASR) systems is well known to vary across application domains.
Ranked #1 on Speech Recognition on MediaSpeech
1 code implementation • 12 Jan 2021 • Roman Vygon, Nikolay Mikhaylovskiy
In the past few years, triplet loss-based metric embeddings have become a de facto standard for several important computer vision problems, most notably person re-identification.
Ranked #1 on Keyword Spotting on Google Speech Commands