Search Results for author: Tanja Samardzic

Found 5 papers, 2 papers with code

ASR for Non-standardised Languages with Dialectal Variation: the case of Swiss German

no code implementations VarDial (COLING) 2020 Iuliia Nigmatulina, Tannon Kew, Tanja Samardzic

A formal comparison shows that the system trained on the normalised transcriptions achieves better results in word error rate (WER) (29.39%) but underperforms at the character level, suggesting dialectal transcriptions offer a viable solution for downstream applications where dialectal differences are important.

Automatic Speech Recognition (ASR)

Interpretability for Morphological Inflection: from Character-level Predictions to Subword-level Rules

1 code implementation EACL 2021 Tatyana Ruzsics, Olga Sozinova, Ximena Gutierrez-Vasques, Tanja Samardzic

We apply our methodology to analyze the model's decisions on three typologically different languages and find that a) our pattern extraction method applied to cross-attention weights uncovers variation in the form of inflection morphemes, b) pattern extraction from self-attention shows triggers for such variation, c) both types of patterns are closely aligned with grammar inflection classes and class assignment criteria, for all three languages.

Morphological Inflection

From characters to words: the turning point of BPE merges

1 code implementation EACL 2021 Ximena Gutierrez-Vasques, Christian Bentz, Olga Sozinova, Tanja Samardzic

The distributions of orthographic word types are very different across languages due to typological characteristics, different writing traditions and potentially other factors.
