Design principles of an open-source language modeling microservice package for AAC text-entry applications

no code implementations SLPAT (ACL) 2022 Brian Roark, Alexander Gutkin

We present MozoLM, an open-source language model microservice package intended for use in AAC text-entry applications, with a particular focus on the design principles of the library.

Language Modelling

The Taxonomy of Writing Systems: How to Measure How Logographic a System Is

no code implementations CL (ACL) 2021 Richard Sproat, Alexander Gutkin

Our work provides the first quantifiable measure of the notion of logography that accords with linguistic intuition and, we argue, provides better insight into what this notion means.

Towards Induction of Structured Phoneme Inventories

no code implementations 12 Oct 2020 Alexander Gutkin, Martin Jansche, Lucy Skidmore

This extended abstract, which surveys work on phonological typology, was prepared for "SIGTYP 2020: The Second Workshop on Computational Research in Linguistic Typology" to be held at EMNLP 2020.

NEMO: Frequentist Inference Approach to Constrained Linguistic Typology Feature Prediction in SIGTYP 2020 Shared Task

1 code implementation EMNLP (SIGTYP) 2020 Alexander Gutkin, Richard Sproat

This paper describes the NEMO submission to the SIGTYP 2020 shared task, which deals with the prediction of linguistic typological features for multiple languages using data derived from the World Atlas of Language Structures (WALS).

Linguistic Typology Features from Text: Inferring the Sparse Features of World Atlas of Language Structures

no code implementations 30 Apr 2020 Alexander Gutkin, Tatiana Merkulova, Martin Jansche

In this paper we investigate whether the various linguistic features from the World Atlas of Language Structures (WALS) can be reliably inferred from multilingual text.

Multi-Label Classification, Natural Language Processing

Sampling from Stochastic Finite Automata with Applications to CTC Decoding

1 code implementation 21 May 2019 Martin Jansche, Alexander Gutkin

We consider the problem of efficient sampling: drawing random string variates from the probability distribution represented by stochastic automata and transformations of those.

