In this work, we develop several novel methods that leverage articulatory features to build phonetically informed word embeddings, and we present a set of phonetic word embeddings to encourage their evaluation, use, and further development by the community.
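As a minimal sketch of the general idea (not the specific methods developed here), a word-level phonetic embedding can be composed from per-phoneme articulatory feature vectors; the feature table and phoneme inventory below are hypothetical placeholders.

```python
import numpy as np

# Hypothetical articulatory features: [voiced, nasal, labial, coronal, high, back]
ARTICULATORY = {
    "p": [0, 0, 1, 0, 0, 0],
    "b": [1, 0, 1, 0, 0, 0],
    "m": [1, 1, 1, 0, 0, 0],
    "t": [0, 0, 0, 1, 0, 0],
    "i": [1, 0, 0, 0, 1, 0],
    "a": [1, 0, 0, 0, 0, 1],
}

def phonetic_embedding(phonemes):
    """Average the articulatory feature vectors of a word's phonemes."""
    feats = np.array([ARTICULATORY[p] for p in phonemes], dtype=float)
    return feats.mean(axis=0)

# e.g. a word pronounced /b a t/
print(phonetic_embedding(["b", "a", "t"]))
```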
Emergent language is unique among fields within machine learning for its open-endedness: it does not obviously present well-defined problems to be solved.
Specifically, we propose a variety of psycholinguistic factors -- semantic, distributional, and phonological -- that we hypothesize are predictive of lexical decline, the process by which words greatly decrease in frequency over time.
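As a hedged illustration of this kind of setup (not the authors' model), per-word features from the three factor families could be fed to a simple classifier that predicts whether a word declines; the feature names and synthetic data below are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical per-word features:
# [semantic density, contextual diversity, phonological neighborhood size]
X = rng.normal(size=(200, 3))
# Hypothetical labels: 1 = word greatly decreased in frequency over time
y = (X[:, 1] + 0.5 * X[:, 2] + rng.normal(scale=0.5, size=200) < 0).astype(int)

model = LogisticRegression().fit(X, y)
print(model.coef_)  # which factors are associated with decline in this toy setup
```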
Models pre-trained on multiple languages have shown significant promise for improving speech recognition, particularly for low-resource languages.
We introduce polyglot language models: recurrent neural network models trained to predict symbol sequences in many different languages, using shared symbol representations and conditioning on typological information about the language being predicted.
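A minimal sketch of this architecture, assuming PyTorch, an LSTM, and a fixed-length typology vector per language (an illustration of the idea, not the exact model described here):

```python
import torch
import torch.nn as nn

class PolyglotLM(nn.Module):
    def __init__(self, n_symbols, typology_dim, emb_dim=64, hidden_dim=128):
        super().__init__()
        # Symbol embeddings are shared across all languages.
        self.embed = nn.Embedding(n_symbols, emb_dim)
        # The typology vector is concatenated to every input embedding.
        self.rnn = nn.LSTM(emb_dim + typology_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, n_symbols)

    def forward(self, symbols, typology):
        # symbols: (batch, seq_len) symbol ids; typology: (batch, typology_dim)
        emb = self.embed(symbols)
        typ = typology.unsqueeze(1).expand(-1, emb.size(1), -1)
        hidden, _ = self.rnn(torch.cat([emb, typ], dim=-1))
        return self.out(hidden)  # logits for the next symbol at each position

# Toy usage: 2 sequences of length 5, 50 shared symbols, 10 typological features.
model = PolyglotLM(n_symbols=50, typology_dim=10)
logits = model(torch.randint(0, 50, (2, 5)), torch.rand(2, 10))
print(logits.shape)  # torch.Size([2, 5, 50])
```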