2 code implementations • COLING 2022 • Kalvin Chang, Chenxuan Cui, Youngmin Kim, David R. Mortensen
Most comparative datasets of Chinese varieties are not digital; however, Wiktionary includes a wealth of transcriptions of words from these varieties.
1 code implementation • 26 Aug 2024 • Kalvin Chang, Yi-Hui Chou, Jiatong Shi, Hsuan-Ming Chen, Nicole Holliday, Odette Scharenborg, David R. Mortensen
Underperformance of ASR systems for speakers of African American Vernacular English (AAVE) and other marginalized language varieties is a well-documented phenomenon, and one that reinforces the stigmatization of these varieties.
no code implementations • 18 Jun 2024 • Atharva Naik, Kexun Zhang, Nathaniel Robinson, Aravind Mysore, Clayton Marr, Hong Sng Rebecca Byrnes, Anna Cai, Kalvin Chang, David Mortensen
Historical linguists have long written a kind of incompletely formalized "program" that converts reconstructed words in an ancestor language into words in one of its attested descendants; such a program consists of a series of ordered string rewrite functions (called sound laws).
1 code implementation • 20 Feb 2024 • Ryan Soh-Eun Shim, Kalvin Chang, David R. Mortensen
Received wisdom in linguistic typology holds that if the structure of a language becomes more complex in one dimension, it will simplify in another, building on the assumption that all languages are equally complex (Joseph and Newmeyer, 2012).
1 code implementation • 2 Feb 2024 • Kalvin Chang, Nathaniel R. Robinson, Anna Cai, Ting Chen, Annie Zhang, David R. Mortensen
We describe a set of new methods to partially automate linguistic phylogenetic inference given (1) cognate sets with their respective protoforms and sound laws, (2) a mapping from phones to their articulatory features and (3) a typological database of sound changes.
1 code implementation • 6 Dec 2023 • Yi-Hui Chou, Kalvin Chang, Meng-Ju Wu, Winston Ou, Alice Wen-Hsin Bi, Carol Yang, Bryan Y. Chen, Rong-Wei Pai, Po-Yen Yeh, Jo-Peng Chiang, Iu-Tshian Phoann, Winnie Chang, Chenxuan Cui, Noel Chen, Jiatong Shi
Taiwanese Hokkien is declining in use and status due to a language shift towards Mandarin in Taiwan.
1 code implementation • 4 Jul 2023 • Young Min Kim, Kalvin Chang, Chenxuan Cui, David Mortensen
We update their model with the state-of-the-art seq2seq model: the Transformer.
1 code implementation • 5 Apr 2023 • Vilém Zouhar, Kalvin Chang, Chenxuan Cui, Nathaniel Carlson, Nathaniel Robinson, Mrinmaya Sachan, David Mortensen
Mapping words into a fixed-dimensional vector space is the backbone of modern NLP.