no code implementations • LREC 2022 • David R. Mortensen, Xinyu Zhang, Chenxuan Cui, Katherine Zhang
This paper describes the first publicly available corpus of Hmong, a minority language of China, Vietnam, Laos, Thailand, and various countries in Europe and the Americas.
2 code implementations • COLING 2022 • Kalvin Chang, Chenxuan Cui, Youngmin Kim, David R. Mortensen
Most comparative datasets of Chinese varieties are not digital; however, Wiktionary includes a wealth of transcriptions of words from these varieties.
no code implementations • 24 Apr 2024 • Chenxuan Cui, Ying Chen, Qinxin Wang, David R. Mortensen
Proto-form reconstruction has been a painstaking process for linguists.
1 code implementation • 6 Dec 2023 • Yi-Hui Chou, Kalvin Chang, Meng-Ju Wu, Winston Ou, Alice Wen-Hsin Bi, Carol Yang, Bryan Y. Chen, Rong-Wei Pai, Po-Yen Yeh, Jo-Peng Chiang, Iu-Tshian Phoann, Winnie Chang, Chenxuan Cui, Noel Chen, Jiatong Shi
Taiwanese Hokkien is declining in use and status due to a language shift towards Mandarin in Taiwan.
1 code implementation • 4 Jul 2023 • Young Min Kim, Kalvin Chang, Chenxuan Cui, David Mortensen
We update the RNN-based protoform reconstruction model of Meloni et al. (2021) with the state-of-the-art seq2seq model: the Transformer.
1 code implementation • 5 Apr 2023 • Vilém Zouhar, Kalvin Chang, Chenxuan Cui, Nathaniel Carlson, Nathaniel Robinson, Mrinmaya Sachan, David Mortensen
Mapping words into a fixed-dimensional vector space is the backbone of modern NLP.
no code implementations • NAACL 2022 • Chenxuan Cui, Katherine J. Zhang, David R. Mortensen
Mortensen (2006) claims that (1) the linear ordering of elaborate expressions (EEs) and coordinate compounds (CCs) in Hmong, Lahu, and Chinese can be predicted via phonological hierarchies and (2) these phonological hierarchies lack a clear phonetic rationale.