Search Results for author: Attapol T. Rutherford

Found 4 papers, 1 paper with code

PhayaThaiBERT: Enhancing a Pretrained Thai Language Model with Unassimilated Loanwords

1 code implementation • 21 Nov 2023 • Panyut Sriwirote, Jalinee Thapiang, Vasan Timtong, Attapol T. Rutherford

While WangchanBERTa has become the de facto standard in transformer-based Thai language modeling, it still has shortcomings in regard to the understanding of foreign words, most notably English words, which are often borrowed without orthographic assimilation into Thai in many contexts.

Language Modelling

More Than Words: Collocation Tokenization for Latent Dirichlet Allocation Models

no code implementations • 24 Aug 2021 • Jin Cheevaprawatdomrong, Alexandra Schofield, Attapol T. Rutherford

Traditionally, Latent Dirichlet Allocation (LDA) ingests words in a collection of documents to discover their latent topics using word-document co-occurrences.
