no code implementations • ACL 2018 • Milton King, Paul Cook
In this paper we propose and evaluate models for classifying VNC usages as idiomatic or literal, based on a variety of approaches to forming distributed representations.
no code implementations • SEMEVAL 2018 • Milton King, Ali Hakimi Parizi, Paul Cook
In this paper we present three unsupervised models for capturing discriminative attributes based on information from word embeddings, WordNet, and sentence-level word co-occurrence frequency.
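As a rough illustration of the embedding-based component, one unsupervised heuristic predicts that an attribute discriminates between two words when it is markedly more similar to the first word's embedding than to the second's. The toy vectors, threshold, and function names below are illustrative assumptions, not the systems' actual setup:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

# Toy vectors standing in for pretrained word embeddings.
EMB = {
    "banana": [0.9, 0.8, 0.1],
    "cucumber": [0.1, 0.9, 0.8],
    "yellow": [1.0, 0.6, 0.0],
}

def is_discriminative(word1, word2, attribute, threshold=0.1):
    """Predict whether `attribute` characterises word1 but not word2,
    based on the difference in embedding similarity."""
    diff = cosine(EMB[word1], EMB[attribute]) - cosine(EMB[word2], EMB[attribute])
    return diff > threshold

print(is_discriminative("banana", "cucumber", "yellow"))
```

In practice the threshold would be tuned on development data, and WordNet or co-occurrence evidence could be combined with the embedding signal.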
no code implementations • SEMEVAL 2017 • Waseem Gharbieh, Virendrakumar Bhavsar, Paul Cook
Multiword expressions (MWEs) are lexical items that can be decomposed into multiple component words, but have properties that are unpredictable with respect to their component words.
no code implementations • COLING 2018 • Ali Hakimi Parizi, Paul Cook
In this paper, we propose the first model for multiword expression (MWE) compositionality prediction based on character-level neural network language models.
no code implementations • WS 2017 • Milton King, Paul Cook
Usage similarity (USim) is an approach to determining word meaning in context that does not rely on a sense inventory.
no code implementations • COLING 2016 • Bahar Salehi, Paul Cook, Timothy Baldwin
Much previous research on multiword expressions (MWEs) has focused on the token- and type-level tasks of MWE identification and extraction, respectively.
no code implementations • SEMEVAL 2019 • Ali Hakimi Parizi, Milton King, Paul Cook
In this paper we apply a range of approaches to language modeling (including word-level n-gram and neural language models, and character-level neural language models) to the problem of detecting hate speech and offensive language.
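A minimal sketch of one such component: a character-level n-gram language model with add-one smoothing, trained per class, so that a new text can be scored under each class's model. The class structure, smoothing, and hyperparameters below are illustrative assumptions, not the paper's exact configuration:

```python
import math
from collections import Counter

class CharNgramLM:
    """Character-level n-gram language model with add-one smoothing."""

    def __init__(self, n=3):
        self.n = n
        self.ngrams = Counter()
        self.contexts = Counter()
        self.vocab = set()

    def train(self, texts):
        for t in texts:
            s = "^" * (self.n - 1) + t + "$"  # boundary padding
            self.vocab.update(s)
            for i in range(len(s) - self.n + 1):
                gram = s[i:i + self.n]
                self.ngrams[gram] += 1
                self.contexts[gram[:-1]] += 1

    def logprob(self, text):
        """Smoothed log-probability of `text` under this model."""
        s = "^" * (self.n - 1) + text + "$"
        v = len(self.vocab) or 1
        lp = 0.0
        for i in range(len(s) - self.n + 1):
            gram = s[i:i + self.n]
            lp += math.log((self.ngrams[gram] + 1) /
                           (self.contexts[gram[:-1]] + v))
        return lp
```

A text is then assigned the label of the class model that gives it the higher log-probability.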
no code implementations • LREC 2016 • Richard Fothergill, Paul Cook, Timothy Baldwin
Web corpora are often constructed automatically, and their contents are therefore often not well understood.
no code implementations • LREC 2016 • SoHyun Park, Afsaneh Fazly, Annie Lee, Brandon Seibel, Wenjie Zi, Paul Cook
We then propose a supervised approach to classifying out-of-vocabulary terms according to these categories, drawing on features based on word embeddings and on linguistic knowledge of common properties of out-of-vocabulary terms.
no code implementations • LREC 2020 • Milton King, Paul Cook
In this work, we consider the problem of personalizing language models, that is, building language models that are tailored to the writing style of an individual.
no code implementations • LREC 2020 • Ali Hakimi Parizi, Paul Cook
This is particularly problematic for low-resource and morphologically-rich languages, which often have relatively high OOV rates.
Tasks: Bilingual Lexicon Induction, Cross-Lingual Word Embeddings, +1
no code implementations • LREC 2020 • Jeremie Boudreau, Akankshya Patra, Ashima Suvarna, Paul Cook
In this paper we consider a range of n-gram and RNN language models for Mi'kmaq.
no code implementations • 12 Jun 2020 • Arman Kabiri, Paul Cook
Most prior work on definition modeling has not accounted for polysemy, or has done so by considering definition modeling for a target word in a given context.
no code implementations • Joint Conference on Lexical and Computational Semantics 2020 • Ali Hakimi Parizi, Paul Cook
In this paper, we propose a novel method for learning cross-lingual word embeddings that incorporates sub-word information during training and is able to learn high-quality embeddings from modest amounts of monolingual data and a bilingual lexicon.
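The sub-word information used during training can be pictured as fastText-style character n-grams: each word is decomposed into boundary-marked n-grams whose vectors are learned alongside (and summed with) the word's own vector. A sketch of the decomposition step only, with an assumed n-gram range:

```python
def char_ngrams(word, nmin=3, nmax=5):
    """fastText-style character n-grams of `word`, with '<' and '>'
    marking the word boundaries. The (nmin, nmax) range is an
    illustrative assumption."""
    w = "<" + word + ">"
    grams = []
    for n in range(nmin, nmax + 1):
        for i in range(len(w) - n + 1):
            grams.append(w[i:i + n])
    return grams

print(char_ngrams("cat"))
```

Because many n-grams are shared across languages with related orthographies, representations built from them can transfer even for words unseen in the bilingual lexicon.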
no code implementations • Joint Conference on Lexical and Computational Semantics 2021 • Ali Hakimi Parizi, Paul Cook
Cross-lingual word embeddings provide a way for information to be transferred between languages.
Tasks: Bilingual Lexicon Induction, Cross-Lingual Word Embeddings, +1
no code implementations • SEMEVAL 2021 • Milton King, Ali Hakimi Parizi, Samin Fakharian, Paul Cook
In this paper, we present three supervised systems for English lexical complexity prediction of single and multiword expressions for SemEval-2021 Task 1.
1 code implementation • ACL (MWE) 2021 • Samin Fakharian, Paul Cook
We consider monolingual experiments for English and Russian, and show that the proposed model outperforms previous approaches, including when the model is tested on instances of PIE types that were not observed during training.
no code implementations • RANLP 2021 • Diego Bear, Paul Cook
In this paper, we consider cross-lingual definition generation.
no code implementations • RANLP 2021 • Milton King, Paul Cook
We propose a novel WSD dataset and show that personalizing a WSD system with knowledge of an author’s sense distributions or predominant senses can greatly increase its performance.
no code implementations • LREC 2022 • Diego Bear, Paul Cook
Because no large corpora of running text exist for Wolastoqey, in this paper we leverage a bilingual dictionary to learn Wolastoqey word embeddings, encoding each word's English definition into a vector representation using pretrained English word and sequence representation models.
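One simple way to encode an English definition into a vector, as a stand-in for the pretrained word and sequence representation models used in the paper, is to average the pretrained vectors of the definition's words. The toy vectors and function name below are illustrative:

```python
def embed_definition(definition, vectors, dim):
    """Represent a definition as the average of the pretrained vectors
    of its words; out-of-vocabulary words are skipped. Returns a zero
    vector if no word of the definition is in `vectors`."""
    words = definition.lower().split()
    vecs = [vectors[w] for w in words if w in vectors]
    if not vecs:
        return [0.0] * dim
    return [sum(col) / len(vecs) for col in zip(*vecs)]

# Toy English vectors standing in for a pretrained embedding model.
TOY = {"a": [1.0, 0.0], "dog": [0.0, 1.0]}
print(embed_definition("a dog", TOY, dim=2))
```

The resulting vector is then used as the embedding of the Wolastoqey headword whose dictionary entry carries that English definition.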
no code implementations • SIGUL (LREC) 2022 • Diego Bear, Paul Cook
Finite-state approaches to morphological analysis have been shown to improve the performance of natural language processing systems for polysynthetic languages, in which words are generally composed of many morphemes, on tasks such as language modelling (Schwartz et al., 2020).