Search Results for author: John P. McCrae

Found 49 papers, 11 papers with code

Constructing an Annotated Corpus of Verbal MWEs for English

no code implementations COLING 2018 Abigail Walsh, Claire Bonial, Kristina Geeraert, John P. McCrae, Nathan Schneider, Clarissa Somers

This paper describes the construction and annotation of a corpus of verbal MWEs for English, as part of the PARSEME Shared Task 1.1 on automatic identification of verbal MWEs.

Word Alignment

Temporal Analysis of Entity Relatedness and its Evolution using Wikipedia and DBpedia

no code implementations 12 Dec 2018 Narumol Prangnawarat, John P. McCrae, Conor Hayes

We then show that integrating multiple time frames in our methods can give a better overall similarity demonstrating that temporal evolution can have an important effect on entity relatedness.

Semantic Similarity Semantic Textual Similarity

Classification Benchmarks for Under-resourced Bengali Language based on Multichannel Convolutional-LSTM Network

1 code implementation 11 Apr 2020 Md. Rezaul Karim, Bharathi Raja Chakravarthi, John P. McCrae, Michael Cochez

Evaluations against several baseline embedding models, e.g., Word2Vec and GloVe, yield up to 92.30%, 82.25%, and 90.45% F1-scores in case of document classification, sentiment analysis, and hate speech detection, respectively, during 5-fold cross-validation tests.

Classification Document Classification +4

A Survey of Orthographic Information in Machine Translation

no code implementations 4 Aug 2020 Bharathi Raja Chakravarthi, Priya Rani, Mihael Arcan, John P. McCrae

It introduces under-resourced languages in terms of machine translation and how orthographic information can be utilised to improve machine translation.

Bilingual Lexicon Induction Translation

Unsupervised Deep Language and Dialect Identification for Short Texts

no code implementations COLING 2020 Koustava Goswami, Rajdeep Sarkar, Bharathi Raja Chakravarthi, Theodorus Fransen, John P. McCrae

Automatic Language Identification (LI) or Dialect Identification (DI) of short texts of closely related languages or dialects is one of the primary steps in many natural language processing pipelines.

Dialect Identification Sentence +1

Empowering recommender systems using automatically generated Knowledge Graphs and Reinforcement Learning

1 code implementation 11 Jul 2023 Ghanshyam Verma, Shovon Sengupta, Simon Simanta, Huan Chen, Janos A. Perge, Devishree Pillai, John P. McCrae, Paul Buitelaar

Personalized recommendations have a growing importance in direct marketing, which motivates research to enhance customer experiences by knowledge graph (KG) applications.

Decision Making Knowledge Graphs +3

Weakly-supervised Deep Cognate Detection Framework for Low-Resourced Languages Using Morphological Knowledge of Closely-Related Languages

1 code implementation 9 Nov 2023 Koustava Goswami, Priya Rani, Theodorus Fransen, John P. McCrae

We train an encoder to gain morphological knowledge of a language and transfer the knowledge to perform unsupervised and weakly-supervised cognate detection tasks with and without the pivot language for the closely-related languages.

Information Retrieval named-entity-recognition +3

Text Detoxification as Style Transfer in English and Hindi

no code implementations 12 Feb 2024 Sourabrata Mukherjee, Akanksha Bansal, Atul Kr. Ojha, John P. McCrae, Ondřej Dušek

This task contributes to safer and more respectful online communication and can be considered a Text Style Transfer (TST) task, where the text style changes while its content is preserved.

Multi-Task Learning Sentence +2

MaCmS: Magahi Code-mixed Dataset for Sentiment Analysis

no code implementations 7 Mar 2024 Priya Rani, Gaurav Negi, Theodorus Fransen, John P. McCrae

The present paper introduces new sentiment data, MaCMS, for Magahi-Hindi-English (MHE) code-mixed language, where Magahi is a less-resourced minority language.

Sentiment Analysis

Towards a Linking between WordNet and Wikidata

no code implementations EACL (GWC) 2021 John P. McCrae, David Cillessen

WordNet is the most widely used lexical resource for English, while Wikidata is one of the largest knowledge graphs of entities and concepts available.

Knowledge Graphs

Mapping WordNet Instances to Wikipedia

no code implementations GWC 2018 John P. McCrae

Lexical resources differ from encyclopaedic resources and represent two distinct types of resource covering general language and named entities respectively.

Improving Wordnets for Under-Resourced Languages Using Machine Translation

no code implementations GWC 2018 Bharathi Raja Chakravarthi, Mihael Arcan, John P. McCrae

In addition to that, we carried out a manual evaluation of the translations for the Tamil language, where we demonstrate that our approach can aid in improving wordnet resources for under-resourced Dravidian languages.

Machine Translation Translation

Towards a Crowd-Sourced WordNet for Colloquial English

no code implementations GWC 2018 John P. McCrae, Ian Wood, Amanda Hicks

Princeton WordNet is one of the most widely-used resources for natural language processing, but is updated only infrequently and cannot keep up with the fast-changing usage of the English language on social media platforms such as Twitter.

Bilingual Lexicon Induction across Orthographically-distinct Under-Resourced Dravidian Languages

no code implementations VarDial (COLING) 2020 Bharathi Raja Chakravarthi, Navaneethan Rajasekaran, Mihael Arcan, Kevin McGuinness, Noel E. O’Connor, John P. McCrae

Bilingual lexicons are a vital tool for under-resourced languages and recent state-of-the-art approaches to this leverage pretrained monolingual word embeddings using supervised or semi-supervised approaches.

Bilingual Lexicon Induction Word Embeddings

The GlobalWordNet Formats: Updates for 2020

1 code implementation EACL (GWC) 2021 John P. McCrae, Michael Wayne Goodman, Francis Bond, Alexandre Rademaker, Ewa Rudnicka, Luis Morgado Da Costa

The Global Wordnet Formats have been introduced to enable wordnets to have a common representation that can be integrated through the Global WordNet Grid.

Cross-lingual Sentence Embedding using Multi-Task Learning

no code implementations EMNLP 2021 Koustava Goswami, Sourav Dutta, Haytham Assem, Theodorus Fransen, John P. McCrae

We demonstrate the efficacy of an unsupervised as well as a weakly supervised variant of our framework on STS, BUCC and Tatoeba benchmark tasks.

Multi-Task Learning Semantic Similarity +6

CogALex-VI Shared Task: Bidirectional Transformer based Identification of Semantic Relations

no code implementations COLING (CogALex) 2020 Saurav Karmakar, John P. McCrae

This paper presents a bidirectional transformer based approach for recognising semantic relationships between a pair of words as proposed by CogALex VI shared task in 2020.

Towards the Construction of a WordNet for Old English

no code implementations LREC 2022 Fahad Khan, Francisco J. Minaya Gómez, Rafael Cruz González, Harry Diakoff, Javier E. Diaz Vera, John P. McCrae, Ciara O’Loughlin, William Michael Short, Sander Stolk

In this paper we will discuss our preliminary work towards the construction of a WordNet for Old English, taking our inspiration from other similar WN construction projects for ancient languages such as Ancient Greek, Latin and Sanskrit.

ULD-NUIG at Social Media Mining for Health Applications (#SMM4H) Shared Task 2021

no code implementations NAACL (SMM4H) 2021 Atul Kr. Ojha, Priya Rani, Koustava Goswami, Bharathi Raja Chakravarthi, John P. McCrae

Social media platforms such as Twitter and Facebook have been utilised for various research studies, from the cohort-level discussion to community-driven approaches to address the challenges in utilizing social media data for health, clinical and biomedical information.

named-entity-recognition Named Entity Recognition +1

English WordNet 2019 – An Open-Source WordNet for English

1 code implementation GWC 2019 John P. McCrae, Alexandre Rademaker, Francis Bond, Ewa Rudnicka, Christiane Fellbaum

We describe the release of a new wordnet for English based on the Princeton WordNet, but now developed under an open-source model.
