Search Results for author: Paul Rayson

Found 36 papers, 10 papers with code

Understanding who uses Reddit: Profiling individuals with a self-reported bipolar disorder diagnosis

1 code implementation NAACL (CLPsych) 2021 Glorianna Jagfeld, Fiona Lobban, Paul Rayson, Steven H. Jones

Recently, research on mental health conditions using public online data, including Reddit, has surged in NLP and health research but has not reported user characteristics, which are important to judge generalisability of findings.

GIBBON: General-purpose Information-Based Bayesian OptimisatioN

no code implementations 5 Feb 2021 Henry B. Moss, David S. Leslie, Javier Gonzalez, Paul Rayson

This paper describes a general-purpose extension of max-value entropy search, a popular approach for Bayesian Optimisation (BO).

Bayesian Optimisation Point Processes

The National Corpus of Contemporary Welsh: Project Report | Y Corpws Cenedlaethol Cymraeg Cyfoes: Adroddiad y Prosiect

no code implementations 12 Oct 2020 Dawn Knight, Steve Morris, Tess Fitzpatrick, Paul Rayson, Irena Spasić, Enlli Môn Thomas

This report provides an overview of the CorCenCC project and the online corpus resource that was developed as a result of work on the project.

BOSS: Bayesian Optimization over String Spaces

1 code implementation NeurIPS 2020 Henry B. Moss, Daniel Beck, Javier Gonzalez, David S. Leslie, Paul Rayson

This article develops a Bayesian optimization (BO) method which acts directly over raw strings, proposing the first uses of string kernels and genetic algorithms within BO loops.

BOSH: Bayesian Optimization by Sampling Hierarchically

no code implementations 2 Jul 2020 Henry B. Moss, David S. Leslie, Paul Rayson

Deployments of Bayesian Optimization (BO) for functions with stochastic evaluations, such as parameter tuning via cross validation and simulation optimization, typically optimize an average of a fixed set of noisy realizations of the objective function.

reinforcement-learning

MUMBO: MUlti-task Max-value Bayesian Optimization

no code implementations 22 Jun 2020 Henry B. Moss, David S. Leslie, Paul Rayson

MUMBO is scalable and efficient, allowing multi-task Bayesian optimization to be deployed in problems with rich parameter and fidelity spaces.

Infrastructure for Semantic Annotation in the Genomics Domain

no code implementations LREC 2020 Mahmoud El-Haj, Nathan Rutherford, Matthew Coole, Ignatius Ezeani, Sheryl Prentice, Nancy Ide, Jo Knight, Scott Piao, John Mariani, Paul Rayson, Keith Suderman

The corpus database is distributed to permit fast indexing, and provides a simple web front-end with corpus linguistics methods for sub-corpus comparison and retrieval.

Igbo-English Machine Translation: An Evaluation Benchmark

no code implementations 1 Apr 2020 Ignatius Ezeani, Paul Rayson, Ikechukwu Onyenwe, Chinedu Uchechukwu, Mark Hepple

Although researchers and practitioners are pushing the boundaries and enhancing the capacities of NLP tools and methods, works on African languages are lagging.

Machine Translation Part-Of-Speech Tagging +1

Leveraging Pre-Trained Embeddings for Welsh Taggers

no code implementations WS 2019 Ignatius Ezeani, Scott Piao, Steven Neale, Paul Rayson, Dawn Knight

While the application of word embedding models to downstream Natural Language Processing (NLP) tasks has been shown to be successful, the benefits for low-resource languages are somewhat limited due to the lack of adequate data for training the models.

FIESTA: Fast IdEntification of State-of-The-Art models using adaptive bandit algorithms

1 code implementation ACL 2019 Henry B. Moss, Andrew Moore, David S. Leslie, Paul Rayson

We present FIESTA, a model selection approach that significantly reduces the computational resources required to reliably identify state-of-the-art performance from large collections of candidate models.

Model Selection Sentiment Analysis

In Search of Meaning: Lessons, Resources and Next Steps for Computational Analysis of Financial Discourse

no code implementations 28 Mar 2019 Mahmoud El-Haj, Paul Rayson, Martin Walker, Steven Young, Vasiliki Simaki

We critically assess mainstream accounting and finance research applying methods from computational linguistics (CL) to study financial discourse.

Named Entity Recognition NER +1

Using J-K-fold Cross Validation To Reduce Variance When Tuning NLP Models

1 code implementation COLING 2018 Henry Moss, David Leslie, Paul Rayson

K-fold cross validation (CV) is a popular method for estimating the true performance of machine learning models, allowing model selection and parameter tuning.

Document Classification General Classification +3

Using J-K fold Cross Validation to Reduce Variance When Tuning NLP Models

1 code implementation 19 Jun 2018 Henry B. Moss, David S. Leslie, Paul Rayson

K-fold cross validation (CV) is a popular method for estimating the true performance of machine learning models, allowing model selection and parameter tuning.

Document Classification General Classification +3

Lancaster A at SemEval-2017 Task 5: Evaluation metrics matter: predicting sentiment from financial news headlines

1 code implementation SEMEVAL 2017 Andrew Moore, Paul Rayson

This paper describes our participation in Task 5 track 2 of SemEval 2017 to predict the sentiment of financial news headlines for a specific company on a continuous scale between -1 and 1.

Lexical Coverage Evaluation of Large-scale Multilingual Semantic Lexicons for Twelve Languages

1 code implementation LREC 2016 Scott Piao, Paul Rayson, Dawn Archer, Francesca Bianchi, Carmen Dayrell, Mahmoud El-Haj, Ricardo-María Jiménez, Dawn Knight, Michal Křen, Laura Löfberg, Rao Muhammad Adeel Nawab, Jawad Shafi, Phoey Lee Teh, Olga Mudraya

Lexical coverage is an important factor concerning the quality of the lexicons and the performance of the corpus annotation tools, and in this experiment we focus on evaluating the lexical coverage achieved by the multilingual lexicons and semantic annotation tools based on them.

Learning Tone and Attribution for Financial Text Mining

no code implementations LREC 2016 Mahmoud El-Haj, Paul Rayson, Steve Young, Andrew Moore, Martin Walker, Thomas Schleicher, Vasiliki Athanasakou

Previous studies have only applied manual content analysis on a small scale to reveal such a bias in the narrative section of annual financial reports.

UPPC - Urdu Paraphrase Plagiarism Corpus

no code implementations LREC 2016 Muhammad Sharjeel, Paul Rayson, Rao Muhammad Adeel Nawab

Paraphrase plagiarism is a significant and widespread problem and research shows that it is hard to detect.

OSMAN ― A Novel Arabic Readability Metric

no code implementations LREC 2016 Mahmoud El-Haj, Paul Rayson

The Arabic sentences were written with the absence of diacritics and in order to count the number of syllables we added the diacritics in using an open source tool called Mishkal.

Detecting Document Structure in a Very Large Corpus of UK Financial Reports

no code implementations LREC 2014 Mahmoud El-Haj, Paul Rayson, Steve Young, Martin Walker

In this paper we present the evaluation of our automatic methods for detecting and extracting document structure in annual financial reports.

Text Generation

Experiences with Parallelisation of an Existing NLP Pipeline: Tagging Hansard

no code implementations LREC 2014 Stephen Wattam, Paul Rayson, Marc Alexander, Jean Anderson

This is contrasted with a description of the cluster on which it was to run, and specific limitations are discussed such as the overhead of using SAN-based storage.

Document Attrition in Web Corpora: an Exploration

no code implementations LREC 2012 Stephen Wattam, Paul Rayson, Damon Berridge

Increases in the use of web data for corpus-building, coupled with the use of specialist, single-use corpora, make for an increasing reliance on language that changes quickly, affecting the long-term validity of studies based on these methods.
