no code implementations • 28 May 2024 • Xiang Dai, Sarvnaz Karimi, Abeed Sarker, Ben Hachey, Cecile Paris
Domain generalisation - the ability of a machine learning model to perform well on new, unseen domains (text types) - is under-explored.
no code implementations • Findings of the Association for Computational Linguistics 2020 • Xiang Dai, Sarvnaz Karimi, Ben Hachey, Cecile Paris
Recent studies on domain-specific BERT models show that effectiveness on downstream tasks can be improved when models are pretrained on in-domain data.
Ranked #3 on Clinical Concept Extraction on 2010 i2b2/VA
1 code implementation • ACL 2020 • Xiang Dai, Sarvnaz Karimi, Ben Hachey, Cecile Paris
Unlike widely used Named Entity Recognition (NER) data sets in generic domains, biomedical NER data sets often contain mentions consisting of discontinuous spans.
no code implementations • ACL 2019 • Nicky Ringland, Xiang Dai, Ben Hachey, Sarvnaz Karimi, Cecile Paris, James R. Curran
Named entity recognition (NER) is widely used in natural language processing applications and downstream tasks.
1 code implementation • NAACL 2019 • Xiang Dai, Sarvnaz Karimi, Ben Hachey, Cecile Paris
Word vectors and Language Models (LMs) pretrained on a large amount of unlabelled data can dramatically improve various Natural Language Processing (NLP) tasks.
Ranked #1 on Named Entity Recognition (NER) on WetLab
no code implementations • WS 2018 • Kylie Radford, Louise Lavrencic, Ruth Peters, Kim Kiely, Ben Hachey, Scott Nowson, Will Radford
The CLPsych 2018 Shared Task B explores how childhood essays can predict psychological distress throughout the author's life.
no code implementations • ACL 2017 • Sam Wei, Igor Korostil, Joel Nothman, Ben Hachey
We propose novel radical features from automatic translation for event extraction.
1 code implementation • EACL 2017 • Andrew Chisholm, Will Radford, Ben Hachey
We investigate the generation of one-sentence Wikipedia biographies from facts derived from Wikidata slot-value pairs.
no code implementations • 20 Feb 2017 • Bo Han, Will Radford, Anaïs Cadilhac, Art Harol, Andrew Chisholm, Ben Hachey
Text generation is increasingly common but often requires manual post-editing where high precision is critical to end users.
no code implementations • ALTA 2016 • Xavier Holt, Will Radford, Ben Hachey
The timeline generation task summarises an entity's biography by selecting stories representing key events from a large pool of relevant documents.
no code implementations • 7 Nov 2016 • Will Radford, Andrew Chisholm, Ben Hachey, Bo Han
We report on an exploratory analysis of Emoji Dick, a project that leverages crowdsourcing to translate Melville's Moby Dick into emoji.
1 code implementation • TACL 2015 • Andrew Chisholm, Ben Hachey
Entity disambiguation with Wikipedia relies on structured information from redirect pages, article text, inter-article links, and categories.