no code implementations • NLPerspectives (LREC) 2022 • Lucy Havens, Benjamin Bach, Melissa Terras, Beatrice Alex
This paper presents an overview of text visualization techniques relevant for data perspectivism, aiming to facilitate analysis of annotated datasets for the datasets’ creators and stakeholders.
no code implementations • SMM4H (COLING) 2022 • Imane Guellil, Jinge Wu, Honghan Wu, Tony Sun, Beatrice Alex
Our team participated in the tasks related to the Identification of Adverse Drug Events (ADEs), the classification of change in medication (change-med) and the classification of self-report of vaccination (self-vaccine).
no code implementations • COLING (CRAC) 2020 • Vebjørn Espeland, Beatrice Alex, Benjamin Bach
In this paper we describe our attempt to increase the amount of information that can be retrieved through active learning sessions compared to previous approaches.
no code implementations • NAACL (GeBNLP) 2022 • Lucy Havens, Beatrice Alex, Benjamin Bach, Melissa Terras
Mitigating harms from gender biased language in Natural Language Processing (NLP) systems remains a challenge, and the situated nature of language means bias is inescapable in NLP data.
no code implementations • CLTW (LREC) 2022 • William Lamb, Beatrice Alex, Mark Sinclair
Like most other minority languages, Scottish Gaelic has limited tools and resources available for Natural Language Processing research and applications.
no code implementations • CLTW (LREC) 2022 • Lucy Evans, William Lamb, Mark Sinclair, Beatrice Alex
This paper discusses our efforts to develop a full automatic speech recognition (ASR) system for Scottish Gaelic, starting from a point of limited resource.
Automatic Speech Recognition (ASR)
no code implementations • BioNLP (ACL) 2022 • Matúš Falis, Hang Dong, Alexandra Birch, Beatrice Alex
We propose data augmentation and synthesis techniques in order to address these scenarios.
1 code implementation • EMNLP (Louhi) 2020 • Andreas Grivas, Beatrice Alex, Claire Grover, Richard Tobin, William Whiteley
Our analysis finds that our rule-based system outperforms the neural models on both datasets and seems to generalise to the out-of-sample dataset.
1 code implementation • 1 Apr 2025 • Lucy Havens, Benjamin Bach, Melissa Terras, Beatrice Alex
Despite numerous efforts to mitigate their biases, ML systems continue to harm already-marginalized people.
no code implementations • 8 Nov 2024 • JA Meaney, Beatrice Alex, William Lamb
Folktales are a rich resource of knowledge about the society and culture of a civilisation.
1 code implementation • 24 Oct 2024 • Aryo Pradipta Gema, Chen Jin, Ahmed Abdulaal, Tom Diethe, Philip Teare, Beatrice Alex, Pasquale Minervini, Amrutha Saseendran
Large Language Models (LLMs) often hallucinate, producing unfaithful or factually incorrect outputs by misrepresenting the provided context or incorrectly recalling internal knowledge.
no code implementations • 22 Oct 2024 • Jesse Phitidis, Alison Q. O'Neil, William N. Whiteley, Beatrice Alex, Joanna M. Wardlaw, Miguel O. Bernabeu, Maria Valdés Hernández
The systems to quantify WMH and atrophy are focused on neurodegenerative disease support, where these CVD markers are also of significance.
no code implementations • 17 Sep 2024 • Andreas Grivas, Claire Grover, Richard Tobin, Clare Llewellyn, Eleojo Oluwaseun Abubakar, Chunyu Zheng, Chris Dibben, Alan Marshall, Jamie Pearce, Beatrice Alex
More specifically, we show how we can use Natural Language Processing (NLP) to unlock further information about neighbourhoods by analysing, geoparsing and clustering news articles.
no code implementations • 23 Jul 2024 • Giorgos Lysandrou, Roma English Owen, Vanja Popovic, Grant Le Brun, Aryo Pradipta Gema, Beatrice Alex, Elizabeth A. L. Fairley
However, the abundance of non-patient posts on social media necessitates filtering out such irrelevant content to distinguish the genuine voices of patients, a task we refer to as patient voice classification.
no code implementations • 20 Jun 2024 • Abul Hasan, Jinge Wu, Quang Ngoc Nguyen, Salomé Andres, Imane Guellil, Huayu Zhang, Arlene Casey, Beatrice Alex, Bruce Guthrie, Honghan Wu
Specifically, using K-Tokeniser, the language models would only require 50% of the training data to achieve the best performance of the baseline tokeniser using all training data in the concept extraction task and less than 20% of the data for the automated coding task.
no code implementations • 28 May 2024 • Aryo Pradipta Gema, Chaeeun Lee, Pasquale Minervini, Luke Daines, T. Ian Simpson, Beatrice Alex
The MEDIQA-CORR 2024 shared task aims to assess the ability of Large Language Models (LLMs) to identify and correct medical errors in clinical notes.
1 code implementation • 30 Mar 2024 • Aryo Pradipta Gema, Giwon Hong, Pasquale Minervini, Luke Daines, Beatrice Alex
The NLI4CT task assesses Natural Language Inference systems in predicting whether hypotheses entail or contradict evidence from Clinical Trial Reports.
1 code implementation • 24 Jan 2024 • Matúš Falis, Aryo Pradipta Gema, Hang Dong, Luke Daines, Siddharth Basetti, Michael Holder, Rose S Penfold, Alexandra Birch, Beatrice Alex
Neural coding models were trained on baseline and augmented data and evaluated on a MIMIC-IV test set.
no code implementations • 30 Nov 2023 • Giorgos Lysandrou, Roma English Owen, Vanja Popovic, Grant Le Brun, Beatrice Alex, Elizabeth A. L. Fairley
We used linguistic analysis to understand and identify similarities between datasets, across patient language, between data sources (Reddit, SocialGist) and therapeutic domains (cardiovascular, oncology, immunology, neurology).
1 code implementation • 6 Jul 2023 • Aryo Pradipta Gema, Pasquale Minervini, Luke Daines, Tom Hope, Beatrice Alex
In this study, we propose a two-step PEFT framework and evaluate it in the clinical domain.
1 code implementation • 11 May 2022 • Hang Dong, Víctor Suárez-Paniagua, Huayu Zhang, Minhong Wang, Arlene Casey, Emma Davidson, Jiaoyan Chen, Beatrice Alex, William Whiteley, Honghan Wu
Computational text phenotyping is the practice of identifying patients with certain disorders and traits from clinical notes.
1 code implementation • 21 Mar 2022 • Hang Dong, Matúš Falis, William Whiteley, Beatrice Alex, Joshua Matterson, Shaoxiong Ji, Jiaoyan Chen, Honghan Wu
Knowledge-based methods that represent and reason about the standard, explainable process of a task may need to be incorporated into deep learning-based methods for clinical coding.
1 code implementation • EMNLP 2021 • Matúš Falis, Hang Dong, Alexandra Birch, Beatrice Alex
We propose a set of metrics for hierarchical evaluation using the depth-based representation.
Multi-Label Text Classification
no code implementations • NAACL (TeachingNLP) 2021 • Beatrice Alex, Clare Llewellyn, Pawel Michal Orzechowski, Maria Boutchkova
In this paper we provide an account of how we ported a text and data mining course online in summer 2020 as a result of the COVID-19 pandemic and how we improved it in a second pilot run.
no code implementations • 18 Feb 2021 • Arlene Casey, Emma Davidson, Michael Poon, Hang Dong, Daniel Duma, Andreas Grivas, Claire Grover, Víctor Suárez-Paniagua, Richard Tobin, William Whiteley, Honghan Wu, Beatrice Alex
Understanding recent developments in NLP application to radiology is of significance but recent reviews on this are limited.
no code implementations • GeBNLP (COLING) 2020 • Lucy Havens, Melissa Terras, Benjamin Bach, Beatrice Alex
We propose a bias-aware methodology to engage with power relations in natural language processing (NLP) research.
no code implementations • LREC 2020 • Rosa Filgueira, Claire Grover, Melissa Terras, Beatrice Alex
This paper describes work in progress on devising automatic and parallel methods for geoparsing large digital historical textual data by combining the strengths of three natural language processing (NLP) tools, the Edinburgh Geoparser, spaCy and defoe, and employing different tokenisation and named entity recognition (NER) techniques.
no code implementations • 4 Feb 2020 • Arlene Casey, Mike Bennett, Richard Tobin, Claire Grover, Iona Walker, Lukas Engelmann, Beatrice Alex
Our interdisciplinary research investigates more than 100 reports from the third plague pandemic (1894-1952) evaluating ways of building a corpus to extract and structure this narrative information through text mining and manual annotation.
no code implementations • 10 Mar 2019 • Philip John Gorinski, Honghan Wu, Claire Grover, Richard Tobin, Conn Talbot, Heather Whalley, Cathie Sudlow, William Whiteley, Beatrice Alex
This work investigates multiple approaches to Named Entity Recognition (NER) for text in Electronic Health Record (EHR) data.
no code implementations • LREC 2016 • Beatrice Alex, Clare Llewellyn, Claire Grover, Jon Oberlander, Richard Tobin
As tweet-level geotagging remains rare, most prior work exploited tweet content, timezone and network information to inform geolocation, or else relied on off-the-shelf tools to geolocate users from location information in their user profiles.