Search Results for author: Karin Verspoor

Found 67 papers, 19 papers with code

Learning from Unlabelled Data for Clinical Semantic Textual Similarity

no code implementations EMNLP (ClinicalNLP) 2020 Yuxia Wang, Karin Verspoor, Timothy Baldwin

Domain pretraining followed by task fine-tuning has become the standard paradigm for NLP tasks, but requires in-domain labelled data for task fine-tuning.

Semantic Textual Similarity Sentence +1

Improved Topic Representations of Medical Documents to Assist COVID-19 Literature Exploration

no code implementations EMNLP (NLP-COVID19) 2020 Yulia Otmakhova, Karin Verspoor, Timothy Baldwin, Simon Šuster

Efficient discovery and exploration of biomedical literature has grown in importance in the context of the COVID-19 pandemic, and topic-based methods such as latent Dirichlet allocation (LDA) are a useful tool for this purpose.

Specificity Topic Models

Noisy Label Regularisation for Textual Regression

1 code implementation COLING 2022 Yuxia Wang, Timothy Baldwin, Karin Verspoor

Training with noisy labelled data is known to be detrimental to model performance, especially for high-capacity neural network models in low-resource domains.

regression

READ-BioMed@SocialDisNER: Adaptation of an Annotation System to Spanish Tweets

no code implementations SMM4H (COLING) 2022 Antonio Jimeno Yepes, Karin Verspoor

We describe the work of the READ-BioMed team for the preparation of a submission to the SocialDisNER Disease Named Entity Recognition (NER) Task (Task 10) in 2022.

named-entity-recognition Named Entity Recognition +1

Cross-linguistic Comparison of Linguistic Feature Encoding in BERT Models for Typologically Different Languages

no code implementations NAACL (SIGTYP) 2022 Yulia Otmakhova, Karin Verspoor, Jey Han Lau

Though recently there has been an increased interest in how pre-trained language models encode different linguistic features, there is still a lack of systematic comparison between languages with different morphology and syntax.

Using Discourse Structure to Differentiate Focus Entities from Background Entities in Scientific Literature

no code implementations ALTA 2021 Antonio Jimeno Yepes, Ameer Albahem, Karin Verspoor

In developing systems to identify focus entities in scientific literature, we face the problem of discriminating key entities of interest from other potentially relevant entities of the same type mentioned in the articles.

The patient is more dead than alive: exploring the current state of the multi-document summarisation of the biomedical literature

no code implementations ACL 2022 Yulia Otmakhova, Karin Verspoor, Timothy Baldwin, Jey Han Lau

Although multi-document summarisation (MDS) of the biomedical literature is a highly valuable task that has recently attracted substantial interest, evaluation of the quality of biomedical summaries lacks consistency and transparency.

Deep Outdated Fact Detection in Knowledge Graphs

no code implementations 6 Feb 2024 Huiling Tu, Shuo Yu, Vidya Saikrishna, Feng Xia, Karin Verspoor

Knowledge graphs (KGs) have garnered significant attention for their vast potential across diverse domains.

Knowledge Graphs

EMBRE: Entity-aware Masking for Biomedical Relation Extraction

no code implementations 15 Jan 2024 Mingjie Li, Karin Verspoor

Information extraction techniques, including named entity recognition (NER) and relation extraction (RE), are crucial in many domains to support making sense of vast amounts of unstructured text data by identifying and connecting relevant information.

named-entity-recognition Named Entity Recognition +3

Principles from Clinical Research for NLP Model Generalization

no code implementations 7 Nov 2023 Aparna Elangovan, Jiayuan He, Yuan Li, Karin Verspoor

In clinical research, generalizability depends on (a) internal validity of experiments to ensure controlled measurement of cause and effect, and (b) external validity or transportability of the results to the wider population.

Relation Extraction

Effects of Human Adversarial and Affable Samples on BERT Generalization

no code implementations 12 Oct 2023 Aparna Elangovan, Jiayuan He, Yuan Li, Karin Verspoor

BERT-based models have had strong performance on leaderboards, yet have been demonstrably worse in real-world settings requiring generalization.

Relation Extraction text-classification +1

Collective Human Opinions in Semantic Textual Similarity

1 code implementation 8 Aug 2023 Yuxia Wang, Shimin Tao, Ning Xie, Hao Yang, Timothy Baldwin, Karin Verspoor

Despite the subjective nature of semantic textual similarity (STS) and pervasive disagreements in STS annotation, existing benchmarks have used averaged human ratings as the gold standard.

Semantic Textual Similarity Sentence +1

Improving Text-based Early Prediction by Distillation from Privileged Time-Series Text

no code implementations 26 Jan 2023 Jinghui Liu, Daniel Capurro, Anthony Nguyen, Karin Verspoor

In this study, we propose to treat this neglected text as privileged information available during training to enhance early prediction modeling through knowledge distillation, presented as Learning using Privileged tIme-sEries Text (LuPIET).

Knowledge Distillation Time Series +2

Cross-modal Clinical Graph Transformer for Ophthalmic Report Generation

no code implementations CVPR 2022 Mingjie Li, Wenjia Cai, Karin Verspoor, Shirui Pan, Xiaodan Liang, Xiaojun Chang

To endow models with the capability of incorporating expert knowledge, we propose a Cross-modal clinical Graph Transformer (CGT) for ophthalmic report generation (ORG), in which clinical relation triples are injected into the visual features as prior knowledge to drive the decoding procedure.

Clinical Knowledge Medical Report Generation

Improving negation detection with negation-focused pre-training

no code implementations NAACL 2022 Thinh Hung Truong, Timothy Baldwin, Trevor Cohn, Karin Verspoor

Negation is a common linguistic feature that is crucial in many language understanding tasks, yet it remains a hard problem due to diversity in its expression in different types of text.

Data Augmentation Negation +1

ITTC @ TREC 2021 Clinical Trials Track

no code implementations 16 Feb 2022 Thinh Hung Truong, Yulia Otmakhova, Rahmad Mahendra, Timothy Baldwin, Jey Han Lau, Trevor Cohn, Lawrence Cavedon, Damiano Spina, Karin Verspoor

This paper describes the submissions of the Natural Language Processing (NLP) team from the Australian Research Council Industrial Transformation Training Centre (ITTC) for Cognitive Computing in Medical Technologies to the TREC 2021 Clinical Trials Track.

Retrieval

MPVNN: Mutated Pathway Visible Neural Network Architecture for Interpretable Prediction of Cancer-specific Survival Risk

1 code implementation 2 Feb 2022 Gourab Ghosh Roy, Nicholas Geard, Karin Verspoor, Shan He

We show that trained MPVNN architecture interpretation, which points to smaller sets of genes connected by signal flow within the PI3K-Akt pathway that are important in risk prediction for particular cancer types, is reliable.

Survival Analysis

Large-scale protein-protein post-translational modification extraction with distant supervision and confidence calibrated BioBERT

1 code implementation 6 Jan 2022 Aparna Elangovan, Yuan Li, Douglas E. V. Pires, Melissa J. Davis, Karin Verspoor

However, by combining high confidence and low variation to identify high quality predictions, tuning the predictions for precision, we retained 19% of the test predictions with 100% precision.

Memorization vs. Generalization: Quantifying Data Leakage in NLP Performance Evaluation

no code implementations EACL 2021 Aparna Elangovan, Jiayuan He, Karin Verspoor

Public datasets are often used to evaluate the efficacy and generalizability of state-of-the-art methods for many tasks in natural language processing (NLP).

Memorization named-entity-recognition +3

Memorization vs. Generalization: Quantifying Data Leakage in NLP Performance Evaluation

1 code implementation 3 Feb 2021 Aparna Elangovan, Jiayuan He, Karin Verspoor

Public datasets are often used to evaluate the efficacy and generalizability of state-of-the-art methods for many tasks in natural language processing (NLP).

Memorization named-entity-recognition +3

Assigning function to protein-protein interactions: a weakly supervised BioBERT based approach using PubMed abstracts

no code implementations 20 Aug 2020 Aparna Elangovan, Melissa Davis, Karin Verspoor

Motivation: Protein-protein interactions (PPI) are critical to the function of proteins in both normal and diseased cells, and many critical protein functions are mediated by interactions. Knowledge of the nature of these interactions is important for the construction of networks to analyse biological data.

COVID-SEE: Scientific Evidence Explorer for COVID-19 Related Research

no code implementations 18 Aug 2020 Karin Verspoor, Simon Šuster, Yulia Otmakhova, Shevon Mendis, Zenan Zhai, Biaoyan Fang, Jey Han Lau, Timothy Baldwin, Antonio Jimeno Yepes, David Martinez

We present COVID-SEE, a system for medical literature discovery based on the concept of information exploration, which builds on several distinct text analysis and natural language processing methods to structure and organise information in publications, and augments search by providing a visual overview supporting exploration of a collection to identify key articles of interest.

WikiUMLS: Aligning UMLS to Wikipedia via Cross-lingual Neural Ranking

1 code implementation COLING 2020 Afshin Rahimi, Timothy Baldwin, Karin Verspoor

We present our work on aligning the Unified Medical Language System (UMLS) to Wikipedia, to facilitate manual alignment of the two resources.

Improving Chemical Named Entity Recognition in Patents with Contextualized Word Embeddings

1 code implementation WS 2019 Zenan Zhai, Dat Quoc Nguyen, Saber A. Akhondi, Camilo Thorne, Christian Druckenbrodt, Trevor Cohn, Michelle Gregory, Karin Verspoor

In this paper, we explore the NER performance of a BiLSTM-CRF model utilising pre-trained word embeddings, character-level word representations and contextualized ELMo word representations for chemical patents.

named-entity-recognition Named Entity Recognition +2

End-to-end neural relation extraction using deep biaffine attention

1 code implementation 29 Dec 2018 Dat Quoc Nguyen, Karin Verspoor

We propose a neural network model for joint extraction of named entities and relations between them, without any hand-crafted features.

General Classification Relation +1

Findings of the WMT 2018 Biomedical Translation Shared Task: Evaluation on Medline test sets

no code implementations WS 2018 Mariana Neves, Antonio Jimeno Yepes, Aurélie Névéol, Cristian Grozea, Amy Siu, Madeleine Kittner, Karin Verspoor

Machine translation enables the automatic translation of textual documents between languages and can facilitate access to information only available in a given language for non-speakers of this language, e.g., research results presented in scientific publications.

Machine Translation Translation

Comparing CNN and LSTM character-level embeddings in BiLSTM-CRF models for chemical and disease named entity recognition

no code implementations WS 2018 Zenan Zhai, Dat Quoc Nguyen, Karin Verspoor

We compare the use of LSTM-based and CNN-based character-level word embeddings in BiLSTM-CRF models to approach chemical and disease named entity recognition (NER) tasks.

named-entity-recognition Named Entity Recognition +2

From POS tagging to dependency parsing for biomedical event extraction

2 code implementations 11 Aug 2018 Dat Quoc Nguyen, Karin Verspoor

Results: We perform an empirical study comparing state-of-the-art traditional feature-based and neural network-based models for two core natural language processing tasks of part-of-speech (POS) tagging and dependency parsing on two benchmark biomedical corpora, GENIA and CRAFT.

Dependency Parsing Event Extraction +3

Adjusting for Chance Clustering Comparison Measures

no code implementations 3 Dec 2015 Simone Romano, Nguyen Xuan Vinh, James Bailey, Karin Verspoor

In particular, the Adjusted Rand Index (ARI) based on pair-counting, and the Adjusted Mutual Information (AMI) based on Shannon information theory are very popular in the clustering community.

Clustering

A Framework to Adjust Dependency Measure Estimates for Chance

no code implementations 27 Oct 2015 Simone Romano, Nguyen Xuan Vinh, James Bailey, Karin Verspoor

For example: non-linear dependencies between two continuous variables can be explored with the Maximal Information Coefficient (MIC); and categorical variables that are dependent on the target class are selected using Gini gain in random forests.
