Search Results for author: Karin Verspoor

Found 68 papers, 19 papers with code

Learning from Unlabelled Data for Clinical Semantic Textual Similarity

no code implementations • EMNLP (ClinicalNLP) 2020 • Yuxia Wang, Karin Verspoor, Timothy Baldwin

Domain pretraining followed by task fine-tuning has become the standard paradigm for NLP tasks, but requires in-domain labelled data for task fine-tuning.

Semantic Textual Similarity Sentence +1

What does it take to bake a cake? The RecipeRef corpus and anaphora resolution in procedural text

1 code implementation • Findings (ACL) 2022 • Biaoyan Fang, Timothy Baldwin, Karin Verspoor

Procedural text contains rich anaphoric phenomena, yet has not received much attention in NLP.

Transfer Learning

Improved Topic Representations of Medical Documents to Assist COVID-19 Literature Exploration

no code implementations • EMNLP (NLP-COVID19) 2020 • Yulia Otmakhova, Karin Verspoor, Timothy Baldwin, Simon Šuster

Efficient discovery and exploration of biomedical literature has grown in importance in the context of the COVID-19 pandemic, and topic-based methods such as latent Dirichlet allocation (LDA) are a useful tool for this purpose.

Specificity Topic Models

READ-BioMed@SocialDisNER: Adaptation of an Annotation System to Spanish Tweets

no code implementations • SMM4H (COLING) 2022 • Antonio Jimeno Yepes, Karin Verspoor

We describe the work of the READ-BioMed team for the preparation of a submission to the SocialDisNER Disease Named Entity Recognition (NER) Task (Task 10) in 2022.

named-entity-recognition Named Entity Recognition +1

Cross-linguistic Comparison of Linguistic Feature Encoding in BERT Models for Typologically Different Languages

no code implementations • NAACL (SIGTYP) 2022 • Yulia Otmakhova, Karin Verspoor, Jey Han Lau

Though recently there has been an increased interest in how pre-trained language models encode different linguistic features, there is still a lack of systematic comparison between languages with different morphology and syntax.

The patient is more dead than alive: exploring the current state of the multi-document summarisation of the biomedical literature

no code implementations • ACL 2022 • Yulia Otmakhova, Karin Verspoor, Timothy Baldwin, Jey Han Lau

Although multi-document summarisation (MDS) of the biomedical literature is a highly valuable task that has recently attracted substantial interest, evaluation of the quality of biomedical summaries lacks consistency and transparency.

Noisy Label Regularisation for Textual Regression

1 code implementation • COLING 2022 • Yuxia Wang, Timothy Baldwin, Karin Verspoor

Training with noisy labelled data is known to be detrimental to model performance, especially for high-capacity neural network models in low-resource domains.

regression

Using Discourse Structure to Differentiate Focus Entities from Background Entities in Scientific Literature

no code implementations • ALTA 2021 • Antonio Jimeno Yepes, Ameer Albahem, Karin Verspoor

In developing systems to identify focus entities in scientific literature, we face the problem of discriminating key entities of interest from other potentially relevant entities of the same type mentioned in the articles.

Revisiting subword tokenization: A case study on affixal negation in large language models

no code implementations • 3 Apr 2024 • Thinh Hung Truong, Yulia Otmakhova, Karin Verspoor, Trevor Cohn, Timothy Baldwin

In this work, we measure the impact of affixal negation on modern English large language models (LLMs).

Negation Negation Detection

Deep Outdated Fact Detection in Knowledge Graphs

no code implementations • 6 Feb 2024 • Huiling Tu, Shuo Yu, Vidya Saikrishna, Feng Xia, Karin Verspoor

Knowledge graphs (KGs) have garnered significant attention for their vast potential across diverse domains.

Knowledge Graphs

EMBRE: Entity-aware Masking for Biomedical Relation Extraction

no code implementations • 15 Jan 2024 • Mingjie Li, Karin Verspoor

Information extraction techniques, including named entity recognition (NER) and relation extraction (RE), are crucial in many domains to support making sense of vast amounts of unstructured text data by identifying and connecting relevant information.

named-entity-recognition Named Entity Recognition +3

Principles from Clinical Research for NLP Model Generalization

no code implementations • 7 Nov 2023 • Aparna Elangovan, Jiayuan He, Yuan Li, Karin Verspoor

The NLP community typically relies on performance of a model on a held-out test set to assess generalization.

Relation Extraction

Effects of Human Adversarial and Affable Samples on BERT Generalization

no code implementations • 12 Oct 2023 • Aparna Elangovan, Jiayuan He, Yuan Li, Karin Verspoor

BERT-based models have had strong performance on leaderboards, yet have been demonstrably worse in real-world settings requiring generalization.

Relation Extraction text-classification +1

Collective Human Opinions in Semantic Textual Similarity

1 code implementation • 8 Aug 2023 • Yuxia Wang, Shimin Tao, Ning Xie, Hao Yang, Timothy Baldwin, Karin Verspoor

Despite the subjective nature of semantic textual similarity (STS) and pervasive disagreements in STS annotation, existing benchmarks have used averaged human ratings as the gold standard.

Semantic Textual Similarity Sentence +1

Language models are not naysayers: An analysis of language models on negation benchmarks

1 code implementation • 14 Jun 2023 • Thinh Hung Truong, Timothy Baldwin, Karin Verspoor, Trevor Cohn

Negation has been shown to be a major bottleneck for masked language models, such as BERT.

Negation

Improving Text-based Early Prediction by Distillation from Privileged Time-Series Text

no code implementations • 26 Jan 2023 • Jinghui Liu, Daniel Capurro, Anthony Nguyen, Karin Verspoor

In this study, we propose to treat this neglected text as privileged information available during training to enhance early prediction modeling through knowledge distillation, presented as Learning using Privileged tIme-sEries Text (LuPIET).

Knowledge Distillation Time Series +2

Not another Negation Benchmark: The NaN-NLI Test Suite for Sub-clausal Negation

1 code implementation • 6 Oct 2022 • Thinh Hung Truong, Yulia Otmakhova, Timothy Baldwin, Trevor Cohn, Jey Han Lau, Karin Verspoor

Negation is poorly captured by current language models, although the extent of this problem is not widely understood.

Natural Language Inference Negation

LED down the rabbit hole: exploring the potential of global attention for biomedical multi-document summarisation

2 code implementations • SDP (COLING) 2022 • Yulia Otmakhova, Hung Thinh Truong, Timothy Baldwin, Trevor Cohn, Karin Verspoor, Jey Han Lau

In this paper we report on our submission to the Multidocument Summarisation for Literature Review (MSLR) shared task.

Cross-modal Clinical Graph Transformer for Ophthalmic Report Generation

no code implementations • CVPR 2022 • Mingjie Li, Wenjia Cai, Karin Verspoor, Shirui Pan, Xiaodan Liang, Xiaojun Chang

To endow models with the capability of incorporating expert knowledge, we propose a Cross-modal clinical Graph Transformer (CGT) for ophthalmic report generation (ORG), in which clinical relation triples are injected into the visual features as prior knowledge to drive the decoding procedure.

Clinical Knowledge Medical Report Generation

Improving negation detection with negation-focused pre-training

no code implementations • NAACL 2022 • Thinh Hung Truong, Timothy Baldwin, Trevor Cohn, Karin Verspoor

Negation is a common linguistic feature that is crucial in many language understanding tasks, yet it remains a hard problem due to diversity in its expression in different types of text.

Data Augmentation Negation +1

ITTC @ TREC 2021 Clinical Trials Track

no code implementations • 16 Feb 2022 • Thinh Hung Truong, Yulia Otmakhova, Rahmad Mahendra, Timothy Baldwin, Jey Han Lau, Trevor Cohn, Lawrence Cavedon, Damiano Spina, Karin Verspoor

This paper describes the submissions of the Natural Language Processing (NLP) team from the Australian Research Council Industrial Transformation Training Centre (ITTC) for Cognitive Computing in Medical Technologies to the TREC 2021 Clinical Trials Track.

Retrieval

MPVNN: Mutated Pathway Visible Neural Network Architecture for Interpretable Prediction of Cancer-specific Survival Risk

1 code implementation • 2 Feb 2022 • Gourab Ghosh Roy, Nicholas Geard, Karin Verspoor, Shan He

We show that trained MPVNN architecture interpretation, which points to smaller sets of genes connected by signal flow within the PI3K-Akt pathway that are important in risk prediction for particular cancer types, is reliable.

Survival Analysis

Large-scale protein-protein post-translational modification extraction with distant supervision and confidence calibrated BioBERT

1 code implementation • 6 Jan 2022 • Aparna Elangovan, Yuan Li, Douglas E. V. Pires, Melissa J. Davis, Karin Verspoor

However, by combining high confidence and low variation to identify high quality predictions, tuning the predictions for precision, we retained 19% of the test predictions with 100% precision.

FFA-IR: Towards an Explainable and Reliable Medical Report Generation Benchmark

1 code implementation • Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 2) 2021 • Mingjie Li, Wenjia Cai, Rui Liu, Yuetian Weng, Xiaoyun Zhao, Cong Wang, Xin Chen, Zhong Liu, Caineng Pan, Mengke Li, Yizhi Liu, Flora D Salim, Karin Verspoor, Xiaodan Liang, Xiaojun Chang

Researchers have explored advanced methods from computer vision and natural language processing to incorporate medical domain knowledge for the generation of readable medical reports.

Medical Report Generation Text Generation

Impact of detecting clinical trial elements in exploration of COVID-19 literature

no code implementations • 25 May 2021 • Simon Šuster, Karin Verspoor, Timothy Baldwin, Jey Han Lau, Antonio Jimeno Yepes, David Martinez, Yulia Otmakhova

The COVID-19 pandemic has driven ever-greater demand for tools which enable efficient exploration of biomedical literature.

Efficient Exploration PICO +1

ChEMU-Ref: A Corpus for Modeling Anaphora Resolution in the Chemical Domain

1 code implementation • EACL 2021 • Biaoyan Fang, Christian Druckenbrodt, Saber A Akhondi, Jiayuan He, Timothy Baldwin, Karin Verspoor

Chemical patents contain rich coreference and bridging links, which are the target of this research.

Memorization vs. Generalization: Quantifying Data Leakage in NLP Performance Evaluation

no code implementations • EACL 2021 • Aparna Elangovan, Jiayuan He, Karin Verspoor

Public datasets are often used to evaluate the efficacy and generalizability of state-of-the-art methods for many tasks in natural language processing (NLP).

Memorization named-entity-recognition +3

Memorization vs. Generalization: Quantifying Data Leakage in NLP Performance Evaluation

1 code implementation • 3 Feb 2021 • Aparna Elangovan, Jiayuan He, Karin Verspoor

Public datasets are often used to evaluate the efficacy and generalizability of state-of-the-art methods for many tasks in natural language processing (NLP).

Memorization named-entity-recognition +3

Assigning function to protein-protein interactions: a weakly supervised BioBERT based approach using PubMed abstracts

no code implementations • 20 Aug 2020 • Aparna Elangovan, Melissa Davis, Karin Verspoor

Motivation: Protein-protein interactions (PPI) are critical to the function of proteins in both normal and diseased cells, and many critical protein functions are mediated by interactions. Knowledge of the nature of these interactions is important for the construction of networks to analyse biological data.

COVID-SEE: Scientific Evidence Explorer for COVID-19 Related Research

no code implementations • 18 Aug 2020 • Karin Verspoor, Simon Šuster, Yulia Otmakhova, Shevon Mendis, Zenan Zhai, Biaoyan Fang, Jey Han Lau, Timothy Baldwin, Antonio Jimeno Yepes, David Martinez

We present COVID-SEE, a system for medical literature discovery based on the concept of information exploration, which builds on several distinct text analysis and natural language processing methods to structure and organise information in publications, and augments search by providing a visual overview supporting exploration of a collection to identify key articles of interest.

Evaluating the Utility of Model Configurations and Data Augmentation on Clinical Semantic Textual Similarity

no code implementations • WS 2020 • Yuxia Wang, Fei Liu, Karin Verspoor, Timothy Baldwin

In this paper, we apply pre-trained language models to the Semantic Textual Similarity (STS) task, with a specific focus on the clinical domain.

Data Augmentation Semantic Textual Similarity +1

Domain Adaptation and Instance Selection for Disease Syndrome Classification over Veterinary Clinical Notes

no code implementations • WS 2020 • Brian Hur, Timothy Baldwin, Karin Verspoor, Laura Hardefeldt, James Gilkerson

Identifying the reasons for antibiotic administration in veterinary records is a critical component of understanding antimicrobial usage patterns.

Document Classification Domain Adaptation +1

WikiUMLS: Aligning UMLS to Wikipedia via Cross-lingual Neural Ranking

1 code implementation • COLING 2020 • Afshin Rahimi, Timothy Baldwin, Karin Verspoor

We present our work on aligning the Unified Medical Language System (UMLS) to Wikipedia, to facilitate manual alignment of the two resources.

SemEval-2017 Task 3: Community Question Answering

1 code implementation • SEMEVAL 2017 • Preslav Nakov, Doris Hoogeveen, Lluís Màrquez, Alessandro Moschitti, Hamdy Mubarak, Timothy Baldwin, Karin Verspoor

We describe SemEval-2017 Task 3 on Community Question Answering.

Community Question Answering Question Similarity

Findings of the WMT 2019 Biomedical Translation Shared Task: Evaluation for MEDLINE Abstracts and Biomedical Terminologies

no code implementations • WS 2019 • Rachel Bawden, Kevin Bretonnel Cohen, Cristian Grozea, Antonio Jimeno Yepes, Madeleine Kittner, Martin Krallinger, Nancy Mah, Aurelie Neveol, Mariana Neves, Felipe Soares, Amy Siu, Karin Verspoor, Maika Vicente Navarro

In the fourth edition of the WMT Biomedical Translation task, we considered a total of six languages, namely Chinese (zh), English (en), French (fr), German (de), Portuguese (pt), and Spanish (es).

Translation

Improving Chemical Named Entity Recognition in Patents with Contextualized Word Embeddings

1 code implementation • WS 2019 • Zenan Zhai, Dat Quoc Nguyen, Saber A. Akhondi, Camilo Thorne, Christian Druckenbrodt, Trevor Cohn, Michelle Gregory, Karin Verspoor

In this paper, we explore the NER performance of a BiLSTM-CRF model utilising pre-trained word embeddings, character-level word representations and contextualized ELMo word representations for chemical patents.

named-entity-recognition Named Entity Recognition +2

A bag-of-concepts model improves relation extraction in a narrow knowledge domain with limited data

no code implementations • NAACL 2019 • Jiyu Chen, Karin Verspoor, Zenan Zhai

This paper focuses on a traditional relation extraction task in the context of limited annotated data and a narrow knowledge domain.

Feature Engineering Relation +2

Detecting Chemical Reactions in Patents

no code implementations • ALTA 2019 • Hiyori Yoshikawa, Dat Quoc Nguyen, Zenan Zhai, Christian Druckenbrodt, Camilo Thorne, Saber A. Akhondi, Timothy Baldwin, Karin Verspoor

Extracting chemical reactions from patents is a crucial task for chemists working on chemical exploration.

End-to-end neural relation extraction using deep biaffine attention

1 code implementation • 29 Dec 2018 • Dat Quoc Nguyen, Karin Verspoor

We propose a neural network model for joint extraction of named entities and relations between them, without any hand-crafted features.

Ranked #5 on Relation Extraction on CoNLL04

General Classification Relation +1

Proceedings of the Third Conference on Machine Translation: Shared Task Papers

no code implementations • EMNLP 2018 • Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Mark Fishel, Yvette Graham, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, Christof Monz, Matteo Negri, Aurélie Névéol, Mariana Neves, Matt Post, Lucia Specia, Marco Turchi, Karin Verspoor

Machine Translation Translation

Findings of the WMT 2018 Biomedical Translation Shared Task: Evaluation on Medline test sets

no code implementations • WS 2018 • Mariana Neves, Antonio Jimeno Yepes, Aurélie Névéol, Cristian Grozea, Amy Siu, Madeleine Kittner, Karin Verspoor

Machine translation enables the automatic translation of textual documents between languages and can facilitate access to information only available in a given language for non-speakers of this language, e.g., research results presented in scientific publications.

Machine Translation Translation

Comparing CNN and LSTM character-level embeddings in BiLSTM-CRF models for chemical and disease named entity recognition

no code implementations • WS 2018 • Zenan Zhai, Dat Quoc Nguyen, Karin Verspoor

We compare the use of LSTM-based and CNN-based character-level word embeddings in BiLSTM-CRF models to approach chemical and disease named entity recognition (NER) tasks.

named-entity-recognition Named Entity Recognition +2

From POS tagging to dependency parsing for biomedical event extraction

2 code implementations • 11 Aug 2018 • Dat Quoc Nguyen, Karin Verspoor

Results: We perform an empirical study comparing state-of-the-art traditional feature-based and neural network-based models for two core natural language processing tasks of part-of-speech (POS) tagging and dependency parsing on two benchmark biomedical corpora, GENIA and CRAFT.

Ranked #1 on Dependency Parsing on GENIA - LAS

Dependency Parsing Event Extraction +3

An improved neural network model for joint POS tagging and dependency parsing

1 code implementation • CONLL 2018 • Dat Quoc Nguyen, Karin Verspoor

We propose a novel neural network model for joint part-of-speech (POS) tagging and dependency parsing.

Ranked #15 on Dependency Parsing on Penn Treebank

Dependency Parsing Event Extraction +3

Convolutional neural networks for chemical-disease relation extraction are improved with character-based word embeddings

no code implementations • WS 2018 • Dat Quoc Nguyen, Karin Verspoor

We investigate the incorporation of character-based word representations into a standard CNN-based relation extraction model.

Relation Relation Extraction +1

Parallel Corpora for the Biomedical Domain

1 code implementation • LREC 2018 • Aurélie Névéol, Antonio Jimeno Yepes, Mariana Neves, Karin Verspoor

Information Retrieval Machine Translation

Automatic Negation and Speculation Detection in Veterinary Clinical Text

no code implementations • ALTA 2017 • Katherine Cheng, Timothy Baldwin, Karin Verspoor

Negation Speculation Detection

Findings of the WMT 2017 Biomedical Translation Shared Task

no code implementations • WS 2017 • Antonio Jimeno Yepes, Aurélie Névéol, Mariana Neves, Karin Verspoor, Ondřej Bojar, Arthur Boyer, Cristian Grozea, Barry Haddow, Madeleine Kittner, Yvonne Lichtblau, Pavel Pecina, Roland Roller, Rudolf Rosa, Amy Siu, Philippe Thomas, Saskia Trescher

Machine Translation Translation

ASM Kernel: Graph Kernel using Approximate Subgraph Matching for Relation Extraction

no code implementations • ALTA 2016 • Nagesh C. Panyam, Karin Verspoor, Trevor Cohn, Rao Kotagiri

Feature Engineering General Classification +3

Syndromic Surveillance through Measuring Lexical Shift in Emergency Department Chief Complaint Texts

no code implementations • ALTA 2016 • Hafsah Aamer, Bahadorreza Ofoghi, Karin Verspoor

SeeDev Binary Event Extraction using SVMs and a Rich Feature Set

1 code implementation • WS 2016 • Nagesh C. Panyam, Gitansh Khirbat, Karin Verspoor, Trevor Cohn, Kotagiri Ramamohanarao

Event Extraction Relation Extraction

Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers

no code implementations • WS 2016 • Ondřej Bojar, Christian Buck, Rajen Chatterjee, Christian Federmann, Liane Guillou, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Aurélie Névéol, Mariana Neves, Pavel Pecina, Martin Popel, Philipp Koehn, Christof Monz, Matteo Negri, Matt Post, Lucia Specia, Karin Verspoor, Jörg Tiedemann, Marco Turchi

Machine Translation Translation

Findings of the 2016 Conference on Machine Translation

no code implementations • WS 2016 • Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Yvette Graham, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, Varvara Logacheva, Christof Monz, Matteo Negri, Aurélie Névéol, Mariana Neves, Martin Popel, Matt Post, Raphael Rubino, Carolina Scarton, Lucia Specia, Marco Turchi, Karin Verspoor, Marcos Zampieri

Automatic Post-Editing Multimodal Machine Translation +1

Rev at SemEval-2016 Task 2: Aligning Chunks by Lexical, Part of Speech and Semantic Equivalence

no code implementations • SEMEVAL 2016 • Ping Tan, Karin Verspoor, Timothy Miller

Semantic Textual Similarity Task 2

Adjusting for Chance Clustering Comparison Measures

no code implementations • 3 Dec 2015 • Simone Romano, Nguyen Xuan Vinh, James Bailey, Karin Verspoor

In particular, the Adjusted Rand Index (ARI) based on pair-counting, and the Adjusted Mutual Information (AMI) based on Shannon information theory are very popular in the clustering community.

Clustering

Structural Alignment as the Basis to Improve Significant Change Detection in Versioned Sentences

no code implementations • ALTA 2015 • Ping Ping Tan, Karin Verspoor, Tim Miller

Change Detection

Domain Adaption of Named Entity Recognition to Support Credit Risk Assessment

no code implementations • ALTA 2015 • Julio Cesar Salinas Alvarado, Karin Verspoor, Timothy Baldwin

Domain Adaptation named-entity-recognition +2

A Framework to Adjust Dependency Measure Estimates for Chance

no code implementations • 27 Oct 2015 • Simone Romano, Nguyen Xuan Vinh, James Bailey, Karin Verspoor

For example: non-linear dependencies between two continuous variables can be explored with the Maximal Information Coefficient (MIC); and categorical variables that are dependent on the target class are selected using Gini gain in random forests.