no code implementations • 27 Oct 2015 • Simone Romano, Nguyen Xuan Vinh, James Bailey, Karin Verspoor
For example: non-linear dependencies between two continuous variables can be explored with the Maximal Information Coefficient (MIC); and categorical variables that are dependent on the target class are selected using Gini gain in random forests.
no code implementations • 3 Dec 2015 • Simone Romano, Nguyen Xuan Vinh, James Bailey, Karin Verspoor
In particular, the Adjusted Rand Index (ARI) based on pair-counting, and the Adjusted Mutual Information (AMI) based on Shannon information theory are very popular in the clustering community.
no code implementations • WS 2016 • Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Yvette Graham, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, Varvara Logacheva, Christof Monz, Matteo Negri, Aurélie Névéol, Mariana Neves, Martin Popel, Matt Post, Raphael Rubino, Carolina Scarton, Lucia Specia, Marco Turchi, Karin Verspoor, Marcos Zampieri
no code implementations • WS 2016 • Ondřej Bojar, Christian Buck, Rajen Chatterjee, Christian Federmann, Liane Guillou, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Aurélie Névéol, Mariana Neves, Pavel Pecina, Martin Popel, Philipp Koehn, Christof Monz, Matteo Negri, Matt Post, Lucia Specia, Karin Verspoor, Jörg Tiedemann, Marco Turchi
no code implementations • WS 2017 • Antonio Jimeno Yepes, Aurélie Névéol, Mariana Neves, Karin Verspoor, Ondřej Bojar, Arthur Boyer, Cristian Grozea, Barry Haddow, Madeleine Kittner, Yvonne Lichtblau, Pavel Pecina, Rol Roller, Rudolf Rosa, Amy Siu, Philippe Thomas, Saskia Trescher
no code implementations • WS 2018 • Dat Quoc Nguyen, Karin Verspoor
We investigate the incorporation of character-based word representations into a standard CNN-based relation extraction model.
1 code implementation • CONLL 2018 • Dat Quoc Nguyen, Karin Verspoor
We propose a novel neural network model for joint part-of-speech (POS) tagging and dependency parsing.
Ranked #15 on Dependency Parsing on Penn Treebank
2 code implementations • 11 Aug 2018 • Dat Quoc Nguyen, Karin Verspoor
Results: We perform an empirical study comparing state-of-the-art traditional feature-based and neural network-based models for two core natural language processing tasks of part-of-speech (POS) tagging and dependency parsing on two benchmark biomedical corpora, GENIA and CRAFT.
Ranked #1 on Dependency Parsing on GENIA - LAS
no code implementations • WS 2018 • Zenan Zhai, Dat Quoc Nguyen, Karin Verspoor
We compare the use of LSTM-based and CNN-based character-level word embeddings in BiLSTM-CRF models to approach chemical and disease named entity recognition (NER) tasks.
no code implementations • WS 2018 • Mariana Neves, Antonio Jimeno Yepes, Aurélie Névéol, Cristian Grozea, Amy Siu, Madeleine Kittner, Karin Verspoor
Machine translation enables the automatic translation of textual documents between languages and can facilitate access to information only available in a given language for non-speakers of this language, e.g. research results presented in scientific publications.
no code implementations • EMNLP 2018 • Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Mark Fishel, Yvette Graham, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, Christof Monz, Matteo Negri, Aurélie Névéol, Mariana Neves, Matt Post, Lucia Specia, Marco Turchi, Karin Verspoor
1 code implementation • 29 Dec 2018 • Dat Quoc Nguyen, Karin Verspoor
We propose a neural network model for joint extraction of named entities and relations between them, without any hand-crafted features.
Ranked #5 on Relation Extraction on CoNLL04
no code implementations • ALTA 2019 • Hiyori Yoshikawa, Dat Quoc Nguyen, Zenan Zhai, Christian Druckenbrodt, Camilo Thorne, Saber A. Akhondi, Timothy Baldwin, Karin Verspoor
Extracting chemical reactions from patents is a crucial task for chemists working on chemical exploration.
no code implementations • NAACL 2019 • Jiyu Chen, Karin Verspoor, Zenan Zhai
This paper focuses on a traditional relation extraction task in the context of limited annotated data and a narrow knowledge domain.
1 code implementation • WS 2019 • Zenan Zhai, Dat Quoc Nguyen, Saber A. Akhondi, Camilo Thorne, Christian Druckenbrodt, Trevor Cohn, Michelle Gregory, Karin Verspoor
In this paper, we explore the NER performance of a BiLSTM-CRF model utilising pre-trained word embeddings, character-level word representations and contextualized ELMo word representations for chemical patents.
no code implementations • WS 2019 • Rachel Bawden, Kevin Bretonnel Cohen, Cristian Grozea, Antonio Jimeno Yepes, Madeleine Kittner, Martin Krallinger, Nancy Mah, Aurelie Neveol, Mariana Neves, Felipe Soares, Amy Siu, Karin Verspoor, Maika Vicente Navarro
In the fourth edition of the WMT Biomedical Translation task, we considered a total of six languages, namely Chinese (zh), English (en), French (fr), German (de), Portuguese (pt), and Spanish (es).
1 code implementation • SEMEVAL 2017 • Preslav Nakov, Doris Hoogeveen, Lluís Màrquez, Alessandro Moschitti, Hamdy Mubarak, Timothy Baldwin, Karin Verspoor
We describe SemEval-2017 Task 3 on Community Question Answering.
1 code implementation • COLING 2020 • Afshin Rahimi, Timothy Baldwin, Karin Verspoor
We present our work on aligning the Unified Medical Language System (UMLS) to Wikipedia, to facilitate manual alignment of the two resources.
no code implementations • WS 2020 • Yuxia Wang, Fei Liu, Karin Verspoor, Timothy Baldwin
In this paper, we apply pre-trained language models to the Semantic Textual Similarity (STS) task, with a specific focus on the clinical domain.
no code implementations • WS 2020 • Brian Hur, Timothy Baldwin, Karin Verspoor, Laura Hardefeldt, James Gilkerson
Identifying the reasons for antibiotic administration in veterinary records is a critical component of understanding antimicrobial usage patterns.
no code implementations • 18 Aug 2020 • Karin Verspoor, Simon Šuster, Yulia Otmakhova, Shevon Mendis, Zenan Zhai, Biaoyan Fang, Jey Han Lau, Timothy Baldwin, Antonio Jimeno Yepes, David Martinez
We present COVID-SEE, a system for medical literature discovery based on the concept of information exploration, which builds on several distinct text analysis and natural language processing methods to structure and organise information in publications, and augments search by providing a visual overview supporting exploration of a collection to identify key articles of interest.
no code implementations • 20 Aug 2020 • Aparna Elangovan, Melissa Davis, Karin Verspoor
Motivation: Protein-protein interactions (PPI) are critical to the function of proteins in both normal and diseased cells, and many critical protein functions are mediated by interactions. Knowledge of the nature of these interactions is important for the construction of networks to analyse biological data.
1 code implementation • 3 Feb 2021 • Aparna Elangovan, Jiayuan He, Karin Verspoor
Public datasets are often used to evaluate the efficacy and generalizability of state-of-the-art methods for many tasks in natural language processing (NLP).
no code implementations • EACL 2021 • Aparna Elangovan, Jiayuan He, Karin Verspoor
Public datasets are often used to evaluate the efficacy and generalizability of state-of-the-art methods for many tasks in natural language processing (NLP).
1 code implementation • EACL 2021 • Biaoyan Fang, Christian Druckenbrodt, Saber A Akhondi, Jiayuan He, Timothy Baldwin, Karin Verspoor
Chemical patents contain rich coreference and bridging links, which are the target of this research.
no code implementations • 25 May 2021 • Simon Šuster, Karin Verspoor, Timothy Baldwin, Jey Han Lau, Antonio Jimeno Yepes, David Martinez, Yulia Otmakhova
The COVID-19 pandemic has driven ever-greater demand for tools which enable efficient exploration of biomedical literature.
1 code implementation • Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 2) 2021 • Mingjie Li, Wenjia Cai, Rui Liu, Yuetian Weng, Xiaoyun Zhao, Cong Wang, Xin Chen, Zhong Liu, Caineng Pan, Mengke Li, Yizhi Liu, Flora D Salim, Karin Verspoor, Xiaodan Liang, Xiaojun Chang
Researchers have explored advanced methods from computer vision and natural language processing to incorporate medical domain knowledge for the generation of readable medical reports.
1 code implementation • 6 Jan 2022 • Aparna Elangovan, Yuan Li, Douglas E. V. Pires, Melissa J. Davis, Karin Verspoor
However, by combining high confidence and low variation to identify high quality predictions, tuning the predictions for precision, we retained 19% of the test predictions with 100% precision.
1 code implementation • 2 Feb 2022 • Gourab Ghosh Roy, Nicholas Geard, Karin Verspoor, Shan He
We show that trained MPVNN architecture interpretation, which points to smaller sets of genes connected by signal flow within the PI3K-Akt pathway that are important in risk prediction for particular cancer types, is reliable.
no code implementations • 16 Feb 2022 • Thinh Hung Truong, Yulia Otmakhova, Rahmad Mahendra, Timothy Baldwin, Jey Han Lau, Trevor Cohn, Lawrence Cavedon, Damiano Spina, Karin Verspoor
This paper describes the submissions of the Natural Language Processing (NLP) team from the Australian Research Council Industrial Transformation Training Centre (ITTC) for Cognitive Computing in Medical Technologies to the TREC 2021 Clinical Trials Track.
no code implementations • NAACL 2022 • Thinh Hung Truong, Timothy Baldwin, Trevor Cohn, Karin Verspoor
Negation is a common linguistic feature that is crucial in many language understanding tasks, yet it remains a hard problem due to diversity in its expression in different types of text.
no code implementations • CVPR 2022 • Mingjie Li, Wenjia Cai, Karin Verspoor, Shirui Pan, Xiaodan Liang, Xiaojun Chang
To endow models with the capability of incorporating expert knowledge, we propose a Cross-modal clinical Graph Transformer (CGT) for ophthalmic report generation (ORG), in which clinical relation triples are injected into the visual features as prior knowledge to drive the decoding procedure.
2 code implementations • sdp (COLING) 2022 • Yulia Otmakhova, Hung Thinh Truong, Timothy Baldwin, Trevor Cohn, Karin Verspoor, Jey Han Lau
In this paper we report on our submission to the Multidocument Summarisation for Literature Review (MSLR) shared task.
1 code implementation • 6 Oct 2022 • Thinh Hung Truong, Yulia Otmakhova, Timothy Baldwin, Trevor Cohn, Jey Han Lau, Karin Verspoor
Negation is poorly captured by current language models, although the extent of this problem is not widely understood.
no code implementations • 26 Jan 2023 • Jinghui Liu, Daniel Capurro, Anthony Nguyen, Karin Verspoor
In this study, we propose to treat this neglected text as privileged information available during training to enhance early prediction modeling through knowledge distillation, presented as Learning using Privileged tIme-sEries Text (LuPIET).
1 code implementation • 14 Jun 2023 • Thinh Hung Truong, Timothy Baldwin, Karin Verspoor, Trevor Cohn
Negation has been shown to be a major bottleneck for masked language models, such as BERT.
1 code implementation • 8 Aug 2023 • Yuxia Wang, Shimin Tao, Ning Xie, Hao Yang, Timothy Baldwin, Karin Verspoor
Despite the subjective nature of semantic textual similarity (STS) and pervasive disagreements in STS annotation, existing benchmarks have used averaged human ratings as the gold standard.
no code implementations • 12 Oct 2023 • Aparna Elangovan, Jiayuan He, Yuan Li, Karin Verspoor
BERT-based models have had strong performance on leaderboards, yet have been demonstrably worse in real-world settings requiring generalization.
no code implementations • 7 Nov 2023 • Aparna Elangovan, Jiayuan He, Yuan Li, Karin Verspoor
In clinical research, generalizability depends on (a) internal validity of experiments to ensure controlled measurement of cause and effect, and (b) external validity or transportability of the results to the wider population.
no code implementations • 15 Jan 2024 • Mingjie Li, Karin Verspoor
Information extraction techniques, including named entity recognition (NER) and relation extraction (RE), are crucial in many domains to support making sense of vast amounts of unstructured text data by identifying and connecting relevant information.
no code implementations • 6 Feb 2024 • Huiling Tu, Shuo Yu, Vidya Saikrishna, Feng Xia, Karin Verspoor
Knowledge graphs (KGs) have garnered significant attention for their vast potential across diverse domains.
no code implementations • SMM4H (COLING) 2022 • Antonio Jimeno Yepes, Karin Verspoor
We describe the work of the READ-BioMed team for the preparation of a submission to the SocialDisNER Disease Named Entity Recognition (NER) Task (Task 10) in 2022.
no code implementations • EMNLP (ClinicalNLP) 2020 • Yuxia Wang, Karin Verspoor, Timothy Baldwin
Domain pretraining followed by task fine-tuning has become the standard paradigm for NLP tasks, but requires in-domain labelled data for task fine-tuning.
no code implementations • EMNLP (NLP-COVID19) 2020 • Yulia Otmakhova, Karin Verspoor, Timothy Baldwin, Simon Šuster
Efficient discovery and exploration of biomedical literature has grown in importance in the context of the COVID-19 pandemic, and topic-based methods such as latent Dirichlet allocation (LDA) are a useful tool for this purpose.
no code implementations • ALTA 2021 • Antonio Jimeno Yepes, Ameer Albahem, Karin Verspoor
In developing systems to identify focus entities in scientific literature, we face the problem of discriminating key entities of interest from other potentially relevant entities of the same type mentioned in the articles.
1 code implementation • Findings (ACL) 2022 • Biaoyan Fang, Timothy Baldwin, Karin Verspoor
Procedural text contains rich anaphoric phenomena, yet has not received much attention in NLP.
1 code implementation • COLING 2022 • Yuxia Wang, Timothy Baldwin, Karin Verspoor
Training with noisy labelled data is known to be detrimental to model performance, especially for high-capacity neural network models in low-resource domains.
no code implementations • ACL 2022 • Yulia Otmakhova, Karin Verspoor, Timothy Baldwin, Jey Han Lau
Although multi-document summarisation (MDS) of the biomedical literature is a highly valuable task that has recently attracted substantial interest, evaluation of the quality of biomedical summaries lacks consistency and transparency.
no code implementations • NAACL (SIGTYP) 2022 • Yulia Otmakhova, Karin Verspoor, Jey Han Lau
Though there has recently been increased interest in how pre-trained language models encode different linguistic features, there is still a lack of systematic comparison between languages with different morphology and syntax.