Search Results for author: Goran Nenadic

Found 34 papers, 18 papers with code

CantonMT: Cantonese to English NMT Platform with Fine-Tuned Models Using Synthetic Back-Translation Data

1 code implementation • 17 Mar 2024 • Kung Yin Hong, Lifeng Han, Riza Batista-Navarro, Goran Nenadic

We present the models we fine-tuned using the limited amount of real data and the synthetic data we generated using back-translation including OpusMT, NLLB, and mBART.

Data Augmentation Machine Translation +2

Neural Machine Translation of Clinical Text: An Empirical Investigation into Multilingual Pre-Trained Language Models and Transfer-Learning

1 code implementation • 12 Dec 2023 • Lifeng Han, Serge Gladkoff, Gleb Erofeev, Irina Sorokina, Betty Galiano, Goran Nenadic

Furthermore, to address the language resource imbalance issue, we also carry out experiments using a transfer learning methodology based on massive multilingual pre-trained language models (MMPLMs).

Clinical Knowledge Language Modelling +3

Exploring the Consistency, Quality and Challenges in Manual and Automated Coding of Free-text Diagnoses from Hospital Outpatient Letters

no code implementations • 17 Nov 2023 • Warren Del-Pinto, George Demetriou, Meghna Jani, Rikesh Patel, Leanne Gray, Alex Bulcock, Niels Peek, Andrew S. Kanter, William G Dixon, Goran Nenadic

A gold standard was constructed by a panel of clinicians from a subset of the annotated diagnoses.

Generating Medical Prescriptions with Conditional Transformer

1 code implementation • 30 Oct 2023 • Samuel Belkadi, Nicolo Micheletti, Lifeng Han, Warren Del-Pinto, Goran Nenadic

LT3 is trained on a set of around 2K lines of medication prescriptions extracted from the MIMIC-III database, allowing the model to produce valuable synthetic medication prescriptions.

2k Language Modelling +3

Extraction of Medication and Temporal Relation from Clinical Text using Neural Language Models

no code implementations • 3 Oct 2023 • Hangyu Tu, Lifeng Han, Goran Nenadic

Furthermore, we also designed a set of post-processing rules to generate structured output on medications and the temporal relation.

Avg Disease Prediction +5

Investigating Large Language Models and Control Mechanisms to Improve Text Readability of Biomedical Abstracts

1 code implementation • 22 Sep 2023 • Zihao Li, Samuel Belkadi, Nicolo Micheletti, Lifeng Han, Matthew Shardlow, Goran Nenadic

In this work, we investigate the ability of state-of-the-art large language models (LLMs) on the task of biomedical abstract simplification, using the publicly available dataset for plain language adaptation of biomedical abstracts (PLABA).

Text Simplification

MC-DRE: Multi-Aspect Cross Integration for Drug Event/Entity Extraction

1 code implementation • 12 Aug 2023 • Jie Yang, Soyeon Caren Han, Siqu Long, Josiah Poon, Goran Nenadic

Extracting meaningful drug-related information chunks, such as adverse drug events (ADE), is crucial for preventing morbidity and saving many lives.

Event Detection Event Extraction +4

MedMine: Examining Pre-trained Language Models on Medication Mining

1 code implementation • 7 Aug 2023 • Haifa Alrdahi, Lifeng Han, Hendrik Šuvalov, Goran Nenadic

Automatic medication mining from clinical and biomedical text has become a popular topic due to its real impact on healthcare applications and the recent development of powerful language models (LMs).

Data Augmentation Ensemble Learning +2

Predictive Data Analytics with AI: assessing the need for post-editing of MT output by fine-tuning OpenAI LLMs

no code implementations • 31 Jul 2023 • Serge Gladkoff, Gleb Erofeev, Irina Sorokina, Lifeng Han, Goran Nenadic

Translation Quality Evaluation (TQE) is an essential step of the modern translation production process.

Binary Classification Machine Translation +1

PULSAR at MEDIQA-Sum 2023: Large Language Models Augmented by Synthetic Dialogue Convert Patient Dialogues to Medical Records

1 code implementation • 5 Jul 2023 • Viktor Schlegel, Hao Li, Yuping Wu, Anand Subramanian, Thanh-Tung Nguyen, Abhinav Ramesh Kashyap, Daniel Beck, Xiaojun Zeng, Riza Theresa Batista-Navarro, Stefan Winkler, Goran Nenadic

This paper describes PULSAR, our system submission at the ImageClef 2023 MediQA-Sum task on summarising patient-doctor dialogues into clinical records.

Data Augmentation Language Modelling

PULSAR: Pre-training with Extracted Healthcare Terms for Summarising Patients' Problems and Data Augmentation with Black-box Large Language Models

1 code implementation • 5 Jun 2023 • Hao Li, Yuping Wu, Viktor Schlegel, Riza Batista-Navarro, Thanh-Tung Nguyen, Abhinav Ramesh Kashyap, Xiaojun Zeng, Daniel Beck, Stefan Winkler, Goran Nenadic

Medical progress notes play a crucial role in documenting a patient's hospital journey, including his or her condition, treatment plan, and any updates for healthcare providers.

Data Augmentation

Do You Hear The People Sing? Key Point Analysis via Iterative Clustering and Abstractive Summarisation

no code implementations • 25 May 2023 • Hao Li, Viktor Schlegel, Riza Batista-Navarro, Goran Nenadic

Furthermore, evaluating key points is crucial in ensuring that the automatically generated summaries are useful.

Sentence

Student's t-Distribution: On Measuring the Inter-Rater Reliability When the Observations are Scarce

no code implementations • 8 Mar 2023 • Serge Gladkoff, Lifeng Han, Goran Nenadic

Then, this leads to our example with two human-generated observational scores, for which we introduce the "Student's t-Distribution" method and explain how to use it to measure the IRR score using only these two data points, as well as the confidence intervals (CIs) of the quality evaluation.

Translation

Topic Modelling of Swedish Newspaper Articles about Coronavirus: a Case Study using Latent Dirichlet Allocation Method

2 code implementations • 8 Jan 2023 • Bernadeta Griciūtė, Lifeng Han, Goran Nenadic

In this study, from the social-media and healthcare domain, we apply popular Latent Dirichlet Allocation (LDA) methods to model the topic changes in Swedish newspaper articles about Coronavirus.

Natural Language Understanding

Exploring the Value of Pre-trained Language Models for Clinical Named Entity Recognition

2 code implementations • 23 Oct 2022 • Samuel Belkadi, Lifeng Han, Yuping Wu, Goran Nenadic

The experimental outcomes show that 1) CRF layers improved all language models; 2) referring to BIO-strict span level evaluation using macro-average F1 score, although the fine-tuned LLMs achieved 0.83+ scores, the TransformerCRF model trained from scratch achieved 0.78+, demonstrating comparable performances with much lower cost - e.g. with 39.80% less training parameters; 3) referring to BIO-strict span-level evaluation using weighted-average F1 score, ClinicalBERT-CRF, BERT-CRF, and TransformerCRF exhibited lower score differences, with 97.59%/97.44%/96.84% respectively.

Language Modelling named-entity-recognition +1

Investigating Massive Multilingual Pre-Trained Machine Translation Models for Clinical Domain via Transfer Learning

no code implementations • 12 Oct 2022 • Lifeng Han, Gleb Erofeev, Irina Sorokina, Serge Gladkoff, Goran Nenadic

To the best of our knowledge, this is the first work on using MMPLMs towards clinical domain transfer-learning NMT successfully for totally unseen languages during pre-training.

Machine Translation NMT +3

EDU-level Extractive Summarization with Varying Summary Lengths

1 code implementation • 8 Oct 2022 • Yuping Wu, Ching-Hsun Tseng, Jiayu Shang, Shengzhong Mao, Goran Nenadic, Xiao-jun Zeng

To fill these gaps, this paper first conducts the comparison analysis of oracle summaries based on EDUs and sentences, which provides evidence from both theoretical and experimental perspectives to justify and quantify that EDUs make summaries with higher automatic evaluation scores than sentences.

Extractive Summarization Text Summarization

Examining Large Pre-Trained Language Models for Machine Translation: What You Don't Know About It

no code implementations • 15 Sep 2022 • Lifeng Han, Gleb Erofeev, Irina Sorokina, Serge Gladkoff, Goran Nenadic

Pre-trained language models (PLMs) often take advantage of the monolingual and multilingual dataset that is freely available online to acquire general or mixed domain knowledge before deployment into specific tasks.

Machine Translation

Semantics Altering Modifications for Evaluating Comprehension in Machine Reading

1 code implementation • 7 Dec 2020 • Viktor Schlegel, Goran Nenadic, Riza Batista-Navarro

Advances in NLP have yielded impressive results for the task of machine reading comprehension (MRC), with approaches having been reported to achieve performance comparable to that of humans.

Machine Reading Comprehension Sentence

An efficient representation of chronological events in medical texts

no code implementations • EMNLP (Louhi) 2020 • Andrey Kormilitzin, Nemanja Vaci, Qiang Liu, Hao Ni, Goran Nenadic, Alejo Nevado-Holgado

In this work we addressed the problem of capturing sequential information contained in longitudinal electronic health records (EHRs).

Beyond Leaderboards: A survey of methods for revealing weaknesses in Natural Language Inference data and models

no code implementations • 29 May 2020 • Viktor Schlegel, Goran Nenadic, Riza Batista-Navarro

Recent years have seen a growing number of publications that analyse Natural Language Inference (NLI) datasets for superficial cues, whether they undermine the complexity of the tasks underlying those datasets and how they impact those models that are optimised and evaluated on this data.

Natural Language Inference

MASK: A flexible framework to facilitate de-identification of clinical texts

1 code implementation • 24 May 2020 • Nikola Milosevic, Gangamma Kalappa, Hesam Dadafarin, Mahmoud Azimaee, Goran Nenadic

The software is able to perform named entity recognition using some of the state-of-the-art techniques and then mask or redact recognized entities.

Ranked #1 on Named Entity Recognition (NER) on i2b2 De-identification Dataset

De-identification named-entity-recognition +2

A Framework for Evaluation of Machine Reading Comprehension Gold Standards

1 code implementation • LREC 2020 • Viktor Schlegel, Marco Valentino, André Freitas, Goran Nenadic, Riza Batista-Navarro

Machine Reading Comprehension (MRC) is the task of answering a question over a paragraph of text.

Machine Reading Comprehension

GNTeam at 2018 n2c2: Feature-augmented BiLSTM-CRF for drug-related entity recognition in hospital discharge summaries

1 code implementation • 23 Sep 2019 • Maksim Belousov, Nikola Milosevic, Ghada Alfattni, Haifa Alrdahi, Goran Nenadic

The recurrent neural networks that use the pre-trained domain-specific word embeddings and a CRF layer for label optimization perform drug, adverse event and related entities extraction with micro-averaged F1-score of over 91%.

Entity Extraction using GAN named-entity-recognition +3

MedNorm: A Corpus and Embeddings for Cross-terminology Medical Concept Normalisation

1 code implementation • WS 2019 • Maksim Belousov, William G. Dixon, Goran Nenadic

The medical concept normalisation task aims to map textual descriptions to standard terminologies such as SNOMED-CT or MedDRA.

Representation Learning

Extracting adverse drug reactions and their context using sequence labelling ensembles in TAC2017

no code implementations • 28 May 2019 • Maksim Belousov, Nikola Milosevic, William Dixon, Goran Nenadic

Adverse drug reactions (ADRs) are unwanted or harmful effects experienced after the administration of a certain drug or a combination of drugs, presenting a challenge for drug development and drug administration.

From web crawled text to project descriptions: automatic summarizing of social innovation projects

no code implementations • 22 May 2019 • Nikola Milosevic, Dimitar Marinov, Abdullah Gok, Goran Nenadic

In the past decade, social innovation projects have gained the attention of policy makers, as they address important social issues in an innovative manner.

A framework for information extraction from tables in biomedical literature

1 code implementation • 26 Feb 2019 • Nikola Milosevic, Cassie Gregson, Robert Hernandez, Goran Nenadic

The scientific literature is growing exponentially, and professionals are no longer able to cope with the current amount of publications.

Table Detection

Creating a contemporary corpus of similes in Serbian by using natural language processing

no code implementations • 22 Nov 2018 • Nikola Milosevic, Goran Nenadic

Simile is a figure of speech that compares two things through the use of connection words, but where comparison is not intended to be taken literally.

Inferring Methodological Meta-knowledge from Large Biomedical Corpora

no code implementations • PACLIC 2016 • Goran Nenadic

Temporal Information Extraction

As Cool as a Cucumber: Towards a Corpus of Contemporary Similes in Serbian

1 code implementation • 20 May 2016 • Nikola Milosevic, Goran Nenadic

Similes are natural language expressions used to compare unlikely things, where the comparison is not taken literally.

Mining temporal footprints from Wikipedia

no code implementations • WS 2014 • Michele Filannino, Goran Nenadic

Question Answering Temporal Information Extraction

ManTIME: Temporal expression identification and normalization in the TempEval-3 challenge

no code implementations • SEMEVAL 2013 • Michele Filannino, Gavin Brown, Goran Nenadic

This paper describes a temporal expression identification and normalization system, ManTIME, developed for the TempEval-3 challenge.

Attribute

LINNAEUS: A species name identification system for biomedical literature

no code implementations • BMC Bioinformatics 2010 • Martin Gerner, Goran Nenadic, Casey M Bergman

In this paper we describe an open-source species name recognition and normalization software system, LINNAEUS, and evaluate its performance relative to several automatically generated biomedical corpora, as well as a novel corpus of full-text documents manually annotated for species mentions.

Named Entity Recognition
