Search Results for author: Goran Nenadic

Found 34 papers, 18 papers with code

CantonMT: Cantonese to English NMT Platform with Fine-Tuned Models Using Synthetic Back-Translation Data

1 code implementation • 17 Mar 2024 • Kung Yin Hong, Lifeng Han, Riza Batista-Navarro, Goran Nenadic

We present the models we fine-tuned using the limited amount of real data and the synthetic data we generated using back-translation including OpusMT, NLLB, and mBART.

Data Augmentation Machine Translation +2

Neural Machine Translation of Clinical Text: An Empirical Investigation into Multilingual Pre-Trained Language Models and Transfer-Learning

1 code implementation • 12 Dec 2023 • Lifeng Han, Serge Gladkoff, Gleb Erofeev, Irina Sorokina, Betty Galiano, Goran Nenadic

Furthermore, to address the language resource imbalance issue, we also carry out experiments using a transfer learning methodology based on massive multilingual pre-trained language models (MMPLMs).

Clinical Knowledge Language Modelling +3

Generating Medical Prescriptions with Conditional Transformer

1 code implementation • 30 Oct 2023 • Samuel Belkadi, Nicolo Micheletti, Lifeng Han, Warren Del-Pinto, Goran Nenadic

LT3 is trained on a set of around 2K lines of medication prescriptions extracted from the MIMIC-III database, allowing the model to produce valuable synthetic medication prescriptions.

2k Language Modelling +3

Extraction of Medication and Temporal Relation from Clinical Text using Neural Language Models

no code implementations • 3 Oct 2023 • Hangyu Tu, Lifeng Han, Goran Nenadic

Furthermore, we also designed a set of post-processing rules to generate structured output on medications and the temporal relation.

Avg Disease Prediction +5

Investigating Large Language Models and Control Mechanisms to Improve Text Readability of Biomedical Abstracts

1 code implementation • 22 Sep 2023 • Zihao Li, Samuel Belkadi, Nicolo Micheletti, Lifeng Han, Matthew Shardlow, Goran Nenadic

In this work, we investigate the ability of state-of-the-art large language models (LLMs) on the task of biomedical abstract simplification, using the publicly available dataset for plain language adaptation of biomedical abstracts (PLABA).

Text Simplification

MC-DRE: Multi-Aspect Cross Integration for Drug Event/Entity Extraction

1 code implementation • 12 Aug 2023 • Jie Yang, Soyeon Caren Han, Siqu Long, Josiah Poon, Goran Nenadic

Extracting meaningful drug-related information chunks, such as adverse drug events (ADE), is crucial for preventing morbidity and saving many lives.

Event Detection Event Extraction +4

MedMine: Examining Pre-trained Language Models on Medication Mining

1 code implementation • 7 Aug 2023 • Haifa Alrdahi, Lifeng Han, Hendrik Šuvalov, Goran Nenadic

Automatic medication mining from clinical and biomedical text has become a popular topic due to its real impact on healthcare applications and the recent development of powerful language models (LMs).

Data Augmentation Ensemble Learning +2

Do You Hear The People Sing? Key Point Analysis via Iterative Clustering and Abstractive Summarisation

no code implementations • 25 May 2023 • Hao Li, Viktor Schlegel, Riza Batista-Navarro, Goran Nenadic

Furthermore, evaluating key points is crucial in ensuring that the automatically generated summaries are useful.

Sentence

Student's t-Distribution: On Measuring the Inter-Rater Reliability When the Observations are Scarce

no code implementations • 8 Mar 2023 • Serge Gladkoff, Lifeng Han, Goran Nenadic

Then, this leads to our example with two human-generated observational scores, for which we introduce the "Student's t-Distribution" method and explain how to use it to measure the IRR score using only these two data points, as well as the confidence intervals (CIs) of the quality evaluation.

Translation

Topic Modelling of Swedish Newspaper Articles about Coronavirus: a Case Study using Latent Dirichlet Allocation Method

2 code implementations • 8 Jan 2023 • Bernadeta Griciūtė, Lifeng Han, Goran Nenadic

In this study, from the social-media and healthcare domain, we apply popular Latent Dirichlet Allocation (LDA) methods to model the topic changes in Swedish newspaper articles about Coronavirus.

Natural Language Understanding

Exploring the Value of Pre-trained Language Models for Clinical Named Entity Recognition

2 code implementations • 23 Oct 2022 • Samuel Belkadi, Lifeng Han, Yuping Wu, Goran Nenadic

The experimental outcomes show that 1) CRF layers improved all language models; 2) referring to BIO-strict span-level evaluation using macro-average F1 score, although the fine-tuned LLMs achieved 0.83+ scores, the TransformerCRF model trained from scratch achieved 0.78+, demonstrating comparable performances with much lower cost, e.g. with 39.80% fewer training parameters; 3) referring to BIO-strict span-level evaluation using weighted-average F1 score, ClinicalBERT-CRF, BERT-CRF, and TransformerCRF exhibited lower score differences, with 97.59%/97.44%/96.84% respectively.

Language Modelling named-entity-recognition +1

Investigating Massive Multilingual Pre-Trained Machine Translation Models for Clinical Domain via Transfer Learning

no code implementations • 12 Oct 2022 • Lifeng Han, Gleb Erofeev, Irina Sorokina, Serge Gladkoff, Goran Nenadic

To the best of our knowledge, this is the first work on using MMPLMs towards clinical domain transfer-learning NMT successfully for totally unseen languages during pre-training.

Machine Translation NMT +3

EDU-level Extractive Summarization with Varying Summary Lengths

1 code implementation • 8 Oct 2022 • Yuping Wu, Ching-Hsun Tseng, Jiayu Shang, Shengzhong Mao, Goran Nenadic, Xiao-jun Zeng

To fill these gaps, this paper first conducts the comparison analysis of oracle summaries based on EDUs and sentences, which provides evidence from both theoretical and experimental perspectives to justify and quantify that EDUs make summaries with higher automatic evaluation scores than sentences.

Extractive Summarization Text Summarization

Examining Large Pre-Trained Language Models for Machine Translation: What You Don't Know About It

no code implementations • 15 Sep 2022 • Lifeng Han, Gleb Erofeev, Irina Sorokina, Serge Gladkoff, Goran Nenadic

Pre-trained language models (PLMs) often take advantage of the monolingual and multilingual datasets that are freely available online to acquire general or mixed domain knowledge before deployment into specific tasks.

Machine Translation

Semantics Altering Modifications for Evaluating Comprehension in Machine Reading

1 code implementation • 7 Dec 2020 • Viktor Schlegel, Goran Nenadic, Riza Batista-Navarro

Advances in NLP have yielded impressive results for the task of machine reading comprehension (MRC), with approaches having been reported to achieve performance comparable to that of humans.

Machine Reading Comprehension Sentence

An efficient representation of chronological events in medical texts

no code implementations • EMNLP (Louhi) 2020 • Andrey Kormilitzin, Nemanja Vaci, Qiang Liu, Hao Ni, Goran Nenadic, Alejo Nevado-Holgado

In this work we addressed the problem of capturing sequential information contained in longitudinal electronic health records (EHRs).

Beyond Leaderboards: A survey of methods for revealing weaknesses in Natural Language Inference data and models

no code implementations • 29 May 2020 • Viktor Schlegel, Goran Nenadic, Riza Batista-Navarro

Recent years have seen a growing number of publications that analyse Natural Language Inference (NLI) datasets for superficial cues, whether they undermine the complexity of the tasks underlying those datasets and how they impact those models that are optimised and evaluated on this data.

Natural Language Inference

GNTeam at 2018 n2c2: Feature-augmented BiLSTM-CRF for drug-related entity recognition in hospital discharge summaries

1 code implementation • 23 Sep 2019 • Maksim Belousov, Nikola Milosevic, Ghada Alfattni, Haifa Alrdahi, Goran Nenadic

The recurrent neural networks that use the pre-trained domain-specific word embeddings and a CRF layer for label optimization perform drug, adverse event and related entities extraction with micro-averaged F1-score of over 91%.

Entity Extraction using GAN named-entity-recognition +3

MedNorm: A Corpus and Embeddings for Cross-terminology Medical Concept Normalisation

1 code implementation • WS 2019 • Maksim Belousov, William G. Dixon, Goran Nenadic

The medical concept normalisation task aims to map textual descriptions to standard terminologies such as SNOMED-CT or MedDRA.

Representation Learning

Extracting adverse drug reactions and their context using sequence labelling ensembles in TAC2017

no code implementations • 28 May 2019 • Maksim Belousov, Nikola Milosevic, William Dixon, Goran Nenadic

Adverse drug reactions (ADRs) are unwanted or harmful effects experienced after the administration of a certain drug or a combination of drugs, presenting a challenge for drug development and drug administration.

From web crawled text to project descriptions: automatic summarizing of social innovation projects

no code implementations • 22 May 2019 • Nikola Milosevic, Dimitar Marinov, Abdullah Gok, Goran Nenadic

In the past decade, social innovation projects have gained the attention of policy makers, as they address important social issues in an innovative manner.

A framework for information extraction from tables in biomedical literature

1 code implementation • 26 Feb 2019 • Nikola Milosevic, Cassie Gregson, Robert Hernandez, Goran Nenadic

The scientific literature is growing exponentially, and professionals are no longer able to cope with the current volume of publications.

Table Detection

Creating a contemporary corpus of similes in Serbian by using natural language processing

no code implementations • 22 Nov 2018 • Nikola Milosevic, Goran Nenadic

Simile is a figure of speech that compares two things through the use of connection words, but where comparison is not intended to be taken literally.

As Cool as a Cucumber: Towards a Corpus of Contemporary Similes in Serbian

1 code implementation • 20 May 2016 • Nikola Milosevic, Goran Nenadic

Similes are natural language expressions used to compare unlikely things, where the comparison is not taken literally.

ManTIME: Temporal expression identification and normalization in the TempEval-3 challenge

no code implementations • SEMEVAL 2013 • Michele Filannino, Gavin Brown, Goran Nenadic

This paper describes a temporal expression identification and normalization system, ManTIME, developed for the TempEval-3 challenge.

Attribute

LINNAEUS: A species name identification system for biomedical literature

no code implementations • BMC Bioinformatics 2010 • Martin Gerner, Goran Nenadic, Casey M Bergman

In this paper we describe an open-source species name recognition and normalization software system, LINNAEUS, and evaluate its performance relative to several automatically generated biomedical corpora, as well as a novel corpus of full-text documents manually annotated for species mentions.

Named Entity Recognition
