Search Results for author: Lifeng Han

Found 25 papers, 15 papers with code

CantonMT: Cantonese to English NMT Platform with Fine-Tuned Models Using Synthetic Back-Translation Data

1 code implementation17 Mar 2024 Kung Yin Hong, Lifeng Han, Riza Batista-Navarro, Goran Nenadic

We present the models (OpusMT, NLLB, and mBART) that we fine-tuned using the limited amount of real data available and the synthetic data we generated using back-translation.

Data Augmentation Machine Translation +2
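The back-translation idea above can be sketched in a few lines: monolingual target-side (English) text is translated back into the source language to create synthetic parallel pairs. The `en_to_yue` stub below is a hypothetical stand-in for a real English-to-Cantonese model (e.g. a fine-tuned OpusMT system); a real pipeline would call an NMT model there.

```python
# Minimal back-translation sketch (illustrative, not the paper's code).

def en_to_yue(sentence: str) -> str:
    # Placeholder reverse model; a real pipeline would run NMT here.
    return "<yue> " + sentence

def back_translate(monolingual_english):
    """Pair each target-side English sentence with a synthetic source."""
    return [(en_to_yue(en), en) for en in monolingual_english]

corpus = ["The weather is nice today.", "Please call me tomorrow."]
pairs = back_translate(corpus)
print(len(pairs), "synthetic pairs")
```

The synthetic pairs are then mixed with the limited real parallel data for fine-tuning.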

Neural Machine Translation of Clinical Text: An Empirical Investigation into Multilingual Pre-Trained Language Models and Transfer-Learning

1 code implementation12 Dec 2023 Lifeng Han, Serge Gladkoff, Gleb Erofeev, Irina Sorokina, Betty Galiano, Goran Nenadic

Furthermore, to address the language resource imbalance issue, we also carry out experiments using a transfer learning methodology based on massive multilingual pre-trained language models (MMPLMs).

Clinical Knowledge Language Modelling +3

Generating Medical Prescriptions with Conditional Transformer

1 code implementation30 Oct 2023 Samuel Belkadi, Nicolo Micheletti, Lifeng Han, Warren Del-Pinto, Goran Nenadic

LT3 is trained on a set of around 2K lines of medication prescriptions extracted from the MIMIC-III database, allowing the model to produce valuable synthetic medication prescriptions.

Language Modelling named-entity-recognition +2

Extraction of Medication and Temporal Relation from Clinical Text using Neural Language Models

no code implementations3 Oct 2023 Hangyu Tu, Lifeng Han, Goran Nenadic

Furthermore, we also designed a set of post-processing rules to generate structured output on medications and the temporal relations.

Avg Disease Prediction +5

Investigating Large Language Models and Control Mechanisms to Improve Text Readability of Biomedical Abstracts

1 code implementation22 Sep 2023 Zihao Li, Samuel Belkadi, Nicolo Micheletti, Lifeng Han, Matthew Shardlow, Goran Nenadic

In this work, we investigate the ability of state-of-the-art large language models (LLMs) on the task of biomedical abstract simplification, using the publicly available dataset for plain language adaptation of biomedical abstracts (PLABA).

Text Simplification

MedMine: Examining Pre-trained Language Models on Medication Mining

1 code implementation7 Aug 2023 Haifa Alrdahi, Lifeng Han, Hendrik Šuvalov, Goran Nenadic

Automatic medication mining from clinical and biomedical text has become a popular topic due to its real impact on healthcare applications and the recent development of powerful language models (LMs).

Data Augmentation Ensemble Learning +2

Student's t-Distribution: On Measuring the Inter-Rater Reliability When the Observations are Scarce

no code implementations8 Mar 2023 Serge Gladkoff, Lifeng Han, Goran Nenadic

This leads to our example with two human-generated observational scores, for which we introduce the Student's t-distribution method and explain how to use it to measure the IRR score using only these two data points, as well as the confidence intervals (CIs) of the quality evaluation.

Translation
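The two-observation setting described above has a simple concrete form: with n = 2 the sample has one degree of freedom, and the 95% critical value of Student's t-distribution is the standard table value 12.706. The sketch below (illustrative numbers, not the paper's data) computes the resulting confidence interval around the mean of two rater scores.

```python
import math
from statistics import mean, stdev

def two_point_ci(x1, x2):
    """95% CI for the mean of two observations via Student's t (df = 1).

    t_{0.975, df=1} = 12.706 (standard t-table value)."""
    t_crit = 12.706
    xs = [x1, x2]
    m, s = mean(xs), stdev(xs)              # sample mean and std (n-1 denominator)
    half_width = t_crit * s / math.sqrt(len(xs))
    return m - half_width, m + half_width

lo, hi = two_point_ci(0.85, 0.91)           # two hypothetical rater scores
print(f"95% CI for the mean: [{lo:.3f}, {hi:.3f}]")
```

Note how wide the interval is: with only two data points, the t-based CI is honest about the large uncertainty, which is exactly the scarce-observation scenario the paper addresses.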

Topic Modelling of Swedish Newspaper Articles about Coronavirus: a Case Study using Latent Dirichlet Allocation Method

2 code implementations8 Jan 2023 Bernadeta Griciūtė, Lifeng Han, Goran Nenadic

In this study, set in the social-media and healthcare domain, we apply the popular Latent Dirichlet Allocation (LDA) method to model topic changes in Swedish newspaper articles about Coronavirus.

Natural Language Understanding

HilMeMe: A Human-in-the-Loop Machine Translation Evaluation Metric Looking into Multi-Word Expressions

no code implementations9 Nov 2022 Lifeng Han

With the fast development of Machine Translation (MT) systems, especially the new boost from Neural MT (NMT) models, the MT output quality has reached a new level of accuracy.

Machine Translation NMT +1

Exploring the Value of Pre-trained Language Models for Clinical Named Entity Recognition

2 code implementations23 Oct 2022 Samuel Belkadi, Lifeng Han, Yuping Wu, Goran Nenadic

The experimental outcomes show that 1) CRF layers improved all language models; 2) referring to BIO-strict span-level evaluation using macro-average F1 score, although the fine-tuned LLMs achieved 0.83+ scores, the TransformerCRF model trained from scratch achieved 0.78+, demonstrating comparable performance at much lower cost, e.g. with 39.80% fewer training parameters; 3) referring to BIO-strict span-level evaluation using weighted-average F1 score, ClinicalBERT-CRF, BERT-CRF, and TransformerCRF exhibited smaller score differences, at 97.59%/97.44%/96.84% respectively.

Language Modelling named-entity-recognition +1

Investigating Massive Multilingual Pre-Trained Machine Translation Models for Clinical Domain via Transfer Learning

no code implementations12 Oct 2022 Lifeng Han, Gleb Erofeev, Irina Sorokina, Serge Gladkoff, Goran Nenadic

To the best of our knowledge, this is the first work to successfully use MMPLMs for clinical-domain transfer-learning NMT on languages totally unseen during pre-training.

Machine Translation NMT +3

Examining Large Pre-Trained Language Models for Machine Translation: What You Don't Know About It

no code implementations15 Sep 2022 Lifeng Han, Gleb Erofeev, Irina Sorokina, Serge Gladkoff, Goran Nenadic

Pre-trained language models (PLMs) often take advantage of the monolingual and multilingual datasets that are freely available online to acquire general or mixed-domain knowledge before deployment into specific tasks.

Machine Translation

An Overview on Machine Translation Evaluation

no code implementations22 Feb 2022 Lifeng Han

Manual and automatic evaluation include reference-translation-based and reference-translation-independent approaches; automatic evaluation methods include traditional n-gram string matching, models applying syntax and semantics, and deep learning models; evaluation of evaluation methods includes estimating the credibility of human evaluations, the reliability of automatic evaluation, the reliability of the test set, etc.

Machine Translation Translation
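The "traditional n-gram string matching" family mentioned above rests on one core computation: clipped n-gram precision between a candidate translation and a reference, as popularised by BLEU. A minimal single-reference sketch:

```python
# Clipped n-gram precision, the building block of string-matching MT
# metrics such as BLEU (simplified single-reference version).
from collections import Counter

def ngram_precision(candidate: str, reference: str, n: int = 2) -> float:
    cand_tokens = candidate.split()
    ref_tokens = reference.split()
    # Count n-grams as tuples of tokens.
    cand_ngrams = Counter(zip(*[cand_tokens[i:] for i in range(n)]))
    ref_ngrams = Counter(zip(*[ref_tokens[i:] for i in range(n)]))
    # Clip each candidate n-gram count by its count in the reference.
    overlap = sum(min(c, ref_ngrams[g]) for g, c in cand_ngrams.items())
    total = max(sum(cand_ngrams.values()), 1)
    return overlap / total

score = ngram_precision("the cat sat on the mat", "the cat is on the mat")
print(score)  # 3 of 5 candidate bigrams appear in the reference -> 0.6
```

Full metrics layer brevity penalties, multiple n-gram orders, and multiple references on top of this core; syntax-, semantics-, and neural-model-based metrics replace the string match entirely.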

HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Professional Post-Editing Towards More Effective MT Evaluation

1 code implementation LREC 2022 Serge Gladkoff, Lifeng Han

Initial experiments on English-Russian MT outputs, on marketing content from a highly technical domain, reveal that our evaluation framework is effective in reflecting MT output quality with regard to both overall system-level performance and segment-level transparency, and that it increases the IRR for error-type interpretation.

Machine Translation Marketing +1

Measuring Uncertainty in Translation Quality Evaluation (TQE)

no code implementations LREC 2022 Serge Gladkoff, Irina Sorokina, Lifeng Han, Alexandra Alekseeva

From both human translators (HT) and machine translation (MT) researchers' point of view, translation quality evaluation (TQE) is an essential task.

Machine Translation Translation

cushLEPOR: customising hLEPOR metric using Optuna for higher agreement with human judgments or pre-trained language model LaBSE

1 code implementation WMT (EMNLP) 2021 Lifeng Han, Irina Sorokina, Gleb Erofeev, Serge Gladkoff

Then we present customised hLEPOR (cushLEPOR), which uses the Optuna hyper-parameter optimisation framework to fine-tune hLEPOR's weighting parameters towards better agreement with pre-trained language models (using LaBSE) on the exact MT language pairs that cushLEPOR is deployed to.

Language Modelling

Translation Quality Assessment: A Brief Survey on Manual and Automatic Methods

1 code implementation MoTra (NoDaLiDa) 2021 Lifeng Han, Gareth J. F. Jones, Alan F. Smeaton

To facilitate effective translation modeling and translation studies, one of the crucial questions to address is how to assess translation quality.

Machine Translation Natural Language Understanding +3

Chinese Character Decomposition for Neural MT with Multi-Word Expressions

1 code implementation NoDaLiDa 2021 Lifeng Han, Gareth J. F. Jones, Alan F. Smeaton, Paolo Bolzoni

To investigate the impact of Chinese decomposition embeddings in detail, i.e., at the radical, stroke, and intermediate levels, and how well these decompositions represent the meaning of the original character sequences, we carry out analysis with both automated and human evaluation of MT.

Machine Translation Translation
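The character-level vs radical-level distinction above is easy to illustrate: at the radical level, each character is replaced by its component radicals before embedding. The mapping below is a tiny sample of real decompositions (明 = 日 + 月, 好 = 女 + 子), not a full decomposition table, and the function is a sketch rather than the paper's preprocessing.

```python
# Illustrative character- vs radical-level input encoding for NMT.
RADICALS = {"好": ["女", "子"], "明": ["日", "月"]}  # tiny sample table

def decompose(sentence: str, level: str = "radical"):
    if level == "character":
        return list(sentence)
    # Radical level: expand each character into its component radicals,
    # falling back to the character itself when no decomposition is known.
    return [part for ch in sentence for part in RADICALS.get(ch, [ch])]

print(decompose("明好", level="character"))
print(decompose("明好", level="radical"))
```

Stroke-level and intermediate-level decompositions follow the same pattern with finer- or coarser-grained component inventories.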

AlphaMWE: Construction of Multilingual Parallel Corpora with MWE Annotations

1 code implementation COLING (MWE) 2020 Lifeng Han, Gareth Jones, Alan Smeaton

To facilitate further MT research, we present a categorisation of the error types encountered by MT systems in performing MWE related translation.

Machine Translation Sentence +1

Incorporating Chinese Radicals Into Neural Machine Translation: Deeper Than Character Level

1 code implementation3 May 2018 Lifeng Han, Shaohui Kuang

We integrate the Chinese radicals into the NMT model with different settings to address the unseen words challenge in Chinese to English translation.

Machine Translation NMT +1

LEPOR: An Augmented Machine Translation Evaluation Metric

1 code implementation26 Mar 2017 Lifeng Han

Finally, we introduce the practical performance of our metrics in the ACL-WMT workshop shared tasks, which shows that the proposed methods are robust across different languages.

Machine Translation POS +1

Machine Translation Evaluation Resources and Methods: A Survey

no code implementations15 May 2016 Lifeng Han

Subsequently, we also introduce the evaluation methods for MT evaluation including different correlation scores, and the recent quality estimation (QE) tasks for MT.

Informativeness Machine Translation +4
