Measuring semantic similarity of clinical trial outcomes using deep pre-trained language representations

Journal of Biomedical Informatics: X 2019 • Anna Koroleva • Sanjay Kamath • Patrick Paroubek

Background: Outcomes are variables monitored during a clinical trial to assess the impact of an intervention on humans’ health. Automatic assessment of semantic similarity of trial outcomes is required for a number of tasks, such as detection of outcome switching (unjustified changes of pre-defined outcomes of a trial) and implementation of Core Outcome Sets (minimal sets of outcomes that should be reported in a particular medical domain). Objective: We aimed to build an algorithm for assessing semantic similarity of pairs of primary and reported outcomes. We focused on approaches that do not require manually curated domain-specific resources such as ontologies and thesauri...

Task: Semantic Similarity. Each model was fine-tuned on the corpus named in the dataset line above its table. The number after # is the global rank for that metric.

Dataset: Annotated corpus for semantic similarity of clinical trial outcomes (expanded corpus)

| Model | F1 | Precision | Recall |
|---|---|---|---|
| BERT-Base uncased | 89.16 (#4) | 89.31 (#3) | 89.12 (#5) |
| BERT-Base cased | 89.12 (#5) | 88.25 (#5) | 90.1 (#4) |
| SciBERT uncased (SciVocab) | 91.51 (#2) | 91.3 (#2) | 91.79 (#3) |
| SciBERT cased (SciVocab) | 90.69 (#3) | 89 (#4) | 92.54 (#2) |
| BioBERT (pre-trained on PubMed abstracts + PMC) | 93.38 (#1) | 92.98 (#1) | 93.85 (#1) |

Dataset: Annotated corpus for semantic similarity of clinical trial outcomes (original corpus)

| Model | F1 | Precision | Recall |
|---|---|---|---|
| SciBERT cased (SciVocab) | 89.3 (#2) | 87.31 (#3) | 91.53 (#1) |
| SciBERT uncased (SciVocab) | 89.3 (#2) | 87.99 (#2) | 90.78 (#2) |
| BERT-Base uncased | 86.8 (#3) | 85.76 (#4) | 88.15 (#4) |
| BERT-Base cased | 84.21 (#4) | 83.36 (#5) | 85.2 (#5) |
| BioBERT (pre-trained on PubMed abstracts + PMC) | 89.75 (#1) | 88.93 (#1) | 90.76 (#3) |
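The leaderboard reports precision, recall and F1, but this page does not say how they were computed. As a rough illustration only, the sketch below assumes each outcome pair carries a binary label (1 = similar, 0 = not similar) and that the scores refer to the positive class. The function name `precision_recall_f1` and the toy labels are hypothetical and are not taken from the paper or its corpus.

```python
from typing import Sequence, Tuple


def precision_recall_f1(gold: Sequence[int], pred: Sequence[int]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) in percent for the positive class (label 1).

    Assumption: binary labels, 1 = the two outcomes are judged similar.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    # Count true positives, false positives and false negatives.
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)

    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return 100 * precision, 100 * recall, 100 * f1


# Hypothetical toy labels for eight outcome pairs (not real corpus data).
gold = [1, 1, 1, 0, 0, 1, 0, 1]
pred = [1, 1, 0, 0, 1, 1, 0, 1]
p, r, f = precision_recall_f1(gold, pred)
print(f"P={p:.2f} R={r:.2f} F1={f:.2f}")
```

Applying the harmonic-mean formula directly to a table row will not always reproduce the listed F1. For BERT-Base uncased on the expanded corpus it gives about 89.21 rather than 89.16. One possible explanation is that the scores were averaged over several runs or folds, but the page does not confirm this.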