The BLUE benchmark consists of five different biomedicine text-mining tasks with ten corpora. These tasks cover a diverse range of text genres (biomedical literature and clinical notes), dataset sizes, and degrees of difficulty and, more importantly, highlight common biomedicine text-mining challenges.
74 PAPERS • NO BENCHMARKS YET
Kvasir-SEG is an open-access dataset of gastrointestinal polyp images and corresponding segmentation masks, manually annotated by a medical doctor and then verified by an experienced gastroenterologist.
53 PAPERS • 3 BENCHMARKS
BioGRID is a biomedical interaction repository with data compiled through comprehensive curation efforts. The current index is version 4.2.192 and searches 75,868 publications for 1,997,840 protein and genetic interactions, 29,093 chemical interactions and 959,750 post-translational modifications from major model organism species.
32 PAPERS • 2 BENCHMARKS
Contains hundreds of frontal view X-rays and is the largest public resource for COVID-19 image and prognostic data, making it a necessary resource to develop and evaluate tools to aid in the treatment of COVID-19.
30 PAPERS • NO BENCHMARKS YET
The goal of the Automated Cardiac Diagnosis Challenge (ACDC) is to:
15 PAPERS • 3 BENCHMARKS
ChemProt consists of 1,820 PubMed abstracts with chemical-protein interactions annotated by domain experts and was used in the BioCreative VI text mining chemical-protein interactions shared task.
14 PAPERS • 1 BENCHMARK
The National Institutes of Health’s Clinical Center has made a large-scale dataset of CT images publicly available to help the scientific community improve detection accuracy of lesions. While most publicly available medical image datasets have fewer than a thousand lesions, this dataset, named DeepLesion, has over 32,000 annotated lesions (220GB) identified on CT images. DeepLesion contains 32,735 lesions in 32,120 CT slices from 10,594 studies of 4,427 unique patients. The dataset covers a variety of lesion types, such as lung nodules, liver tumors, and enlarged lymph nodes. It has the potential to be used in various medical image applications
12 PAPERS • 1 BENCHMARK
BLURB is a collection of resources for biomedical natural language processing. In general domains such as newswire and the Web, comprehensive benchmarks and leaderboards such as GLUE have greatly accelerated progress in open-domain NLP. In biomedicine, however, such resources are ostensibly scarce. In the past, there have been a plethora of shared tasks in biomedical NLP, such as BioCreative, BioNLP Shared Tasks, SemEval, and BioASQ, to name just a few. These efforts have played a significant role in fueling interest and progress by the research community, but they typically focus on individual tasks. The advent of neural language models such as BERTs provides a unifying foundation to leverage transfer learning from unlabeled text to support a wide range of NLP applications. To accelerate progress in biomedical pretraining strategies and task-specific methods, it is thus imperative to create a broad-coverage benchmark encompassing diverse biomedical tasks.
9 PAPERS • 2 BENCHMARKS
Diabetic retinopathy is the leading cause of blindness in the working-age population of the developed world. It is estimated to affect over 93 million people.
7 PAPERS • 1 BENCHMARK
Consists of annotated frames containing GI procedure tools such as snares, balloons and biopsy forceps. Besides the images, the dataset includes ground truth masks and bounding boxes and has been verified by two expert GI endoscopists.
7 PAPERS • 2 BENCHMARKS
ATOM3D is a unified collection of datasets concerning the three-dimensional structure of biomolecules, including proteins, small molecules, and nucleic acids. These datasets are specifically designed to provide a benchmark for machine learning methods which operate on 3D molecular structure, and represent a variety of important structural, functional, and engineering tasks. All datasets are provided in a standardized format along with a Python package containing processing code, utilities, models, and dataloaders for common machine learning frameworks such as PyTorch. ATOM3D is designed to be a living database, where datasets are updated and tasks are added as the field progresses.
6 PAPERS • NO BENCHMARKS YET
BioLAMA is a benchmark composed of 49K biomedical factual knowledge triples for probing biomedical Language Models. It is used to assess whether Language Models can serve as valid biomedical knowledge bases.
5 PAPERS • NO BENCHMARKS YET
SICAPv2 is a database containing prostate histology whole slide images annotated with both global Gleason scores and patch-level Gleason grades.
The CHAOS challenge aims at the segmentation of abdominal organs (liver, kidneys and spleen) from CT and MRI data. The on-site section of CHAOS was held at the IEEE International Symposium on Biomedical Imaging (ISBI) on April 11, 2019, in Venice, Italy. Online submissions are still welcome!
4 PAPERS • NO BENCHMARKS YET
The Indian Diabetic Retinopathy Image Dataset (IDRiD) consists of typical diabetic retinopathy lesions and normal retinal structures annotated at a pixel level. This dataset also provides information on the disease severity of diabetic retinopathy and diabetic macular edema for each image. This dataset is well suited to the development and evaluation of image analysis algorithms for early detection of diabetic retinopathy.
4 PAPERS • 2 BENCHMARKS
The 2017 PhysioNet/CinC Challenge aims to encourage the development of algorithms to classify, from a single short ECG lead recording (between 30 s and 60 s in length), whether the recording shows normal sinus rhythm, atrial fibrillation (AF), an alternative rhythm, or is too noisy to be classified.
3 PAPERS • NO BENCHMARKS YET
EHR-RelB is a benchmark dataset for biomedical concept relatedness, consisting of 3630 concept pairs sampled from electronic health records (EHRs). EHR-RelA is a smaller dataset of 111 concept pairs, which are mainly unrelated.
The Kvasir-SEG dataset includes 196 polyps smaller than 10 mm classified as Paris class 1 sessile or Paris class IIa. We selected these with the help of expert gastroenterologists and released them separately as a subset of Kvasir-SEG, which we call Kvasir-Sessile.
3 PAPERS • 1 BENCHMARK
A video capsule endoscopy dataset for polyp segmentation.
Lobachevsky University Electrocardiography Database (LUDB) is an ECG signal database with marked boundaries and peaks of P, T waves and QRS complexes. The database consists of 200 10-second 12-lead ECG signal records representing different morphologies of the ECG signal. The ECGs were collected from healthy volunteers and patients of the Nizhny Novgorod City Hospital No 5 in 2017–2018. The patients had various cardiovascular diseases, and some of them had pacemakers. The boundaries of P, T waves and QRS complexes were manually annotated by cardiologists for all 200 records. Also, each record is annotated with the corresponding diagnosis. The database can be used for educational purposes as well as for training and testing algorithms for ECG delineation, i.e. for automatic detection of boundaries and peaks of P, T waves and QRS complexes.
The “Medico automatic polyp segmentation challenge” aims to develop computer-aided diagnosis systems for automatic polyp segmentation to detect all types of polyps (for example, irregular polyp, smaller or flat polyps) with high efficiency and accuracy. The main goal of the challenge is to benchmark semantic segmentation algorithms on a publicly available dataset, emphasizing robustness, speed, and generalization.
UI-PRMD is a data set of movements related to common exercises performed by patients in physical therapy and rehabilitation programs. The data set consists of 10 rehabilitation exercises. A sample of 10 healthy individuals repeated each exercise 10 times in front of two sensory systems for motion capturing: a Vicon optical tracker, and a Kinect camera. The data is presented as positions and angles of the body joints in the skeletal models provided by the Vicon and Kinect mocap systems.
MICCAI Challenge on Circuit Reconstruction from Electron Microscopy Images.
2 PAPERS • 1 BENCHMARK
The dataset contains 33,010 molecule-description pairs, divided into 80%/10%/10% train/val/test splits. The goal of the task is to retrieve the relevant molecule for a natural language description. It is defined as follows:
2 PAPERS • 3 BENCHMARKS
How and where proteins interface with one another can ultimately impact the proteins' functions along with a range of other biological processes. As such, precise computational methods for protein interface prediction (PIP) are highly sought after, as they could yield significant advances in drug discovery and design as well as protein function analysis. However, the traditional benchmark dataset for this task, Docking Benchmark 5 (DB5), contains only a paltry 230 complexes for training, validating, and testing different machine learning algorithms. In this work, we expand on a dataset recently introduced for this task, the Database of Interacting Protein Structures (DIPS), to present DIPS-Plus, an enhanced, feature-rich dataset of 42,112 complexes for geometric deep learning of protein interfaces. The previous version of DIPS contains only the Cartesian coordinates and types of the atoms comprising a given protein complex, whereas DIPS-Plus now includes a plethora of new residue-level
2 PAPERS • NO BENCHMARKS YET
The NuCLS dataset contains over 220,000 labeled nuclei from breast cancer images from TCGA. These nuclei were annotated through the collaborative effort of pathologists, pathology residents, and medical students using the Digital Slide Archive. These data can be used in several ways to develop and validate algorithms for nuclear detection, classification, and segmentation, or as a resource to develop and evaluate methods for interrater analysis.
This database of simulated arterial pulse waves is designed to be representative of a sample of pulse waves measured from healthy adults. It contains pulse waves for 4,374 virtual subjects, aged 25 to 75 years (in 10-year increments). The database contains a baseline set of pulse waves for each of the six age groups, created using cardiovascular properties (such as heart rate and arterial stiffness) that are representative of healthy subjects in each age group. It also contains 728 further virtual subjects in each age group, in which each of the cardiovascular properties is varied within normal ranges. This allows for extensive in silico analyses of haemodynamics and the performance of pulse wave analysis algorithms.
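The subject count quoted above can be checked with a few lines of arithmetic. The sketch below assumes that the "baseline set" amounts to one baseline virtual subject per age group, which is an interpretation of the description rather than something it states outright; under that assumption the figures add up to the stated total.

```python
# Age groups from the description: 25 to 75 years in 10-year increments.
age_groups = list(range(25, 76, 10))

# Assumption (not stated explicitly in the description): one baseline
# virtual subject per age group.
BASELINE_PER_GROUP = 1
# From the description: 728 further virtual subjects per age group.
VARIED_PER_GROUP = 728

subjects_per_group = BASELINE_PER_GROUP + VARIED_PER_GROUP
total_subjects = len(age_groups) * subjects_per_group

print("age groups:", age_groups)
print("subjects per group:", subjects_per_group)
print("total virtual subjects:", total_subjects)
```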
This mouse cerebellar atlas can be used for mouse cerebellar morphometry.
The eSports Sensors dataset contains sensor data collected from 10 players in 22 matches in League of Legends. The sensor data collected includes:
2 PAPERS • 2 BENCHMARKS
By releasing this dataset, we aim to provide a new testbed for computer vision techniques using Deep Learning. Its main peculiarity is the shift from the domain of "natural images" typical of common benchmark datasets to biological imaging. We anticipate that the advantages of doing so could be two-fold: i) fostering research in biomedical-related fields, for which popular pre-trained models typically perform poorly, and ii) promoting methodological research in deep learning by addressing the peculiar requirements of these images. Possible applications include but are not limited to semantic segmentation, object detection and object counting. The data consist of 283 high-resolution pictures (1600x1200 pixels) of mouse brain slices acquired through a fluorescence microscope. The final goal is to identify and count neurons highlighted in the pictures by means of a marker, so as to assess the result of a biological experiment. The corresponding ground-truth labels were generated through a hy
The BCSS dataset contains over 20,000 segmentation annotations of tissue regions from breast cancer images from The Cancer Genome Atlas (TCGA). This large-scale dataset was annotated through the collaborative effort of pathologists, pathology residents, and medical students using the Digital Slide Archive. It enables the generation of highly accurate machine-learning models for tissue segmentation.
1 PAPER • NO BENCHMARKS YET
BioLeaflets is a biomedical dataset for Data2Text generation. It is a corpus of 1,336 package leaflets of medicines authorised in Europe, which were obtained by scraping the European Medicines Agency (EMA) website. Package leaflets are included in the packaging of medicinal products and contain information to help patients use the product safely and appropriately, under the guidance of their healthcare professional. Each document contains six sections: 1) what the product is and what it is used for, 2) what you need to know before you take the product, 3) product usage instructions, 4) possible side effects, 5) product storage conditions, and 6) other information.
Several datasets are fostering innovation in higher-level functions for everyone, everywhere. By providing this repository, we hope to encourage the research community to focus on hard problems. In this repository, we present the real severity (BIRADS) and pathology (post-report) classifications provided by the Radiologist Director of the Radiology Department of Hospital Fernando Fonseca while diagnosing several patients (see dataset-uta4-dicom) from our User Tests and Analysis 4 (UTA4) study. Here, we provide a dataset of both the severity (BIRADS) and pathology classifications for each patient diagnosis. The work and results were published at AVI 2020, a top Human-Computer Interaction (HCI) conference. Results were analyzed and interpreted using our Statistical Analysis charts. The user tests were conducted in clinical institutions, where clinicians diagnosed several patients for a Single-Modality vs Multi-Modality comparison. For example, in these t
Several datasets are fostering innovation in higher-level functions for everyone, everywhere. By providing this repository, we hope to encourage the research community to focus on hard problems. In this repository, we present the medical imaging DICOM files of patients from our User Tests and Analysis 4 (UTA4) study. Here, we provide a dataset of the medical images used during the UTA4 tasks. This repository and its dataset should be paired with the dataset-uta4-rates repository dataset. The work and results were published at AVI 2020, a top Human-Computer Interaction (HCI) conference. Results were analyzed and interpreted using our Statistical Analysis charts. The user tests were conducted in clinical institutions, where clinicians diagnosed several patients for a Single-Modality vs Multi-Modality comparison. For example, in these tests, we used both the prototype-single-modality and prototype-multi-modality repositories for the comparison. Likewise, this dataset repres
1 PAPER • 1 BENCHMARK
Several datasets are fostering innovation in higher-level functions for everyone, everywhere. By providing this repository, we hope to encourage the research community to focus on hard problems. In this repository, we present the severity ratings (BIRADS) given by clinicians while diagnosing several patients from our User Tests and Analysis 4 (UTA4) study. Here, we provide a dataset of the severity ratings (BIRADS) for each patient diagnosis. The work and results were published at AVI 2020, a top Human-Computer Interaction (HCI) conference. Results were analyzed and interpreted using our Statistical Analysis charts. The user tests were conducted in clinical institutions, where clinicians diagnosed several patients for a Single-Modality vs Multi-Modality comparison. For example, in these tests, we used both the prototype-single-modality and prototype-multi-modality repositories for the comparison. Likewise, this dataset represents the pieces of information of bot
The complete blood count (CBC) dataset contains 360 blood smear images along with their annotation files, split into training, testing, and validation sets. The training folder contains 300 images with annotations. The testing and validation folders each contain 60 images with annotations. We modified the original dataset to prepare this CBC dataset, because some of the image annotation files listed far fewer red blood cells (RBCs) than were actually present, and one annotation file did not include any RBCs at all although the cell smear image contains RBCs. We therefore removed all the erroneous files and split the dataset into three parts. Of the 360 smear images, 300 blood cell images with annotations are used as the training set, and the remaining 60 images with annotations are used as the testing set. Due to the shortage of data, the validation set is a 60-image subset of the training set.
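The split described above, with a validation set drawn from inside the training set, can be sketched as follows. This is an illustration only: the dataset ships with fixed folders, so the random shuffling, the seed, the `split_cbc` helper and the `image_000`-style IDs are all hypothetical stand-ins.

```python
import random


def split_cbc(image_ids, n_train=300, n_val=60, seed=0):
    """Return (train, val, test) lists of image IDs.

    The validation set is sampled from the training set, so the two overlap
    by design; the test set is disjoint from both.
    """
    rng = random.Random(seed)          # seeded for reproducibility
    ids = list(image_ids)
    rng.shuffle(ids)
    train = ids[:n_train]              # first 300 images
    test = ids[n_train:]               # remaining 60 images
    val = rng.sample(train, n_val)     # 60-image subset of the training set
    return train, val, test


# Hypothetical IDs standing in for the 360 blood smear images.
image_ids = [f"image_{i:03d}" for i in range(360)]
train, val, test = split_cbc(image_ids)
print(len(train), len(val), len(test))
```

Because the validation images also appear in the training set, validation scores from such a split are likely to be optimistic; the held-out test set is the only fully independent part.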
This is a set of 100,000 non-overlapping image patches from hematoxylin & eosin (H&E) stained histological images of human colorectal cancer (CRC) and normal tissue. All images are 224x224 pixels (px) at 0.5 microns per pixel (MPP). For tissue classification, the classes are: adipose (ADI), background (BACK), debris (DEB), lymphocytes (LYM), mucus (MUC), smooth muscle (MUS), normal colon mucosa (NORM), cancer-associated stroma (STR), and colorectal adenocarcinoma epithelium (TUM). The images were manually extracted from N=86 H&E stained human cancer tissue slides from formalin-fixed paraffin-embedded (FFPE) samples from the NCT Biobank (National Center for Tumor Diseases, Heidelberg, Germany) and the UMM pathology archive (University Medical Center Mannheim, Mannheim, Germany). Tissue samples contained CRC primary tumor slides and tumor tissue from CRC liver metastases; normal tissue classes were augmented with non-tumorous regions from gastrectomy specimens to increase variability.
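The patch geometry and class list given above translate directly into a small lookup table and a physical-size calculation. The sketch below uses only the figures from the description; the constant and dictionary names are illustrative and are not part of any official loader.

```python
# Figures taken from the description above.
PATCH_SIZE_PX = 224          # patch width and height in pixels
MICRONS_PER_PIXEL = 0.5      # resolution (MPP)

# Class abbreviations and names as listed in the description.
TISSUE_CLASSES = {
    "ADI": "adipose",
    "BACK": "background",
    "DEB": "debris",
    "LYM": "lymphocytes",
    "MUC": "mucus",
    "MUS": "smooth muscle",
    "NORM": "normal colon mucosa",
    "STR": "cancer-associated stroma",
    "TUM": "colorectal adenocarcinoma epithelium",
}

# Physical side length of one patch in microns, and its area in mm^2.
patch_side_um = PATCH_SIZE_PX * MICRONS_PER_PIXEL
patch_area_mm2 = (patch_side_um / 1000.0) ** 2

print(f"{len(TISSUE_CLASSES)} classes")
print(f"patch side: {patch_side_um} microns")
print(f"patch area: {patch_area_mm2:.6f} mm^2")
```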
CoVERT is a fact-checked corpus of tweets with a focus on the domain of biomedicine and COVID-19-related (mis)information. The corpus consists of 300 tweets, each annotated with medical named entities and relations. It employs a novel crowdsourcing methodology to annotate all tweets with fact-checking labels and supporting evidence, which crowdworkers search for online. This methodology results in moderate inter-annotator agreement.
This collection contains data and code associated with the IPCAI/IJCARS 2020 paper “Automatic Annotation of Hip Anatomy in Fluoroscopy for Robust and Efficient 2D/3D Registration.” The data hosted here consists of annotated datasets of actual hip fluoroscopy, CT and derived data from six lower torso cadaveric specimens. Documentation and examples for using the dataset and Python code for training and testing the proposed models are also included. Higher-level information, including clinical motivations, prior works, algorithmic details, applications to 2D/3D registration, and experimental details, may be found in the companion paper which is available at https://arxiv.org/abs/1911.07042 or https://doi.org/10.1007/s11548-020-02162-7. We hope that this code and data will be useful in the development of new computer-assisted capabilities that leverage fluoroscopy.
The data presented here was extracted from a larger dataset collected through a collaboration between the Embedded Systems Laboratory (ESL) of the Swiss Federal Institute of Technology in Lausanne (EPFL), Switzerland and the Institute of Sports Sciences of the University of Lausanne (ISSUL). In this dataset, we report the extracted segments used for an analysis of R peak detection algorithms during high intensity exercise.
A challenge that consists of three tasks, each targeting a different requirement for in-clinic use. The first task involves classifying images from the GI tract into 23 distinct classes. The second task focuses on efficient classification, measured by the amount of time spent processing each image. The last task relates to automatically segmenting polyps.
This is the supplemental data for our paper on how to benchmark registrations of serial sections with ground truths. There are three main modalities and one further, as a reference.
The dataset X of this work is an extension of the heartSeg dataset. Each sample x ∈ X is an RGB image capturing the heart region of Medaka (Oryzias latipes) hatchlings from a constant ventral view. Since the body of Medaka is see-through, noninvasive studies regarding the internal organs and the whole circulatory system are practicable. A Medaka’s heart contains three parts: the atrium, the ventricle, and the bulbus. The atrium receives deoxygenated blood from the circulatory system and delivers it to the ventricle, which forwards it into the bulbus. The bulbus is the heart’s exit chamber and provides the gill arches with a constant blood flow. The blood flow through these three chambers was captured in 63 short recordings (around 11 seconds with 24 frames per second each) in total, from which the single image samples x ∈ X are extracted. The dataset is split into training and test data following the heartSeg dataset with ntrain = 565 samples in the training set Xtrain and ntest = 165
The dataset contains full-spectral autofluorescence lifetime microscopic images (FS-FLIM) acquired on unstained ex-vivo human lung tissue, comprising 100 4D hypercubes of 256x256 (spatial resolution) x 32 (time bins) x 512 (spectral channels from 500nm to 780nm). This dataset is associated with our papers "Deep Learning-Assisted Co-registration of Full-Spectral Autofluorescence Lifetime Microscopic Images with H&E-Stained Histology Images" (https://arxiv.org/abs/2202.07755) and "Full spectrum fluorescence lifetime imaging with 0.5 nm spectral and 50 ps temporal resolution" (https://doi.org/10.1038/s41467-021-26837-0). The FS-FLIM images provide transformative insights into human lung cancer with extra-dimensional information. This will enable visual and precise detection of early lung cancer. With the methodology in our co-registration paper, FS-FLIM images can be registered with H&E-stained histology images, allowing characterisation of tumour and surrounding cells at a cellular level with abs
The HyperKvasir dataset contains 110,079 images and 374 videos capturing anatomical landmarks and pathological and normal findings, for a total of around 1 million images and video frames.
The Kvasir-Capsule dataset is the largest publicly released video capsule endoscopy (VCE) dataset. In total, the dataset contains 47,238 labeled images and 117 videos capturing anatomical landmarks and pathological and normal findings. The result is more than 4,741,621 images and video frames altogether.
The LIVECell (Label-free In Vitro image Examples of Cells) dataset is a large-scale microscopic image dataset for instance-segmentation of individual cells in 2D cell cultures.
The MTHS dataset contains 30 Hz PPG signals obtained from 62 patients (35 men and 27 women). The ground truth data includes heart rate (HR) and oxygen saturation (SpO2) levels sampled at 1 Hz. The HR and SpO2 measurements were obtained using a pulse oximeter (M70). An iPhone 5s was used to obtain the PPG recordings at 30 fps.
1 PAPER • 2 BENCHMARKS
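Because the MTHS PPG signal is sampled at 30 Hz while the ground truth is sampled at 1 Hz, each label lines up with 30 consecutive PPG samples. The sketch below shows one way to pair them, assuming that signal and labels start at the same instant; the function name and the synthetic values are hypothetical, and the dataset's actual file layout is not described here.

```python
def pair_windows_with_labels(ppg, labels, ppg_rate_hz=30, label_rate_hz=1):
    """Pair each ground-truth label with its window of PPG samples.

    Assumes both sequences start at the same time. Trailing samples or
    labels without a complete counterpart are dropped.
    """
    step = ppg_rate_hz // label_rate_hz           # 30 samples per label
    n = min(len(ppg) // step, len(labels))        # number of complete pairs
    return [(ppg[i * step:(i + 1) * step], labels[i]) for i in range(n)]


# Synthetic stand-ins: 95 PPG samples (just over 3 s) and 4 HR labels.
ppg_signal = list(range(95))
hr_labels = [72, 73, 71, 70]

pairs = pair_windows_with_labels(ppg_signal, hr_labels)
print(len(pairs), "windows of", len(pairs[0][0]), "samples each")
```

Dropping incomplete windows keeps every training example the same length, at the cost of discarding up to one second of signal at the end of a recording.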