QM9 provides quantum chemical properties for a relevant, consistent, and comprehensive chemical space of small organic molecules. This database may serve the benchmarking of existing methods, development of new methods, such as hybrid quantum mechanics/machine learning, and systematic identification of structure-property relationships.
37 PAPERS • 4 BENCHMARKS
The Tox21 data set comprises 12,060 training samples and 647 test samples that represent chemical compounds. There are 801 "dense features" that represent chemical descriptors, such as molecular weight, solubility or surface area, and 272,776 "sparse features" that represent chemical substructures (ECFP10, DFS6, DFS8; stored in Matrix Market Format ). Machine learning methods can either use sparse or dense data or combine them. For each sample there are 12 binary labels that represent the outcome (active/inactive) of 12 different toxicological experiments. Note that the label matrix contains many missing values (NAs). The original data source and Tox21 challenge site is https://tripod.nih.gov/tox21/challenge/.
21 PAPERS • 5 BENCHMARKS
QED is a linguistically principled framework for explanations in question answering. Given a question and a passage, QED represents an explanation of the answer as a combination of discrete, human-interpretable steps: sentence selection := identification of a sentence implying an answer to the question referential equality := identification of noun phrases in the question and the answer sentence that refer to the same thing predicate entailment := confirmation that the predicate in the sentence entails the predicate in the question once referential equalities are abstracted away. The QED dataset is an expert-annotated dataset of QED explanations build upon a subset of the Google Natural Questions dataset.
11 PAPERS • 1 BENCHMARK
SIDER contains information on marketed medicines and their recorded adverse drug reactions. The information is extracted from public documents and package inserts. The available information include side effect frequency, drug and side effect classifications as well as links to further information, for example drug–target relations.
10 PAPERS • 4 BENCHMARKS
The AIDS Antiviral Screen dataset is a dataset of screens checking tens of thousands of compounds for evidence of anti-HIV activity. The available screen results are chemical graph-structured data of these various compounds.
9 PAPERS • 3 BENCHMARKS
BindingDB is a public, web-accessible database of measured binding affinities, focusing chiefly on the interactions of protein considered to be drug-targets with small, drug-like molecules. As of May 27, 2022, BindingDB contains 41,296 Entries, each with a DOI, containing 2,519,702 binding data for 8,810 protein targets and 1,080,101 small molecules. There are 5,988 protein-ligand crystal structures with BindingDB affinity measurements for proteins with 100% sequence identity, and 11,442 crystal structures allowing proteins to 85% sequence identity.You can also use BindingDB data through the Registry of Open Data on AWS: https://registry.opendata.aws/binding-db. This dataset using the split by TransformerCPI(doi.org/10.1093/bioinformatics/btaa524)
6 PAPERS • 1 BENCHMARK
Comparative evaluation of virtual screening methods requires a rigorous benchmarking procedure on diverse, realistic, and unbiased data sets. Recent investigations from numerous research groups unambiguously demonstrate that artificially constructed ligand sets classically used by the community (e.g., DUD, DUD-E, MUV) are unfortunately biased by both obvious and hidden chemical biases, therefore overestimating the true accuracy of virtual screening methods. We herewith present a novel data set (LIT-PCBA) specifically designed for virtual screening and machine learning. LIT-PCBA relies on 149 dose–response PubChem bioassays that were additionally processed to remove false positives and assay artifacts and keep active and inactive compounds within similar molecular property ranges. To ascertain that the data set is suited to both ligand-based and structure-based virtual screening, target sets were restricted to single protein targets for which at least one X-ray structure is available in
3 PAPERS • 1 BENCHMARK
MoleculeNet is a benchmark specially designed for testing machine learning methods of molecular properties. As we aim to facilitate the development of molecular machine learning method, this work curates a number of dataset collections, creates a suite of software that implements many known featurizations and previously proposed algorithms. All methods and datasets are integrated as parts of the open source DeepChem package(MIT license). MoleculeNet is built upon multiple public databases. The full collection currently includes over 700,000 compounds tested on a range of different properties. We test the performances of various machine learning models with different featurizations on the datasets(detailed descriptions here), with all results reported in AUC-ROC, AUC-PRC, RMSE and MAE scores. For users, please cite: Zhenqin Wu, Bharath Ramsundar, Evan N. Feinberg, Joseph Gomes, Caleb Geniesse, Aneesh S. Pappu, Karl Leswing, Vijay Pande, MoleculeNet: A Benchmark for Molecular Machine Lea
2 PAPERS • 1 BENCHMARK
CausalBench is a comprehensive benchmark suite for evaluating network inference methods on large-scale perturbational single-cell gene expression data. CausalBench introduces several biologically meaningful performance metrics and operates on two large, curated and openly available benchmark data sets for evaluating methods on the inference of gene regulatory networks from single-cell data generated under perturbations. The datasets consists of over 200000 training samples under interventions.
1 PAPER • NO BENCHMARKS YET
ImDrug is a comprehensive benchmark with an open-source Python library which consists of 4 imbalance settings, 11 AI-ready datasets, 54 learning tasks and 16 baseline algorithms tailored for imbalanced learning. It features modularized components including formulation of learning setting and tasks, dataset curation, standardized evaluation, and baseline algorithms. It also provides an accessible and customizable testbed for problems and solutions spanning a broad spectrum of the drug discovery pipeline such as molecular modeling, drug-target interaction and retrosynthesis.