The Bacteria Biotope (BB) Task is part of the BioNLP Open Shared Tasks and meets the BioNLP-OST standards of quality, originality and data formats. Manually annotated data is provided for training, development and evaluation of information extraction methods. Tools for the detailed evaluation of system outputs are available. Support in performing linguistic processing are provided in the form of analyses created by various state-of-the art tools on the dataset texts.
Biology and bioinformatics projects produce huge amounts of heterogeneous information about the microbial strains that have been experimentally identified in a given environment (habitat), and theirs properties (phenotype). These projects include applied microbiology domain (food safety), health sciences and waste processing. Knowledge about microbial diversity is critical for studying in depth the microbiome, the interaction mechanisms of bacteria with their environment from genetic, phylogenetic and ecology perspectives. A large part of the information is expressed in free text in large sets of scientific papers, web pages or databases. Thus, automatic systems are needed to extract the relevant information. The BB task aims to encourage the development of such systems.
The BB Task is an information extraction task involving entity recognition, entity normalization and relation extraction. The BB Task consists in recognizing mentions of microorganisms and microbial biotopes and phenotypes in scientific and textbook text, normalizing these mentions according to domain knowledge resources (a taxonomy and an ontology), and extracting relations between them. It is the new edition of the Bacteria Biotope task previously run at BioNLP Shared Task 2016, 2013 and 2011. The task has been extended to include new entity and relation types and new documents.