Language Model Pre-Training


Introduced by Deng et al. in ReasonBERT: Pre-trained to Reason with Distant Supervision

ReasonBERT is a pre-training method that augments language models with the ability to reason over long-range relations and multiple, possibly hybrid, contexts. It utilizes distant supervision to automatically connect multiple pieces of text and tables to create pre-training examples that require long-range reasoning. Different types of reasoning are simulated, including intersecting multiple pieces of evidence, bridging from one piece of evidence to another, and detecting unanswerable cases.

Specifically, given a query sentence containing an entity pair, if we mask one of the entities, another sentence or table that contains the same pair of entities can likely be used as evidence to recover the masked entity. Moreover, to encourage deeper reasoning, multiple pieces of evidence are collected and used jointly to recover the masked entities in the query sentence, with the masked entities scattered across the different pieces of evidence to mimic different types of reasoning.
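The pairing step can be sketched as follows. This is a simplified illustration, not the paper's pipeline: the function names (`make_example`) and the string-matching heuristic are assumptions, and the real method links entities via Wikipedia hyperlinks rather than raw substring matches.

```python
# Hypothetical sketch of ReasonBERT-style distant supervision pairing.
# The real pipeline links entities via hyperlinks/entity IDs, not substrings.

MASK = "[QUESTION]"

def make_example(query, entity_pair, corpus):
    """Mask one entity in the query sentence and collect evidence
    passages that mention both entities, so the model must reason
    over the evidence to recover the masked span."""
    e1, e2 = entity_pair
    masked_query = query.replace(e2, MASK)  # e2 becomes the answer span
    evidence = [p for p in corpus if e1 in p and e2 in p and p != query]
    return {"query": masked_query, "answer": e2, "evidence": evidence}

corpus = [
    "Fates Warning released Awaken the Guardian in 1986.",
    "Fates Warning is a progressive metal band.",
]
ex = make_example(
    "Awaken the Guardian is an album by Fates Warning.",
    ("Awaken the Guardian", "Fates Warning"),
    corpus,
)
```

Only the first corpus passage mentions both entities, so it alone is kept as evidence for this query.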

The figure illustrates several examples created with such distant supervision. In Ex. 1, a model needs to check multiple constraints (the intersection reasoning type) to find "the beach soccer competition established in 1998." In Ex. 2, a model needs to find "the type of the band that released Awaken the Guardian" by first inferring the name of the band, "Fates Warning" (the bridging reasoning type).

The masked entities in a query sentence are replaced with [QUESTION] tokens. The new pre-training objective, span reasoning, then extracts the masked entities from the provided evidence. Existing LMs such as BERT and RoBERTa are augmented by continuing to train them with this objective, which yields ReasonBERT. The query sentence and textual evidence are encoded with the LM; when tabular evidence is present, the structure-aware transformer TAPAS is used as the encoder to capture the table structure.
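A minimal sketch of how a span-reasoning training instance might be laid out: the masked query and an evidence passage are concatenated (as a BERT-style encoder would see them), and the label is the answer's token span inside the evidence. Whitespace tokenization and the `span_label` helper are illustrative assumptions; the actual models use subword tokenizers, and unanswerable cases point at [CLS] as in standard extractive QA.

```python
# Illustrative sketch, not the paper's implementation.
# Whitespace tokenization stands in for a subword tokenizer.

def span_label(query_tokens, evidence_tokens, answer_tokens):
    """Return the full [CLS] query [SEP] evidence [SEP] sequence and
    the (start, end) indices of the answer span within it."""
    seq = ["[CLS]"] + query_tokens + ["[SEP]"] + evidence_tokens + ["[SEP]"]
    n = len(answer_tokens)
    for i in range(len(seq) - n + 1):
        if seq[i:i + n] == answer_tokens:
            return seq, (i, i + n - 1)
    return seq, (0, 0)  # unanswerable: label points at [CLS]

query = "Awaken the Guardian is an album by [QUESTION] .".split()
evidence = "Fates Warning released Awaken the Guardian in 1986 .".split()
seq, (start, end) = span_label(query, evidence, "Fates Warning".split())
```

During pre-training, the model's start/end pointers over this sequence are trained to recover `(start, end)`, which is exactly the extractive-QA interface that makes the objective transfer well to downstream question answering.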

Source: ReasonBERT: Pre-trained to Reason with Distant Supervision




Task                 Papers   Share
Question Answering   1        50.00%
Semantic Parsing     1        50.00%

