PhysNLU: A Language Resource for Evaluating Natural Language Understanding and Explanation Coherence in Physics

LREC 2022  ·  Jordan Meadows, Zili Zhou, Andre Freitas ·

In order for language models to aid physics research, they must first encode representations of mathematical and natural language discourse which lead to coherent explanations, with correct ordering and relevance of statements. We present a collection of datasets developed to evaluate the performance of language models in this regard, which measure capabilities with respect to sentence ordering, position, section prediction, and discourse coherence. Analysis of the data reveals equations and sub-disciplines which are most common in physics discourse, as well as the sentence-level frequency of equations and expressions. We present baselines that demonstrate how contemporary language models are challenged by coherence related tasks in physics, even when trained on mathematical natural language objectives.

PDF Abstract LREC 2022 PDF LREC 2022 Abstract

Datasets


Introduced in the Paper:

PhysNLU

Used in the Paper:

WikiText-2 WikiText-103

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here