Span-based discontinuous constituency parsing: a family of exact chart-based algorithms with time complexities from O(n\^6) down to O(n\^3)

EMNLP 2020  ·  Caio Corro ·

We introduce a novel chart-based algorithm for span-based parsing of discontinuous constituency trees of block degree two, including ill-nested structures. In particular, we show that we can build variants of our parser with smaller search spaces and time complexities ranging from O(n{\^{}}6) down to O(n{\^{}}3). The cubic time variant covers 98{\%} of constituents observed in linguistic treebanks while having the same complexity as continuous constituency parsers. We evaluate our approach on German and English treebanks (Negra, Tiger, and DPTB) and report state-of-the-art results in the fully supervised setting. We also experiment with pre-trained word embeddings and Bert-based neural networks.

PDF Abstract

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here