N-ary Constituent Tree Parsing with Recursive Semi-Markov Model

ACL 2021 · Xin Xin, Jinlong Li, Zeqi Tan

In this paper, we study graph-based constituent parsing in the setting where binarization is not applied as a pre-processing step, so a constituent tree may contain nodes with more than two children. Previous graph-based methods for this setting typically generate hidden nodes with a dummy label inside the n-ary nodes in order to transform the tree into a binary tree for prediction. The limitation is that the hidden nodes break the sibling relations among the n-ary node’s children. Consequently, the dependencies between such sibling constituents may not be modeled accurately and are effectively ignored. To address this limitation, we propose a novel graph-based framework called the “recursive semi-Markov model”. The main idea is to use a 1-order semi-Markov model to predict the sequence of immediate children of a constituent candidate, which in turn recursively serves as a child candidate of its parent. In this manner, the dependencies between sibling constituents can be described by 1-order transition features, which resolves the above limitation. In experiments, the proposed framework obtains F1 scores of 95.92% and 92.50% on PTB and CTB 5.1, respectively. In particular, the recursive semi-Markov model shows advantages in modeling nodes with more than two children, whose average F1 improves by 0.3-1.1 points on PTB and 2.3-6.8 points on CTB 5.1.
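The recursive idea in the abstract can be illustrated with a small dynamic program. The following is only a minimal sketch under simplifying assumptions, not the authors' implementation: constituent labels are omitted, the scoring functions `span_score` and `trans_score` are hypothetical stand-ins for learned scorers, and the toy scores are invented purely for illustration. For each span, a semi-Markov Viterbi pass picks the best sequence of at least two immediate children, where adjacent siblings contribute a 1-order transition score and each child is itself the best subtree over its own span.

```python
from functools import lru_cache


def parse(n, span_score, trans_score):
    """Return (score, tree) of the best n-ary tree over the word span [0, n).

    span_score(i, j) and trans_score(left_span, right_span) are hypothetical
    scorers (assumptions for this sketch). A tree is a nested tuple
    (i, j, children).
    """

    @lru_cache(maxsize=None)
    def best(i, j):
        # A single word is a leaf with no children sequence to predict.
        if j - i == 1:
            return span_score(i, j), (i, j, ())

        # alpha[(s, e)]: best (score, children) of a child sequence that
        # covers [i, e) and whose last child is the span [s, e).
        alpha = {}
        for e in range(i + 1, j + 1):
            for s in range(i, e):
                if (s, e) == (i, j):
                    continue  # a child must be a proper sub-span
                child_score, child_tree = best(s, e)  # recursive step
                if s == i:
                    # First child in the sequence: no left sibling.
                    alpha[(s, e)] = (child_score, (child_tree,))
                    continue
                candidates = []
                for p in range(i, s):
                    prev_score, prev_children = alpha[(p, s)]
                    # 1-order transition between adjacent siblings.
                    total = prev_score + trans_score((p, s), (s, e)) + child_score
                    candidates.append((total, prev_children + (child_tree,)))
                alpha[(s, e)] = max(candidates, key=lambda c: c[0])

        # The sequence must end at j; starting the last child after i
        # guarantees at least two children.
        seq_score, children = max(
            (alpha[(s, j)] for s in range(i + 1, j)), key=lambda c: c[0]
        )
        return span_score(i, j) + seq_score, (i, j, children)

    return best(0, n)


# Toy scores (invented): intermediate multi-word spans are penalised and
# every sibling transition is rewarded, so a flat ternary node wins.
def toy_span_score(i, j):
    return 0.0 if (j - i == 1 or (i, j) == (0, 3)) else -1.0


def toy_trans_score(left, right):
    return 1.0


score, tree = parse(3, toy_span_score, toy_trans_score)
print(score, tree)
```

With these toy scores the root keeps all three words as direct siblings instead of introducing an intermediate node, which is the kind of sibling dependency that a dummy-label binarization would split across hidden nodes.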


Results from the Paper


Task                  Dataset        Model                            Metric Name  Metric Value  Global Rank
Constituency Parsing  CTB5           N-ary semi-markov + BERT         F1 score     92.50         #3
Constituency Parsing  Penn Treebank  N-ary semi-markov + BERT-large   F1 score     95.92         #8
