Compound Probabilistic Context-Free Grammars for Grammar Induction

ACL 2019  ·  Yoon Kim, Chris Dyer, Alexander M. Rush

We study a formalization of the grammar induction problem that models sentences as being generated by a compound probabilistic context-free grammar. In contrast to traditional formulations which learn a single stochastic grammar, our grammar's rule probabilities are modulated by a per-sentence continuous latent variable, which induces marginal dependencies beyond the traditional context-free assumptions. Inference in this grammar is performed by collapsed variational inference, in which an amortized variational posterior is placed on the continuous variable, and the latent trees are marginalized out with dynamic programming. Experiments on English and Chinese show the effectiveness of our approach compared to recent state-of-the-art methods when evaluated on unsupervised parsing.
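To make the abstract concrete, below is a minimal sketch (our own illustration, not the authors' released code) of the two pieces it describes: a per-sentence continuous latent variable z that modulates the rule probabilities of a PCFG in Chomsky normal form, and exact marginalization of the latent parse trees with the inside dynamic program. The sizes, names, and the linear parameterization of the grammar are all assumptions made to keep the sketch short; the paper uses neural parameterizations.

```python
import numpy as np

rng = np.random.default_rng(0)
NT, T, V = 3, 2, 4   # nonterminals, preterminals, vocabulary size (toy values)
Z_DIM = 5            # dimensionality of the per-sentence latent z

# Toy "grammar network": rule logits are a linear function of z.
W_binary = rng.normal(size=(NT, NT + T, NT + T, Z_DIM))  # A -> B C
W_term = rng.normal(size=(T, V, Z_DIM))                  # preterminal -> word
W_root = rng.normal(size=(NT, Z_DIM))                    # S -> A

def softmax(x, axis):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def rule_probs(z):
    """Compound step: rule probabilities conditioned on a sampled z."""
    binary = softmax((W_binary @ z).reshape(NT, -1), axis=1).reshape(NT, NT + T, NT + T)
    term = softmax(W_term @ z, axis=1)   # (T, V)
    root = softmax(W_root @ z, axis=0)   # (NT,)
    return root, binary, term

def log_marginal(sentence, z):
    """Inside algorithm: sums over all binary trees for log p(sentence | z)."""
    root, binary, term = rule_probs(z)
    n = len(sentence)
    # chart[i, j, A] = p(symbol A derives words i..j); symbols 0..NT-1 are
    # nonterminals, NT..NT+T-1 are preterminals (which only emit single words).
    chart = np.zeros((n, n, NT + T))
    for i, w in enumerate(sentence):
        chart[i, i, NT:] = term[:, w]
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span - 1
            for k in range(i, j):  # split point between left and right child
                left, right = chart[i, k], chart[k + 1, j]
                chart[i, j, :NT] += np.einsum('abc,b,c->a', binary, left, right)
    return np.log(root @ chart[0, n - 1, :NT])

z = rng.normal(size=Z_DIM)
print(log_marginal([0, 2, 1, 3], z))  # log p(sentence | z) for a toy sentence
```

In the full model this quantity is averaged over samples of z from an amortized variational posterior (the collapsed variational inference the abstract describes); the sketch above only shows the exact tree marginalization for a fixed z.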

Task: Constituency Grammar Induction  ·  Dataset: Penn Treebank (WSJ)

Model           Metric           Value   Global Rank
Neural PCFG     Max F1 (WSJ)     52.6    #7
Neural PCFG     Mean F1 (WSJ)    50.8    #11
Compound PCFG   Max F1 (WSJ)     60.1    #5
Compound PCFG   Max F1 (WSJ10)   68.8    #2
Compound PCFG   Mean F1 (WSJ)    55.2    #10
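The F1 numbers in the table are unlabeled span F1 between induced and gold constituents. A minimal sketch of this metric follows (our own illustration; the exact evaluation protocol, e.g. punctuation handling and whether trivial whole-sentence spans count, may differ from the paper's):

```python
def sentence_f1(pred_spans, gold_spans):
    """Unlabeled F1 between two sets of (i, j) constituent spans."""
    pred, gold = set(pred_spans), set(gold_spans)
    if not pred and not gold:
        return 1.0
    if not pred or not gold:
        return 0.0
    overlap = len(pred & gold)
    if overlap == 0:
        return 0.0
    prec = overlap / len(pred)
    rec = overlap / len(gold)
    return 2 * prec * rec / (prec + rec)

def corpus_f1(all_pred, all_gold):
    """Average per-sentence F1 over a corpus of parses."""
    return sum(sentence_f1(p, g) for p, g in zip(all_pred, all_gold)) / len(all_gold)

# One predicted and one gold parse, each as a set of word-index spans:
print(sentence_f1({(0, 3), (0, 1)}, {(0, 3), (2, 3)}))  # one span of two matches
```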
