Co-training an Unsupervised Constituency Parser with Weak Supervision

Findings (ACL) 2022 · Nickil Maveli, Shay B. Cohen

We introduce a method for unsupervised parsing that relies on bootstrapping classifiers to identify whether a node dominates a specific span in a sentence. There are two types of classifiers: an inside classifier that acts on a span, and an outside classifier that acts on everything outside a given span. Through self-training and co-training with the two classifiers, we show that the interplay between them helps improve the accuracy of both and, as a result, yields an effective parser. A seed bootstrapping technique prepares the data to train these classifiers. Our analyses further validate that such an approach, in conjunction with weak supervision using prior branching knowledge of a known language (left/right-branching) and minimal heuristics, injects a strong inductive bias into the parser, achieving 63.1 F$_1$ on the English (PTB) test set. In addition, we show the effectiveness of our architecture by evaluating on treebanks for Chinese (CTB) and Japanese (KTB), achieving new state-of-the-art results. Our code and pre-trained models are available at https://github.com/Nickil21/weakly-supervised-parsing.
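
Below is a minimal, hypothetical sketch of the inside/outside co-training loop described above, assuming scikit-learn-style span classifiers exposing `fit`/`predict_proba`. The `Span` helper, the `<mask>` placeholder for the hidden span, the confidence threshold, and the number of rounds are illustrative assumptions, not the paper's actual implementation (see the linked repository for that).

```python
from dataclasses import dataclass

@dataclass
class Span:
    sentence: list  # sentence tokens (hypothetical representation)
    i: int          # span start (inclusive)
    j: int          # span end (exclusive)

    def inside_text(self):
        # the span itself: the view seen by the inside classifier
        return " ".join(self.sentence[self.i:self.j])

    def outside_text(self):
        # everything outside the span: the view seen by the outside classifier
        return " ".join(self.sentence[:self.i] + ["<mask>"] + self.sentence[self.j:])


def co_train(inside_clf, outside_clf, seed_spans, seed_labels,
             unlabeled_spans, rounds=3, threshold=0.9):
    """Grow a shared pool of constituent/distituent labels by letting each view
    pseudo-label the spans it is most confident about, then retraining both
    classifiers on the enlarged pool (illustrative schedule, not the paper's)."""
    pool_spans, pool_labels = list(seed_spans), list(seed_labels)
    for _ in range(rounds):
        inside_clf.fit([s.inside_text() for s in pool_spans], pool_labels)
        outside_clf.fit([s.outside_text() for s in pool_spans], pool_labels)

        remaining = []
        for s in unlabeled_spans:
            p_in = inside_clf.predict_proba([s.inside_text()])[0]
            p_out = outside_clf.predict_proba([s.outside_text()])[0]
            # accept a pseudo-label only if at least one view is confident
            if max(p_in) >= threshold or max(p_out) >= threshold:
                probs = p_in if max(p_in) >= max(p_out) else p_out
                pool_spans.append(s)
                pool_labels.append(int(probs.argmax()))
            else:
                remaining.append(s)
        unlabeled_spans = remaining
    return inside_clf, outside_clf


# Example wiring with generic scikit-learn text classifiers (illustrative only):
# from sklearn.pipeline import make_pipeline
# from sklearn.feature_extraction.text import CountVectorizer
# from sklearn.linear_model import LogisticRegression
# inside_clf = make_pipeline(CountVectorizer(), LogisticRegression())
# outside_clf = make_pipeline(CountVectorizer(), LogisticRegression())
```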

Results from the Paper


| Task | Dataset | Model | Metric | Value | Global Rank |
|---|---|---|---|---|---|
| Constituency Grammar Induction | Penn Treebank (PTB) | inside-outside co-training + weak supervision | Max F1 (WSJ) | 66.8 | #2 |
| Constituency Grammar Induction | Penn Treebank (PTB) | inside-outside co-training + weak supervision | Mean F1 (WSJ10) | 74.2 | #1 |
| Constituency Grammar Induction | Penn Treebank (PTB) | inside-outside co-training + weak supervision | Mean F1 (WSJ) | 63.1 | #5 |
