FrameNet is a linguistic knowledge graph containing information about lexical and predicate argument semantics of the English language. FrameNet contains two distinct entity classes: frames and lexical units, where a frame is a meaning and a lexical unit is a single meaning for a word.
416 PAPERS • NO BENCHMARKS YET
The CoNLL-2012 shared task involved predicting coreference in English, Chinese, and Arabic, using the final version, v5.0, of the OntoNotes corpus. It was a follow-on to the English-only task organized in 2011.
83 PAPERS • 4 BENCHMARKS
OntoNotes 5.0 is a large corpus comprising various genres of text (news, conversational telephone speech, weblogs, usenet newsgroups, broadcast, talk shows) in three languages (English, Chinese, and Arabic) with structural information (syntax and predicate argument structure) and shallow semantics (word sense linked to an ontology and coreference).
61 PAPERS • 11 BENCHMARKS
NomBank is an annotation project at New York University that is related to the PropBank project at the University of Colorado. The goal is to mark the sets of arguments that cooccur with nouns in the PropBank Corpus (the Wall Street Journal Corpus of the Penn Treebank), just as PropBank records such information for verbs. As a side effect of the annotation process, the authors are producing a number of other resources including various dictionaries, as well as PropBank style lexical entries called frame files. These resources help the user label the various arguments and adjuncts of the head nouns with roles (sets of argument labels for each sense of each noun). NYU and U of Colorado are making a coordinated effort to insure that, when possible, role definitions are consistent across parts of speech. For example, PropBank's frame file for the verb "decide" was used in the annotation of the noun "decision".
47 PAPERS • NO BENCHMARKS YET
QA-SRL was proposed as an open schema for semantic roles, in which the relation between an argument and a predicate is expressed as a natural-language question containing the predicate (“Where was someone educated?”) whose answer is the argument (“Princeton”). The authors collected about 19,000 question-answer pairs from 3,200 sentences.
36 PAPERS • NO BENCHMARKS YET
The task builds on the CoNLL-2008 task and extends it to multiple languages. The core of the task is to predict syntactic and semantic dependencies and their labeling. Data is provided for both statistical training and evaluation, which extract these labeled dependencies from manually annotated treebanks such as the Penn Treebank for English, the Prague Dependency Treebank for Czech and similar treebanks for Catalan, Chinese, German, Japanese and Spanish languages, enriched with semantic relations (such as those captured in the Prop/Nombank and similar resources). Great effort has been devoted to provide the participants with a common and relatively simple data representation for all the languages, similar to the last year's English data.
8 PAPERS • 3 BENCHMARKS
QA-SRL Bank 2.0 is a large-scale corpus of Question-Answer driven Semantic Role Labeling (QA-SRL) annotations. The corpus consists of over 250,000 question-answer pairs for over 64,000 sentences across 3 domains and was gathered with a new crowd-sourcing scheme that was shown to have high precision and good recall at modest cost.
5 PAPERS • NO BENCHMARKS YET
SRL is the task of extracting semantic predicate-argument structures from sentences. X-SRL is a multilingual parallel Semantic Role Labelling (SRL) corpus for English (EN), German (DE), French (FR) and Spanish (ES) that is based on English gold annotations and shares the same labelling scheme across languages.
4 PAPERS • NO BENCHMARKS YET