ECB+ (extension to the EventCorefBank)

Introduced by Cybulska et al. in Using a sledgehammer to crack a nut? Lexical diversity and event coreference resolution

The ECB+ corpus is an extension to the EventCorefBank (ECB, Bejan and Harabagiu, 2010). A newly added corpus component consists of 502 documents that belong to the 43 topics of the ECB but that describe different seminal events than those already captured in the ECB. All corpus texts were found through Google Search and were annotated with mentions of events and their times, locations, human and non-human participants as well as with within- and cross-document event and entity coreference information. The 2012 version of annotation of the ECB corpus (Lee et al., 2012) was used as a starting point for re-annotation of the ECB according to the ECB+ annotation guideline.

The major differences with respect to the 2012 version of annotation of the ECB are:

(a) five event components are annotated in text:

actions (annotation tags starting with ACTION and NEG)
times (annotation tags starting with TIME)
locations (annotation tags starting with LOC)
human participants (annotation tags starting with HUMAN)
non-human participants (annotation tags starting with NON_HUMAN)
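
As a minimal sketch of how the tag prefixes listed above could be used, the following maps an annotation tag name to its event component. The helper name `event_component` and the example tag names in the usage note are hypothetical illustrations, not part of the ECB+ distribution; only the prefixes come from the list above.

```python
# Prefix -> event component, following the five components listed above.
# The concrete tag names passed in by callers are assumptions for illustration.
COMPONENT_PREFIXES = [
    ("NON_HUMAN", "non-human participant"),
    ("HUMAN", "human participant"),
    ("ACTION", "action"),
    ("NEG", "action"),
    ("TIME", "time"),
    ("LOC", "location"),
]


def event_component(tag):
    """Return the event component for an annotation tag, or None if no prefix matches."""
    for prefix, component in COMPONENT_PREFIXES:
        if tag.startswith(prefix):
            return component
    return None
```

For instance, under these assumptions a tag such as `ACTION_OCCURRENCE` would be grouped under actions, while a relation tag such as `INTRA_DOC_COREF` matches no component prefix and yields `None`.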

(b) specific action classes and entity subtypes are distinguished for each of the five main event components resulting in a total tagset of 30 annotation tags based on ACE annotation guidelines (LDC 2008), TimeML (Pustejovsky et al., 2003 and Sauri et al., 2005)

(c) intra- and cross-document coreference relations between mentions of the five event components were established:

INTRA_DOC_COREF tag captures within-document coreference chains that do not participate in cross-document relations; within-document coreference was annotated by means of the CAT tool (Bartalesi et al., 2012)
CROSS_DOC_COREF tag indicates cross-document coreference relations created in the CROMER tool (Girardi et al., 2014); all coreference relations refer, by means of relation target IDs, to the so-called TAG_DESCRIPTORS, which point to human-friendly instance names (assigned by coders) and also to instance_id values
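
The relation structure described above can be sketched as follows. The XML layout in `SAMPLE` (element names `Markables`, `Relations`, `source`, `target`, and the attributes `m_id`, `r_id`, `TAG_DESCRIPTOR`, `instance_id`) is an assumption made for illustration, as are the sample values and the function name `coref_chains`; the actual ECB+ files should be checked before relying on it.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: two action mentions linked to one target markable
# that carries the human-friendly name and the instance ID.
SAMPLE = """
<Document doc_name="example.xml">
  <Markables>
    <ACTION_OCCURRENCE m_id="1"/>
    <ACTION_OCCURRENCE m_id="2"/>
    <HUMAN_PART_PER m_id="3"/>
    <ACTION_OCCURRENCE m_id="10" TAG_DESCRIPTOR="example_event" instance_id="ACT0001"/>
  </Markables>
  <Relations>
    <CROSS_DOC_COREF r_id="100">
      <source m_id="1"/>
      <source m_id="2"/>
      <target m_id="10"/>
    </CROSS_DOC_COREF>
  </Relations>
</Document>
"""


def coref_chains(xml_text):
    """Group source mention IDs by coreference relation (assumed layout)."""
    root = ET.fromstring(xml_text.strip())
    # Index every markable by its mention ID so relation targets can be resolved.
    markables = {m.get("m_id"): m for m in root.find("Markables")}
    chains = {}
    for rel in root.find("Relations"):
        if rel.tag not in ("INTRA_DOC_COREF", "CROSS_DOC_COREF"):
            continue
        target = rel.find("target")
        descriptor = None
        if target is not None:
            descriptor = markables.get(target.get("m_id"))
        chains[rel.get("r_id")] = {
            "kind": rel.tag,
            "mentions": [s.get("m_id") for s in rel.findall("source")],
            "name": descriptor.get("TAG_DESCRIPTOR") if descriptor is not None else None,
            "instance_id": descriptor.get("instance_id") if descriptor is not None else None,
        }
    return chains
```

Calling `coref_chains(SAMPLE)` groups mentions 1 and 2 under relation 100 and resolves the target to its descriptor. In a cross-document setting, chains from different files could then be merged on the shared `instance_id`, assuming that identifier is stable across documents.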

(d) events are annotated from an “event-centric” perspective, i.e. annotation tags are assigned depending on the role a mention plays in an event (for more information see ECB+ references).

