Annotation of specialized corpora using a comprehensive entity and relation scheme

Annotated corpora are essential resources for many applications in Natural Language Processing. They provide insight on the linguistic and semantic characteristics of the genre and domain covered, and can be used for the training and evaluation of automatic tools. In the biomedical domain, annotated corpora of English texts have become available for several genres and subfields. However, very few similar resources are available for languages other than English. In this paper we present an effort to produce a high-quality corpus of clinical documents in French, annotated with a comprehensive scheme of entities and relations. We present the annotation scheme as well as the results of a pilot annotation study covering 35 clinical documents in a variety of subfields and genres. We show that high inter-annotator agreement can be achieved using a complex annotation scheme.

PDF Abstract
No code implementations yet. Submit your code now

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here