Preserving Semantic Information from Old Dictionaries: Linking Senses of the 'Altfranzösisches Wörterbuch' to WordNet

LREC 2020 · Achim Stein

Historical dictionaries of the pre-digital period are important resources for the study of older languages. Taking the example of the 'Altfranzösisches Wörterbuch', an Old French dictionary published from 1925 onwards, this contribution shows how the printed dictionaries can be turned into a more easily accessible and more sustainable lexical database, even though a full-text retro-conversion is too costly. Over 57,000 German sense definitions were identified in uncorrected OCR output. For verbs and nouns, 34,000 senses of more than 20,000 lemmas were matched with GermaNet, a semantic network for German, and, in a second step, linked to synsets of the English WordNet. These results are relevant for the automatic processing of Old French, for the annotation and exploitation of Old French text corpora, and for the philological study of Old French in general.
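
The two-step linking described in the abstract (German sense definition to GermaNet, then GermaNet to WordNet) can be pictured with a minimal Python sketch. This is not the author's implementation: the lookup tables, the identifiers in them, the sample entries, and the function names `normalize` and `link_sense` are all hypothetical, and a real pipeline would query the actual GermaNet and WordNet resources and handle OCR noise far more carefully.

```python
from typing import Dict, List, Tuple

# Hypothetical toy tables standing in for the real resources.
# Step 1 table: normalized German word -> GermaNet-style unit ids (invented).
GERMANET_TOY: Dict[str, List[str]] = {
    "pferd": ["gn-pferd-n"],
    "laufen": ["gn-laufen-v"],
}
# Step 2 table: GermaNet-style unit id -> WordNet-style synset id (invented).
GN_TO_WN_TOY: Dict[str, str] = {
    "gn-pferd-n": "wn-horse-n",
    "gn-laufen-v": "wn-run-v",
}


def normalize(definition: str) -> str:
    """Strip surrounding punctuation and whitespace, then lowercase."""
    return definition.strip(" .,;:()").lower()


def link_sense(definition: str) -> List[str]:
    """Map one German sense definition to WordNet-style ids in two steps."""
    key = normalize(definition)
    gn_ids = GERMANET_TOY.get(key, [])  # step 1: match against GermaNet
    # Step 2: follow the cross-lingual link; drop units without one.
    return [GN_TO_WN_TOY[g] for g in gn_ids if g in GN_TO_WN_TOY]


# Illustrative (lemma, definition) pairs, not taken from the dictionary.
entries: List[Tuple[str, str]] = [("cheval", "Pferd."), ("corre", "laufen")]
linked = {lemma: link_sense(definition) for lemma, definition in entries}
```

Definitions that find no match simply yield an empty list, which mirrors the fact that only part of the identified senses could be linked.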
