An Automated Framework for the Extraction of Semantic Legal Metadata from Legal Texts

30 Jan 2020 · Amin Sleimi, Nicolas Sannier, Mehrdad Sabetzadeh, Lionel Briand, Marcello Ceci, John Dann

Semantic legal metadata provides information that helps with understanding and interpreting legal provisions. Such metadata is therefore important for the systematic analysis of legal requirements. However, manually enhancing a large legal corpus with semantic metadata is prohibitively expensive. Our work is motivated by two observations: (1) the existing requirements engineering (RE) literature does not provide a harmonized view of the semantic metadata types that are useful for legal requirements analysis; (2) automated support for the extraction of semantic legal metadata is scarce, and it does not exploit the full potential of artificial intelligence technologies, notably natural language processing (NLP) and machine learning (ML). Our objective is to take steps toward overcoming these limitations. To do so, we review and reconcile the semantic legal metadata types proposed in the RE literature. Subsequently, we devise an automated extraction approach for the identified metadata types using NLP and ML. We evaluate our approach through two case studies over the Luxembourgish legislation. Our results indicate high accuracy in the generation of metadata annotations. In particular, in the two case studies, we obtained precision scores of 97.2% and 82.4% and recall scores of 94.9% and 92.4%.

Categories: Software Engineering
