no code implementations • LDL (ACL) 2022 • Fahad Khan, Christian Chiarcos, Thierry Declerck, Maria Pia di Buono, Milan Dojchinovski, Jorge Gracia, Giedre Valunaite Oleskeviciene, Daniela Gifu
This article discusses a survey carried out within the NexusLinguarum COST Action which aimed to give an overview of existing guidelines (GLs) and best practices (BPs) in linguistic linked data.
no code implementations • LDL (ACL) 2022 • Christian Fäth, Christian Chiarcos
Heterogeneity of formats, models and annotations has always been a primary hindrance for exploiting the ever increasing amount of existing linguistic resources for real world applications in and beyond NLP.
no code implementations • LDL (ACL) 2022 • Christian Chiarcos, Gilles Sérasset
In this paper, we describe the application of Linguistic Linked Open Data (LLOD) technology for dynamic cross-lingual querying on demand.
no code implementations • LREC 2022 • Christian Chiarcos, Christian Fäth, Maxim Ionov
The OntoLex vocabulary has become a widely used community standard for machine-readable lexical resources on the web.
no code implementations • LREC 2022 • Christian Chiarcos, Christian Fäth, Maxim Ionov
Large-scale diachronic corpus studies covering longer time periods are difficult if more than one corpus is to be consulted and, as a result, different formats and annotation schemas need to be processed and queried in a uniform, comparable and replicable manner.
no code implementations • LREC 2022 • Purificação Silvano, Mariana Damova, Giedrė Valūnaitė Oleškevičienė, Chaya Liebeskind, Christian Chiarcos, Dimitar Trajanov, Ciprian-Octavian Truică, Elena-Simona Apostol, Anna Baczkowska
In order to represent the meaning of the discourse markers, we propose an annotation scheme of discourse relations from ISO 24617-8 with a plug-in to ISO 24617-2 for communicative functions.
1 code implementation • LREC 2022 • Christian Chiarcos
Discourse marker inventories are important tools for the development of both discourse parsers and corpora with discourse annotations.
no code implementations • LREC 2022 • Michael Rosner, Sina Ahmadi, Elena-Simona Apostol, Julia Bosque-Gil, Christian Chiarcos, Milan Dojchinovski, Katerina Gkirtzou, Jorge Gracia, Dagmar Gromann, Chaya Liebeskind, Giedrė Valūnaitė Oleškevičienė, Gilles Sérasset, Ciprian-Octavian Truică
In this paper, we provide an overview of current technologies for cross-lingual link discovery, and we discuss challenges, experiences and prospects of their application to under-resourced languages.
no code implementations • gwll (LREC) 2022 • Christian Chiarcos, Katerina Gkirtzou, Maxim Ionov, Besim Kabashi, Fahad Khan, Ciprian-Octavian Truică
Following presentations of frequency and attestations, and embeddings and distributional similarity, this paper introduces the third cornerstone of the emerging OntoLex module for Frequency, Attestation and Corpus-based Information, OntoLex-FrAC.
no code implementations • COLING 2022 • Christian Chiarcos, Elena-Simona Apostol, Besim Kabashi, Ciprian-Octavian Truică
OntoLex-Lemon has become a de facto standard for lexical resources in the web of data.
no code implementations • LDL (ACL) 2022 • Christian Chiarcos, Katerina Gkirtzou, Fahad Khan, Penny Labropoulou, Marco Passarotti, Matteo Pellegrini
This paper describes the current status of the emerging OntoLex module for linguistic morphology.
no code implementations • COLING 2020 • Ravneet Punia, Niko Schenk, Christian Chiarcos, Émilie Pagé-Perron
The Sumerian cuneiform script was invented more than 5,000 years ago and represents one of the oldest in history.
no code implementations • LREC 2020 • Christian Chiarcos, Maxim Ionov, Jesse de Does, Katrien Depuydt, Anas Fahad Khan, Sander Stolk, Thierry Declerck, John Philip McCrae
Therefore, the OntoLex community has put forward the proposal for a novel module for frequency, attestation and corpus information (FrAC), that not only covers the requirements of digital lexicography, but also accommodates essential data structures for lexical information in natural language processing.
no code implementations • LREC 2020 • Christian Chiarcos, Niko Schenk, Christian Fäth
We describe an approach to translation inference based on symbolic methods, the propagation of concepts over a graph of interconnected dictionaries: Given a mapping from source language words to lexical concepts (e.g., synsets) as a seed, we use bilingual dictionaries to extrapolate a mapping of pivot and target language words to these lexical concepts.
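The propagation idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the tuple-based dictionary format, and the concept identifiers are all hypothetical, and the sketch propagates concepts naively along every translation link, so polysemous words will over-generate candidates that a real system would have to filter or score.

```python
from collections import defaultdict


def propagate_concepts(seed, translation_pairs):
    """Spread concept labels over a graph of bilingual dictionary entries.

    seed: dict mapping (lang, word) -> set of concept ids (hypothetical format).
    translation_pairs: iterable of (src_lang, src_word, tgt_lang, tgt_word).
    Returns a dict mapping (lang, word) -> set of concept ids.
    """
    # Treat every dictionary entry as an undirected edge between two words.
    neighbours = defaultdict(set)
    for src_lang, src_word, tgt_lang, tgt_word in translation_pairs:
        neighbours[(src_lang, src_word)].add((tgt_lang, tgt_word))
        neighbours[(tgt_lang, tgt_word)].add((src_lang, src_word))

    concepts = defaultdict(set)
    for node, ids in seed.items():
        concepts[node] |= set(ids)

    # Worklist propagation: revisit a word whenever it gains a new concept.
    frontier = list(seed)
    while frontier:
        node = frontier.pop()
        for neighbour in neighbours[node]:
            missing = concepts[node] - concepts[neighbour]
            if missing:
                concepts[neighbour] |= missing
                frontier.append(neighbour)
    return dict(concepts)


def infer_translations(concepts, src_lang, tgt_lang):
    """Pair up words of two languages that share at least one concept."""
    words_by_concept = defaultdict(lambda: {src_lang: set(), tgt_lang: set()})
    for (lang, word), ids in concepts.items():
        if lang in (src_lang, tgt_lang):
            for concept_id in ids:
                words_by_concept[concept_id][lang].add(word)

    pairs = set()
    for entry in words_by_concept.values():
        for src_word in entry[src_lang]:
            for tgt_word in entry[tgt_lang]:
                pairs.add((src_word, tgt_word))
    return pairs


# Toy data: an English seed, with Spanish acting as the pivot language.
seed = {("en", "dog"): {"concept:dog"}}
dictionaries = [
    ("en", "dog", "es", "perro"),
    ("es", "perro", "fr", "chien"),
]
concepts = propagate_concepts(seed, dictionaries)
candidates = infer_translations(concepts, "en", "fr")
```

In this toy setup no English-French dictionary exists, yet `candidates` contains the pair `("dog", "chien")` because both words end up attached to the same seed concept via the Spanish pivot.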
no code implementations • LREC 2020 • Christian Chiarcos, Bettina Klimek, Christian Fäth, Thierry Declerck, John Philip McCrae
In this paper we describe the current state of development of the Linguistic Linked Open Data (LLOD) infrastructure, an LOD (sub-)cloud of linguistic resources, which covers various linguistic databases, lexicons, corpora, terminology and metadata repositories. We give in some detail an overview of the contributions made by the European H2020 projects "Prêt-à-LLOD" ('Ready-to-use Multilingual Linked Language Data for Knowledge Services across Sectors') and "ELEXIS" ('European Lexicographic Infrastructure') to the further development of the LLOD.
no code implementations • LREC 2020 • Thierry Declerck, John Philip McCrae, Matthias Hartung, Jorge Gracia, Christian Chiarcos, Elena Montiel-Ponsoda, Philipp Cimiano, Artem Revenko, Roser Saurí, Deirdre Lee, Stefania Racioppa, Jamal Abdul Nasir, Matthias Orlikowski, Marta Lanau-Coronas, Christian Fäth, Mariano Rico, Mohammad Fazleh Elahi, Maria Khvalchik, Meritxell Gonzalez, Katharine Cooney
In this paper we describe the contributions made by the European H2020 project "Prêt-à-LLOD" ('Ready-to-use Multilingual Linked Language Data for Knowledge Services across Sectors') to the further development of the Linguistic Linked Open Data (LLOD) infrastructure.
no code implementations • LREC 2020 • Christian Chiarcos, Luis Glaser
The technological bridges between knowledge graphs and natural language processing are of utmost importance for the future development of language technology.
no code implementations • LREC 2020 • Christian Fäth, Christian Chiarcos, Björn Ebbrecht, Maxim Ionov
We introduce the Flexible and Integrated Transformation and Annotation eNgineering (Fintan) platform for converting heterogeneous linguistic resources to RDF.
no code implementations • LREC 2020 • Christian Chiarcos, Christian Fäth, Frank Abromeit
With this paper, we provide an overview of ISOCat successor solutions and annotation standardization efforts since 2010, and we describe the low-cost harmonization of post-ISOCat vocabularies by means of modular, linked ontologies: The CLARIN Concept Registry, LexInfo, Universal Parts of Speech, Universal Dependencies and UniMorph are linked with the Ontologies of Linguistic Annotation and through it with ISOCat, the GOLD ontology, the Typological Database Systems ontology and a large number of annotation schemes.
no code implementations • LREC 2020 • Christian Chiarcos, Christian Fäth, Maxim Ionov
In this paper, we report the release of the ACoLi Dictionary Graph, a large-scale collection of multilingual open source dictionaries available in two machine-readable formats, a graph representation in RDF, using the OntoLex-Lemon vocabulary, and a simple tabular data format to facilitate their use in NLP tasks, such as translation inference across dictionaries.
1 code implementation • LREC 2020 • Georg Rehm, Dimitrios Galanis, Penny Labropoulou, Stelios Piperidis, Martin Welß, Ricardo Usbeck, Joachim Köhler, Miltos Deligiannis, Katerina Gkirtzou, Johannes Fischer, Christian Chiarcos, Nils Feldhus, Julián Moreno-Schneider, Florian Kintzel, Elena Montiel, Víctor Rodríguez Doncel, John P. McCrae, David Laqua, Irina Patricia Theile, Christian Dittmar, Kalina Bontcheva, Ian Roberts, Andrejs Vasiljevs, Andis Lagzdiņš
With regard to the wider area of AI/LT platform interoperability, we concentrate on two core aspects: (1) cross-platform search and discovery of resources and services; (2) composition of cross-platform service workflows.
no code implementations • WS 2017 • Émilie Pagé-Perron, Maria Sukhareva, Ilya Khait, Christian Chiarcos
The methodology includes creation of a specialized NLP pipeline and also the use of linguistic linked open data to increase access to the results.
no code implementations • ACL 2017 • Samuel Rönnqvist, Niko Schenk, Christian Chiarcos
We introduce an attention-based Bi-LSTM for Chinese implicit discourse relations and demonstrate that modeling argument pairs as a joint sequence can outperform word order-agnostic approaches.
no code implementations • WS 2017 • Niko Schenk, Christian Chiarcos
We present a resource-lean neural recognizer for modeling coherence in commonsense stories.
no code implementations • JEPTALNRECITAL 2016 • Christian Chiarcos
Linguistic Linked Open Data (LLOD) is a technology and a movement in several disciplines working with language resources, including Natural Language Processing, general linguistics, computational lexicography and the localization industry.
no code implementations • LREC 2016 • Christian Chiarcos, Christian Fäth, Heike Renner-Westermann, Frank Abromeit, Vanya Dimitrova
This paper introduces a novel research tool for the field of linguistics: The Lin|gu|is|tik web portal provides a virtual library which offers scientific information on every linguistic subject.
no code implementations • LREC 2016 • Timo Homburg, Christian Chiarcos
We present experiments on word segmentation for Akkadian cuneiform, an ancient writing system and a language used for about 3 millennia in the ancient Near East.
no code implementations • LREC 2016 • John Philip McCrae, Christian Chiarcos, Francis Bond, Philipp Cimiano, Thierry Declerck, Gerard de Melo, Jorge Gracia, Sebastian Hellmann, Bettina Klimek, Steven Moran, Petya Osenova, Antonio Pareja-Lora, Jonathan Pool
The Open Linguistics Working Group (OWLG) brings together researchers from various fields of linguistics, natural language processing, and information technology to present and discuss principles, case studies, and best practices for representing, publishing and linking linguistic data collections.
no code implementations • LREC 2016 • Maria Sukhareva, Christian Chiarcos
In this paper, we describe experiments on the morphosyntactic annotation of historical language varieties for the example of Middle Low German (MLG), the official language of the German Hanse during the Middle Ages and a dominant language around the Baltic Sea at the time.
no code implementations • LREC 2014 • Christian Chiarcos
This paper describes the extension of the Ontologies of Linguistic Annotation (OLiA) with respect to discourse features.
no code implementations • LREC 2012 • Christian Chiarcos
This paper describes POWLA, a generic formalism to represent linguistic corpora by means of RDF and OWL/DL.
no code implementations • LREC 2012 • Christian Chiarcos
This paper announces the release of the Ontologies of Linguistic Annotation (OLiA).
no code implementations • LREC 2012 • Christian Chiarcos, Sebastian Hellmann, Sebastian Nordhoff, Steven Moran, Richard Littauer, Judith Eckle-Kohler, Iryna Gurevych, Silvana Hartmann, Michael Matuschek, Christian M. Meyer
This paper describes the Open Linguistics Working Group (OWLG) of the Open Knowledge Foundation (OKFN).