Search Results for author: Christian Chiarcos

Found 50 papers, 4 papers with code

Querying a Dozen Corpora and a Thousand Years with Fintan

no code implementations LREC 2022 Christian Chiarcos, Christian Fäth, Maxim Ionov

Large-scale diachronic corpus studies covering longer time periods are difficult if more than one corpus is to be consulted and, as a result, different formats and annotation schemas need to be processed and queried in a uniform, comparable and replicable manner.

Modelling Collocations in OntoLex-FrAC

no code implementations gwll (LREC) 2022 Christian Chiarcos, Katerina Gkirtzou, Maxim Ionov, Besim Kabashi, Fahad Khan, Ciprian-Octavian Truică

Following presentations of frequency and attestations, and embeddings and distributional similarity, this paper introduces the third cornerstone of the emerging OntoLex module for Frequency, Attestation and Corpus-based Information, OntoLex-FrAC.

ISO-based Annotated Multilingual Parallel Corpus for Discourse Markers

no code implementations LREC 2022 Purificação Silvano, Mariana Damova, Giedrė Valūnaitė Oleškevičienė, Chaya Liebeskind, Christian Chiarcos, Dimitar Trajanov, Ciprian-Octavian Truică, Elena-Simona Apostol, Anna Baczkowska

In order to represent the meaning of the discourse markers, we propose an annotation scheme of discourse relations from ISO 24617-8 with a plug-in to ISO 24617-2 for communicative functions.

Unifying Morphology Resources with OntoLex-Morph. A Case Study in German

no code implementations LREC 2022 Christian Chiarcos, Christian Fäth, Maxim Ionov

The OntoLex vocabulary has become a widely used community standard for machine-readable lexical resources on the web.

MORPH

Inducing Discourse Marker Inventories from Lexical Knowledge Graphs

1 code implementation LREC 2022 Christian Chiarcos

Discourse marker inventories are important tools for the development of both discourse parsers and corpora with discourse annotations.

Knowledge Graphs

Spicy Salmon: Converting between 50+ Annotation Formats with Fintan, Pepper, Salt and Powla

no code implementations LDL (ACL) 2022 Christian Fäth, Christian Chiarcos

Heterogeneity of formats, models and annotations has always been a primary hindrance for exploiting the ever increasing amount of existing linguistic resources for real world applications in and beyond NLP.

A Cheap and Dirty Cross-Lingual Linking Service in the Cloud

no code implementations LDL (ACL) 2022 Christian Chiarcos, Gilles Sérasset

In this paper, we describe the application of Linguistic Linked Open Data (LLOD) technology for dynamic cross-lingual querying on demand.

Knowledge Graphs

Cross-Lingual Link Discovery for Under-Resourced Languages

no code implementations LREC 2022 Michael Rosner, Sina Ahmadi, Elena-Simona Apostol, Julia Bosque-Gil, Christian Chiarcos, Milan Dojchinovski, Katerina Gkirtzou, Jorge Gracia, Dagmar Gromann, Chaya Liebeskind, Giedrė Valūnaitė Oleškevičienė, Gilles Sérasset, Ciprian-Octavian Truică

In this paper, we provide an overview of current technologies for cross-lingual link discovery, and we discuss challenges, experiences and prospects of their application to under-resourced languages.

A Survey of Guidelines and Best Practices for the Generation, Interlinking, Publication, and Validation of Linguistic Linked Data

no code implementations LDL (ACL) 2022 Fahad Khan, Christian Chiarcos, Thierry Declerck, Maria Pia di Buono, Milan Dojchinovski, Jorge Gracia, Giedrė Valūnaitė Oleškevičienė, Daniela Gifu

This article discusses a survey carried out within the NexusLinguarum COST Action which aimed to give an overview of existing guidelines (GLs) and best practices (BPs) in linguistic linked data.

On the Linguistic Linked Open Data Infrastructure

no code implementations LREC 2020 Christian Chiarcos, Bettina Klimek, Christian Fäth, Thierry Declerck, John Philip McCrae

In this paper we describe the current state of development of the Linguistic Linked Open Data (LLOD) infrastructure, an LOD (sub-)cloud of linguistic resources, which covers various linguistic databases, lexicons, corpora, terminology and metadata repositories. We give an overview, in some detail, of the contributions made by the European H2020 projects "Prêt-à-LLOD" ('Ready-to-use Multilingual Linked Language Data for Knowledge Services across Sectors') and "ELEXIS" ('European Lexicographic Infrastructure') to the further development of the LLOD.

The ACoLi Dictionary Graph

no code implementations LREC 2020 Christian Chiarcos, Christian Fäth, Maxim Ionov

In this paper, we report the release of the ACoLi Dictionary Graph, a large-scale collection of multilingual open source dictionaries available in two machine-readable formats, a graph representation in RDF, using the OntoLex-Lemon vocabulary, and a simple tabular data format to facilitate their use in NLP tasks, such as translation inference across dictionaries.

Translation

Modelling Frequency and Attestations for OntoLex-Lemon

no code implementations LREC 2020 Christian Chiarcos, Maxim Ionov, Jesse de Does, Katrien Depuydt, Anas Fahad Khan, Sander Stolk, Thierry Declerck, John Philip McCrae

Therefore, the OntoLex community has put forward the proposal for a novel module for frequency, attestation and corpus information (FrAC), that not only covers the requirements of digital lexicography, but also accommodates essential data structures for lexical information in natural language processing.

Translation Inference by Concept Propagation

no code implementations LREC 2020 Christian Chiarcos, Niko Schenk, Christian Fäth

We describe an approach to translation inference based on symbolic methods, the propagation of concepts over a graph of interconnected dictionaries: Given a mapping from source language words to lexical concepts (e.g., synsets) as a seed, we use bilingual dictionaries to extrapolate a mapping of pivot and target language words to these lexical concepts.

Translation

Recent Developments for the Linguistic Linked Open Data Infrastructure

no code implementations LREC 2020 Thierry Declerck, John Philip McCrae, Matthias Hartung, Jorge Gracia, Christian Chiarcos, Elena Montiel-Ponsoda, Philipp Cimiano, Artem Revenko, Roser Saurí, Deirdre Lee, Stefania Racioppa, Jamal Abdul Nasir, Matthias Orlikowsk, Marta Lanau-Coronas, Christian Fäth, Mariano Rico, Mohammad Fazleh Elahi, Maria Khvalchik, Meritxell Gonzalez, Katharine Cooney

In this paper we describe the contributions made by the European H2020 project "Prêt-à-LLOD" ('Ready-to-use Multilingual Linked Language Data for Knowledge Services across Sectors') to the further development of the Linguistic Linked Open Data (LLOD) infrastructure.

Fintan - Flexible, Integrated Transformation and Annotation eNgineering

no code implementations LREC 2020 Christian Fäth, Christian Chiarcos, Björn Ebbrecht, Maxim Ionov

We introduce the Flexible and Integrated Transformation and Annotation eNgineering (Fintan) platform for converting heterogeneous linguistic resources to RDF.

Entity Linking Management

A Tree Extension for CoNLL-RDF

no code implementations LREC 2020 Christian Chiarcos, Luis Glaser

The technological bridges between knowledge graphs and natural language processing are of utmost importance for the future development of language technology.

Knowledge Graphs Sentence

Annotation Interoperability for the Post-ISOCat Era

no code implementations LREC 2020 Christian Chiarcos, Christian Fäth, Frank Abromeit

With this paper, we provide an overview of ISOCat successor solutions and annotation standardization efforts since 2010, and we describe the low-cost harmonization of post-ISOCat vocabularies by means of modular, linked ontologies: The CLARIN Concept Registry, LexInfo, Universal Parts of Speech, Universal Dependencies and UniMorph are linked with the Ontologies of Linguistic Annotation and through it with ISOCat, the GOLD ontology, the Typological Database Systems ontology and a large number of annotation schemes.

Machine Translation and Automated Analysis of the Sumerian Language

no code implementations WS 2017 Émilie Pagé-Perron, Maria Sukhareva, Ilya Khait, Christian Chiarcos

The methodology includes creation of a specialized NLP pipeline and also the use of linguistic linked open data to increase access to the results.

Information Retrieval Machine Translation +2

A Recurrent Neural Model with Attention for the Recognition of Chinese Implicit Discourse Relations

no code implementations ACL 2017 Samuel Rönnqvist, Niko Schenk, Christian Chiarcos

We introduce an attention-based Bi-LSTM for Chinese implicit discourse relations and demonstrate that modeling argument pairs as a joint sequence can outperform word order-agnostic approaches.

Corpora and Linguistic Linked Open Data: Motivations, Applications, Limitations

no code implementations JEPTALNRECITAL 2016 Christian Chiarcos

Linguistic Linked Open Data (LLOD) is a technology and a movement in several disciplines working with language resources, including Natural Language Processing, general linguistics, computational lexicography and the localization industry.

The Open Linguistics Working Group: Developing the Linguistic Linked Open Data Cloud

no code implementations LREC 2016 John Philip McCrae, Christian Chiarcos, Francis Bond, Philipp Cimiano, Thierry Declerck, Gerard de Melo, Jorge Gracia, Sebastian Hellmann, Bettina Klimek, Steven Moran, Petya Osenova, Antonio Pareja-Lora, Jonathan Pool

The Open Linguistics Working Group (OWLG) brings together researchers from various fields of linguistics, natural language processing, and information technology to present and discuss principles, case studies, and best practices for representing, publishing and linking linguistic data collections.

Word Segmentation for Akkadian Cuneiform

no code implementations LREC 2016 Timo Homburg, Christian Chiarcos

We present experiments on word segmentation for Akkadian cuneiform, an ancient writing system and a language used for about 3 millennia in the ancient Near East.

Segmentation

Lin|gu|is|tik: Building the Linguist's Pathway to Bibliographies, Libraries, Language Resources and Linked Open Data

no code implementations LREC 2016 Christian Chiarcos, Christian Fäth, Heike Renner-Westermann, Frank Abromeit, Vanya Dimitrova

This paper introduces a novel research tool for the field of linguistics: The Lin|gu|is|tik web portal provides a virtual library which offers scientific information on every linguistic subject.

Combining Ontologies and Neural Networks for Analyzing Historical Language Varieties. A Case Study in Middle Low German

no code implementations LREC 2016 Maria Sukhareva, Christian Chiarcos

In this paper, we describe experiments on the morphosyntactic annotation of historical language varieties, using the example of Middle Low German (MLG), the official language of the German Hanse during the Middle Ages and a dominant language around the Baltic Sea at the time.

POS

Towards interoperable discourse annotation. Discourse features in the Ontologies of Linguistic Annotation

no code implementations LREC 2014 Christian Chiarcos

This paper describes the extension of the Ontologies of Linguistic Annotation (OLiA) with respect to discourse features.

A generic formalism to represent linguistic corpora in RDF and OWL/DL

no code implementations LREC 2012 Christian Chiarcos

This paper describes POWLA, a generic formalism to represent linguistic corpora by means of RDF and OWL/DL.

Ontologies of Linguistic Annotation: Survey and perspectives

no code implementations LREC 2012 Christian Chiarcos

This paper announces the release of the Ontologies of Linguistic Annotation (OLiA).
