Search Results for author: Ralf Steinberger

Found 20 papers, 0 papers with code

JRC TMA-CC: Slavic Named Entity Recognition and Linking. Participation in the BSNLP-2019 shared task

no code implementations WS 2019 Guillaume Jacquet, Jakub Piskorski, Hristo Tanev, Ralf Steinberger

We report on the participation of the JRC Text Mining and Analysis Competence Centre (TMA-CC) in the BSNLP-2019 Shared Task, which focuses on named-entity recognition, lemmatisation and cross-lingual linking.

named-entity-recognition Named Entity Recognition +1

Large-scale news entity sentiment analysis

no code implementations RANLP 2017 Ralf Steinberger, Stefanie Hegele, Hristo Tanev, Leonida della Rocca

We work on detecting positive or negative sentiment towards named entities in very large volumes of news articles.

Bias Detection Negation +2

Multi-word Entity Classification in a Highly Multilingual Environment

no code implementations WS 2017 Sophie Chesney, Guillaume Jacquet, Ralf Steinberger, Jakub Piskorski

This paper describes an approach for the classification of millions of existing multi-word entities (MWEntities), such as organisation or event names, into thirteen category types, based only on the tokens they contain.

Classification General Classification +1

Observing Trends in Automated Multilingual Media Analysis

no code implementations 8 Mar 2016 Ralf Steinberger, Aldo Podavini, Alexandra Balahur, Guillaume Jacquet, Hristo Tanev, Jens Linge, Martin Atkinson, Michele Chinosi, Vanni Zavarella, Yaniv Steiner, Erik van der Goot

Any large organisation, be it public or private, monitors the media for information to keep abreast of developments in their field of interest, and usually also to become aware of positive or negative opinions expressed towards them.

Experiments to Improve Named Entity Recognition on Turkish Tweets

no code implementations WS 2014 Dilek Küçük, Ralf Steinberger

In these experiments, starting with a baseline named entity recognition system, we adapt its recognition rules and resources to better fit Twitter language by relaxing its capitalization constraint and by diacritics-based expansion of its lexical resources, and we employ a simplistic normalization scheme on tweets to observe the effects of these on the overall named entity recognition performance on Turkish tweets.

named-entity-recognition Named Entity Recognition +2

Named Entity Recognition on Turkish Tweets

no code implementations LREC 2014 Dilek Küçük, Guillaume Jacquet, Ralf Steinberger

Various recent studies show that the performance of named entity recognition (NER) systems developed for well-formed text types drops significantly when applied to tweets.

named-entity-recognition Named Entity Recognition +1

DCEP - Digital Corpus of the European Parliament

no code implementations LREC 2014 Najeh Hajlaoui, David Kolovratnik, Jaakko Väyrynen, Ralf Steinberger, Daniel Varga

We are presenting a new highly multilingual document-aligned parallel corpus called DCEP - Digital Corpus of the European Parliament.

Machine Translation Sentence +1

Media monitoring and information extraction for the highly inflected agglutinative language Hungarian

no code implementations LREC 2014 Júlia Pajzs, Ralf Steinberger, Maud Ehrmann, Mohamed Ebrahim, Leonida della Rocca, Stefano Bucci, Eszter Simon, Tamás Váradi

In this paper, we describe the effort of adding Hungarian text mining tools to EMM for news gathering; document categorisation; named entity recognition and classification for persons, organisations and locations; name lemmatisation; quotation recognition; and cross-lingual linking of related news clusters.

Information Retrieval named-entity-recognition +2

A survey of methods to ease the development of highly multilingual text mining applications

no code implementations 13 Jan 2014 Ralf Steinberger

While Information Extraction and other text mining software can, in principle, be developed for many languages, most text analysis tools have only been applied to small sets of languages because the development effort per language is large.

ONTS: "Optima" News Translation System

no code implementations EACL 2012 Marco Turchi, Martin Atkinson, Alastair Wilcox, Brett Crawley, Stefano Bucci, Ralf Steinberger, Erik van der Goot

We propose a real-time machine translation system that allows users to select a news category and to translate the related live news articles from Arabic, Czech, Danish, Farsi, French, German, Italian, Polish, Portuguese, Spanish and Turkish into English.

Machine Translation Translation

Sentiment Analysis in the News

no code implementations 24 Sep 2013 Alexandra Balahur, Ralf Steinberger, Mijail Kabadjov, Vanni Zavarella, Erik van der Goot, Matina Halkia, Bruno Pouliquen, Jenya Belyaeva

We identified three subtasks that need to be addressed: definition of the target; separation of the good and bad news content from the good and bad sentiment expressed on the target; and analysis of clearly marked opinion that is expressed explicitly, not needing interpretation or the use of world knowledge.

Opinion Mining Sentiment Analysis +1

Acronym recognition and processing in 22 languages

no code implementations RANLP 2013 Maud Ehrmann, Leonida della Rocca, Ralf Steinberger, Hristo Tanev

We are presenting work on recognising acronyms of the form Long-Form (Short-Form) such as "International Monetary Fund (IMF)" in millions of news articles in twenty-two languages, as part of our more general effort to recognise entities and their variants in news text and to use them for the automatic analysis of the news, including the linking of related news across languages.

JRC-Names: A freely available, highly multilingual named entity resource

no code implementations 24 Sep 2013 Ralf Steinberger, Bruno Pouliquen, Mijail Kabadjov, Erik van der Goot

This paper describes a new, freely available, highly multilingual named entity resource for person and organisation names that has been compiled over seven years of large-scale multilingual news analysis combined with Wikipedia mining, resulting in 205,000 person and organisation names plus about the same number of spelling variants written in over 20 different scripts and in many more languages.

Machine Translation Morphological Inflection +4

An introduction to the Europe Media Monitor family of applications

no code implementations 20 Sep 2013 Ralf Steinberger, Bruno Pouliquen, Erik van der Goot

In the European Union with its 23 official languages, it is particularly important to cover media reports in many languages in order to capture the complementary news content published in the different countries.

JRC EuroVoc Indexer JEX - A freely available multi-label categorisation tool

no code implementations LREC 2012 Ralf Steinberger, Mohamed Ebrahim, Marco Turchi

EuroVoc (2012) is a highly multilingual thesaurus consisting of over 6,700 hierarchically organised subject domains used by European Institutions and many authorities in Member States of the European Union (EU) for the classification and retrieval of official documents.

Classification Clustering +4

DGT-TM: A freely Available Translation Memory in 22 Languages

no code implementations LREC 2012 Ralf Steinberger, Andreas Eisele, Szymon Klocek, Spyridon Pilos, Patrick Schlüter

The European Commission's (EC) Directorate General for Translation, together with the EC's Joint Research Centre, is making available a large translation memory (TM; i.e. sentences and their professionally produced translations) covering twenty-two official European Union (EU) languages and their 231 language pairs.

Clustering General Classification +5
