LREC 2016

An Open Corpus for Named Entity Recognition in Historic Newspapers

LREC 2016 EuropeanaNewspapers/ner-corpora

The availability of openly available textual datasets ("corpora") with highly accurate manual annotations ("gold standard") of named entities (e.g. persons, locations, organizations, etc.)

NAMED ENTITY RECOGNITION

MultiVec: a Multilingual and Multilevel Representation Learning Toolkit for NLP

LREC 2016 eske/multivec

We present MultiVec, a new toolkit for computing continuous representations for text at different granularity levels (word-level or sequences of words).

DOCUMENT CLASSIFICATION REPRESENTATION LEARNING SENTIMENT ANALYSIS

JATE 2.0: Java Automatic Term Extraction with Apache Solr

LREC 2016 ziqizhang/jate

Automatic Term Extraction (ATE) or Recognition (ATR) is a fundamental processing step preceding many complex knowledge engineering tasks.

A Multilingual, Multi-style and Multi-granularity Dataset for Cross-language Textual Similarity Detection

LREC 2016 FerreroJeremy/Cross-Language-Dataset

In this paper we describe our effort to create a dataset for the evaluation of cross-language textual similarity detection.

C4Corpus: Multilingual Web-size Corpus with Free License

LREC 2016 dkpro/dkpro-c4corpus

Large Web corpora containing full documents with permissive licenses are crucial for many NLP tasks.

A Semantically Compositional Annotation Scheme for Time Normalization

LREC 2016 bethard/timenorm

We present a new annotation scheme for normalizing time expressions, such as "three days ago", to computer-readable forms, such as 2016-03-07.

SEMANTIC COMPOSITION

Cohere: A Toolkit for Local Coherence

LREC 2016 karins/CoherenceFramework

We describe COHERE, our coherence toolkit which incorporates various complementary models for capturing and measuring different aspects of text coherence.

Collecting Resources in Sub-Saharan African Languages for Automatic Speech Recognition: a Case Study of Wolof

LREC 2016 besacier/ALFFA_PUBLIC

This article presents the data collected and ASR systems developed for 4 Sub-Saharan African languages (Swahili, Hausa, Amharic and Wolof).

SPEECH RECOGNITION

Lexical Coverage Evaluation of Large-scale Multilingual Semantic Lexicons for Twelve Languages

LREC 2016 UCREL/Multilingual-USAS

Lexical coverage is an important factor concerning the quality of the lexicons and the performance of the corpus annotation tools, and in this experiment we focus on evaluating the lexical coverage achieved by the multilingual lexicons and semantic annotation tools based on them.

FLAT: Constructing a CLARIN Compatible Home for Language Resources

LREC 2016 TheLanguageArchive/FLAT

This paper describes these and some additional ones posed by the authors' home institutions.