We present a semi-supervised learning framework based on graph embeddings. Given a graph between instances, we train an embedding for each instance to jointly predict the class label and the neighborhood context in the graph.
#2 best model for Node Classification on NELL
Timely analysis of cyber-security information necessitates automated information extraction from unstructured text. While state-of-the-art extraction methods produce extremely accurate results, they require ample training data, which is generally unavailable for specialized applications, such as detecting security related entities; moreover, manual annotation of corpora is very costly and often not a viable solution.
This work addresses challenges arising from extracting entities from textual data, including the high cost of data annotation, model accuracy, selecting appropriate evaluation criteria, and the overall quality of annotation. We present a framework that integrates Entity Set Expansion (ESE) and Active Learning (AL) to reduce the annotation cost of sparse data and provide an online evaluation method as feedback.
Output-agreement mechanisms such as ESP Game have been widely used in human computation to obtain reliable human-generated labels. In this paper, we argue that a "time-limited" output-agreement mechanism can be used to create a fast and robust crowd-powered component in interactive systems, particularly dialogue systems, to extract key information from user utterances on the fly.
This paper deals with the entity extraction task (named entity recognition) of a text mining process that aims at unveiling non-trivial semantic structures, such as relationships and interaction between entities or communities. Evaluation of the efficacy of the entity extraction method on several texts written in Portuguese indicates a considerable improvement on $recall$ and $F_1$ measures.