Entity resolution is the task of finding records in a data set that refer to the same entity across different data sources (e.g., data files, books, websites, and databases). (Source: Wikipedia)
Named entity recognition (NER) is a widely applicable natural language processing task and building block of question answering, topic modeling, information retrieval, etc.
We report its performance on candidate selection in the context of the downstream task of toponym resolution, both on existing datasets and on a new manually-annotated resource of nineteenth-century English OCR'd text.
Entity resolution (ER; also known as record linkage or de-duplication) is the process of merging noisy databases, often in the absence of unique identifiers.
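As a rough illustration of de-duplication without unique identifiers, the following is a minimal sketch, not the method of any paper listed here. It assumes records are plain strings, uses the standard-library `difflib.SequenceMatcher` as a stand-in similarity measure, and merges matching pairs with union-find. The helper names (`normalize`, `similarity`, `resolve`) and the 0.85 threshold are illustrative assumptions.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(record):
    """Lowercase, drop periods and commas, and collapse whitespace."""
    cleaned = record.lower().replace(".", "").replace(",", "")
    return " ".join(cleaned.split())


def similarity(a, b):
    """Similarity in [0, 1] between two normalized record strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def resolve(records, threshold=0.85):
    """Group records whose pairwise similarity meets the threshold.

    Compares all pairs (quadratic cost), so it only suits small inputs;
    the threshold is an assumed value, not a recommended setting.
    """
    parent = list(range(len(records)))

    def find(i):
        # Follow parent links to the cluster root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(records)), 2):
        if similarity(records[i], records[j]) >= threshold:
            parent[find(j)] = find(i)

    clusters = {}
    for i, record in enumerate(records):
        clusters.setdefault(find(i), []).append(record)
    return list(clusters.values())


records = ["John A. Smith", "john a smith", "Maria Garcia"]
clusters = resolve(records)
```

Here the first two records normalize to the same string and end up in one cluster, while the third stays on its own. Practical systems usually add a blocking step so that not every pair has to be compared.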
We evaluate STANCE's ability to detect whether two strings can refer to the same entity, a task we term alias detection.
A limitation of such benchmarks is that they typically come with their own task definition and it can be difficult to leverage them for complex integration pipelines.
Named Entity Disambiguation algorithms typically learn a single model for all target entities.
Knowledge bases (KBs) store rich yet heterogeneous entities and facts.
We tackle optimization of weighted set packing by relaxing integrality in our ILP formulation.
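For reference, the textbook form of weighted set packing and its linear relaxation can be written as follows. This is a sketch under assumed notation, not necessarily the formulation used in the paper quoted above: a universe $U$, candidate sets $S_1, \dots, S_m \subseteq U$ with weights $w_j$, and indicator variables $x_j$ are all symbols introduced here for illustration.

```latex
\begin{align*}
\text{(ILP)}\quad \max_{x}\ & \sum_{j=1}^{m} w_j x_j \\
\text{s.t.}\ & \sum_{j \,:\, i \in S_j} x_j \le 1 \quad \forall i \in U, \\
& x_j \in \{0, 1\} \quad \forall j \in \{1, \dots, m\}.
\end{align*}

% Relaxing integrality replaces the binary constraint with a box constraint:
\begin{align*}
\text{(LP)}\quad & 0 \le x_j \le 1 \quad \forall j \in \{1, \dots, m\}.
\end{align*}
```

The optimal value of the relaxed program is an upper bound on the ILP optimum, and any optimal solution of the relaxation that happens to be integral is also optimal for the ILP.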
One of the key steps in language resource creation is the identification of the text segments to be annotated, or markables, which, depending on the task, may vary from nominal chunks for named entity resolution, to (potentially nested) noun phrases (or mentions) in coreference resolution, to larger text segments in text segmentation.