Text Segmentation
26 papers with code • 2 benchmarks • 4 datasets
Text segmentation is the task of dividing a document into semantically coherent blocks.
Most implemented papers
CoType: Joint Extraction of Typed Entities and Relations with Knowledge Bases
We propose a novel domain-independent framework, called CoType, that runs a data-driven text segmentation algorithm to extract entity mentions, and jointly embeds entity mentions, relation mentions, text features and type labels into two low-dimensional spaces (for entity and relation mentions respectively), where, in each space, objects whose types are close will also have similar representations.
Sequence Modeling via Segmentations
The probability of a segmented sequence is calculated as the product of the probabilities of all its segments, where each segment is modeled using existing tools such as recurrent neural networks.
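The product rule described above can be sketched in a few lines. This is only an illustration: the lookup table `TOY_SEGMENT_PROBS` and the helper `toy_segment_prob` are hypothetical stand-ins for a learned segment model such as a recurrent neural network, and the numbers are made up.

```python
import math
from typing import Callable, Sequence

def segmentation_prob(segments: Sequence[str],
                      segment_prob: Callable[[str], float]) -> float:
    """Probability of one segmentation: the product of its segment probabilities."""
    # Sum log-probabilities instead of multiplying directly,
    # which avoids numerical underflow on long sequences.
    log_p = sum(math.log(segment_prob(s)) for s in segments)
    return math.exp(log_p)

# Hypothetical toy segment model: a fixed table standing in for an RNN.
TOY_SEGMENT_PROBS = {"ab": 0.5, "c": 0.25, "a": 0.5, "bc": 0.1}

def toy_segment_prob(segment: str) -> float:
    return TOY_SEGMENT_PROBS[segment]

# Two different segmentations of the same sequence "abc".
p_ab_c = segmentation_prob(["ab", "c"], toy_segment_prob)
p_a_bc = segmentation_prob(["a", "bc"], toy_segment_prob)
```

Under this toy model the segmentation `ab | c` scores higher than `a | bc`, which is the kind of comparison a segmentation-based sequence model relies on.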
Text Segmentation as a Supervised Learning Task
Text segmentation, the task of dividing a document into contiguous segments based on its semantic structure, is a longstanding challenge in language understanding.
Text Segmentation based on Semantic Word Embeddings
We explore the use of semantic word embeddings in text segmentation algorithms, including the C99 segmentation algorithm and new algorithms inspired by the distributed word vector representation.
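As a rough illustration of how word embeddings can drive segmentation, the sketch below places a boundary wherever the cosine similarity between adjacent sentence vectors drops below a threshold. This is a deliberately simplified heuristic, not the C99 algorithm and not the paper's method; the embeddings, the function names, and the threshold value are all assumptions chosen for the example.

```python
import math

def mean_vector(words, embeddings):
    """Average the embeddings of the known words in a sentence."""
    dim = len(next(iter(embeddings.values())))
    vecs = [embeddings[w] for w in words if w in embeddings]
    if not vecs:
        return [0.0] * dim  # no known words: fall back to the zero vector
    return [sum(v[k] for v in vecs) / len(vecs) for k in range(dim)]

def cosine(u, v):
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def find_boundaries(sentences, embeddings, threshold=0.5):
    """Return indices i such that a new segment starts at sentence i."""
    vecs = [mean_vector(s, embeddings) for s in sentences]
    return [i + 1 for i in range(len(vecs) - 1)
            if cosine(vecs[i], vecs[i + 1]) < threshold]

# Hypothetical 2-dimensional embeddings: axis 0 ~ animals, axis 1 ~ finance.
toy_embeddings = {
    "cat": [1.0, 0.0], "dog": [0.9, 0.1],
    "stock": [0.0, 1.0], "bond": [0.1, 0.9],
}
toy_sentences = [["cat"], ["dog"], ["stock"], ["bond"]]
toy_boundaries = find_boundaries(toy_sentences, toy_embeddings)
```

With these toy vectors the only low-similarity transition is between the second and third sentences, so a single boundary is placed there.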
Khmer Word Segmentation Using Conditional Random Fields
The trained CRF segmenter was compared empirically to a baseline approach based on maximum matching that used a dictionary extracted from the manually segmented corpus.
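For readers unfamiliar with dictionary-based baselines, here is a minimal sketch of forward maximum matching, one common variant of the technique. The snippet above does not say which variant the paper used, so the function name, the single-character fallback for unknown text, and the English toy dictionary are assumptions made for illustration.

```python
def maximum_matching(text, dictionary, max_len=None):
    """Greedy forward maximum matching: always take the longest dictionary word."""
    if max_len is None:
        max_len = max((len(w) for w in dictionary), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until a word matches.
        for j in range(min(len(text), i + max_len), i, -1):
            if text[i:j] in dictionary:
                break
        else:
            j = i + 1  # nothing matched: emit a single character
        tokens.append(text[i:j])
        i = j
    return tokens

# Hypothetical toy dictionary; a real one would come from a segmented corpus.
toy_dictionary = {"the", "cat", "cats", "sat", "at"}
toy_tokens = maximum_matching("thecatsat", toy_dictionary)
```

On this input the greedy choice of `cats` forces the remainder to be read as `at`, giving `the | cats | at` rather than `the | cat | sat`. Errors of this kind are what a trained sequence model such as a CRF can avoid by using context.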
An efficient way for segmentation of Bangla characters in printed document using curved scanning
The main cause of poor Optical Character Recognition (OCR) output for Bangla text is segmentation-related error.
A Characterwise Windowed Approach to Hebrew Morphological Segmentation
This paper presents a novel approach to the segmentation of orthographic word forms in contemporary Hebrew, focusing purely on splitting without carrying out morphological analysis or disambiguation.
Attention-based Neural Text Segmentation
Text segmentation plays an important role in various Natural Language Processing (NLP) tasks like summarization, context understanding, document indexing and document noise removal.