Sentence segmentation
19 papers with code • 1 benchmark • 3 datasets
Most implemented papers
Trankit: A Light-Weight Transformer-based Toolkit for Multilingual Natural Language Processing
Finally, we create a demo video for Trankit at: https://youtu.be/q0KGP3zGjGc.
Creating a Universal Dependencies Treebank of Spoken Frisian-Dutch Code-switched Data
This paper explores the difficulties of annotating transcribed spoken Dutch-Frisian code-switch utterances into Universal Dependencies.
A unified approach to sentence segmentation of punctuated text in many languages
The sentence is a fundamental unit of text processing.
Mukayese: Turkish NLP Strikes Back
As a solution, we present Mukayese, a set of NLP benchmarks for the Turkish language that contains several NLP tasks.
SLATE: A Sequence Labeling Approach for Task Extraction from Free-form Inked Content
We present SLATE, a sequence labeling approach for extracting tasks from free-form content such as digitally handwritten (or "inked") notes on a virtual whiteboard.
Prosodic features improve sentence segmentation and parsing
Parsing spoken dialogue presents challenges that parsing text does not, including a lack of clear sentence boundaries.
Where's the Point? Self-Supervised Multilingual Punctuation-Agnostic Sentence Segmentation
Many NLP pipelines split text into sentences as one of the crucial preprocessing steps.
KG-GPT: A General Framework for Reasoning on Knowledge Graphs Using Large Language Models
While large language models (LLMs) have made considerable advancements in understanding and generating unstructured text, their application in structured data remains underexplored.