Finally, we create a demo video for Trankit at: https://youtu.be/q0KGP3zGjGc.
This paper explores the difficulties of annotating transcribed spoken Dutch-Frisian code-switched utterances in Universal Dependencies.
The sentence is a fundamental unit of text processing.
As a solution, we present Mukayese, a set of NLP benchmarks for the Turkish language that contains several NLP tasks.
We present SLATE, a sequence labeling approach for extracting tasks from free-form content such as digitally handwritten (or "inked") notes on a virtual whiteboard.
Parsing spoken dialogue presents challenges that parsing text does not, including a lack of clear sentence boundaries.
Many NLP pipelines split text into sentences as one of the crucial preprocessing steps.
While large language models (LLMs) have made considerable advancements in understanding and generating unstructured text, their application in structured data remains underexplored.