A Characterwise Windowed Approach to Hebrew Morphological Segmentation

WS 2018 · Amir Zeldes

This paper presents a novel approach to the segmentation of orthographic word forms in contemporary Hebrew, focusing purely on splitting without carrying out morphological analysis or disambiguation. Casting the analysis task as character-wise binary classification and using adjacent character and word-based lexicon-lookup features, this approach achieves over 98% accuracy on the benchmark SPMRL shared task data for Hebrew, and 97% accuracy on a new out-of-domain Wikipedia dataset, an improvement of ~4% and 5% over previous state-of-the-art performance.
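
The character-wise framing in the abstract can be illustrated with a minimal sketch. This is not the RFTokenizer implementation: the function names, the `|` boundary marker, the `_` padding symbol, the window width, and the two lexicon features are all assumptions made for illustration, and the example word uses a Latin transliteration (`w|h|byt`, roughly "and the house") for readability. Each character receives a binary label (1 if a segment boundary follows it) and a feature dictionary built from its neighbouring characters plus lexicon lookups on the substrings to either side of the candidate split.

```python
from typing import Dict, List, Set, Tuple

PAD = "_"  # hypothetical padding symbol for positions outside the word


def boundary_labels(segmented: str, sep: str = "|") -> Tuple[str, List[int]]:
    """Return the unsegmented word and one 0/1 label per character.

    Label 1 means "a segment boundary follows this character".
    """
    chars: List[str] = []
    labels: List[int] = []
    for ch in segmented:
        if ch == sep:
            if labels:
                labels[-1] = 1  # mark the character just before the separator
        else:
            chars.append(ch)
            labels.append(0)
    return "".join(chars), labels


def window_features(
    word: str, i: int, lexicon: Set[str], width: int = 2
) -> Dict[str, object]:
    """Build features for the candidate split after word[i]."""
    feats: Dict[str, object] = {}
    # Adjacent-character window, padded at the word edges.
    for off in range(-width, width + 1):
        j = i + off
        feats[f"char[{off:+d}]"] = word[j] if 0 <= j < len(word) else PAD
    # Lexicon lookups on the material left and right of the candidate split.
    feats["prefix_in_lex"] = word[: i + 1] in lexicon
    feats["suffix_in_lex"] = word[i + 1 :] in lexicon
    return feats


if __name__ == "__main__":
    toy_lexicon = {"byt"}  # toy lexicon, assumed for the example
    word, labels = boundary_labels("w|h|byt")
    for i, label in enumerate(labels):
        print(word[i], label, window_features(word, i, toy_lexicon, width=1))
```

The resulting feature dictionaries and labels could then be passed to any binary classifier, which predicts for each character whether a split follows it.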

Code

amir-zeldes/RFTokenizer official

Tasks

Binary Classification

General Classification

Morphological Analysis

Text Segmentation

Datasets

Introduced in the Paper:

Wiki5K Hebrew segmentation

Used in the Paper:

SPMRL Hebrew segmentation data

Results from the Paper

Ranked #1 on Text Segmentation on Wiki5K Hebrew segmentation

Task	Dataset	Model	Metric Name	Metric Value	Global Rank
Text Segmentation	SPMRL Hebrew segmentation data	RFTokenizer	F-Score	97.08	# 1
Text Segmentation	Wiki5K Hebrew segmentation	RFTokenizer	F-Score	96.35	# 1
