13 dataset results for segmentation AND French

Multi-Spectral Leaf Segmentation (Multi-Spectral Leaf Segmentation For Crop/Weed Identification)

This dataset was acquired with the Airphen (Hyphen, Avignon, France) six-band multi-spectral camera configured using the 450/570/675/710/730/850 nm bands with a 10 nm FWHM. It was acquired at the INRAe site in Montoldre (Allier, France, at 46°20'30.3"N 3°26'03.6"E) within the framework of the “RoSE challenge” funded by the French National Research Agency (ANR). The images contain bean with various natural weeds (yarrow, amaranth, geranium, plantago, etc.) and sown ones (mustard, goosefoot, mayweed and ryegrass), under very distinct illumination conditions (shadow, morning, evening, full sun, cloudy, rain, ...). The ground truth is defined for each image with polygons around leaf boundaries. In addition, each polygon is labeled as crop or weed. (2020-06-11)

0 PAPER • NO BENCHMARKS YET

Deep Indices

Deep Indices (multi-spectral leaf/vegetation segmentation)

This dataset includes multi-spectral acquisitions of vegetation for the design of new DeepIndices. The images were acquired with the Airphen (Hyphen, Avignon, France) six-band multi-spectral camera configured using the 450/570/675/710/730/850 nm bands with a 10 nm FWHM. The dataset was acquired at the INRAe site in Montoldre (Allier, France, at 46°20'30.3"N 3°26'03.6"E) within the framework of the “RoSE challenge” funded by the French National Research Agency (ANR), and in Dijon (Burgundy, France, at 47°18'32.5"N 5°04'01.8"E) at the AgroSup Dijon site. Images of bean and corn, containing various natural weeds (yarrow, amaranth, geranium, plantago, etc.) and sown ones (mustard, goosefoot, mayweed and ryegrass), under very distinct illumination conditions (shadow, morning, evening, full sun, cloudy, rain, ...), were acquired in top-down view at 1.8 meters from the ground. (2020-05-01)

1 PAPER • 1 BENCHMARK

DISRPT2019

DISRPT2019 (DISRPT2019 shared task on Discourse Unit Segmentation and Connective Detection)

The DISRPT 2019 workshop introduces the first iteration of a cross-formalism shared task on discourse unit segmentation. Since all major discourse parsing frameworks imply a segmentation of texts into segments, learning segmentations for and from diverse resources is a promising area for converging methods and insights. Because different corpora, languages and frameworks use different guidelines for segmentation, the shared task is meant to promote design of flexible methods for dealing with various guidelines, and help

4 PAPERS • NO BENCHMARKS YET

DISRPT2021

DISRPT2021 (DISRPT2021 shared task on Discourse Unit Segmentation, Connective Detection and Discourse Relation Classification)

The DISRPT 2021 shared task, co-located with CODI 2021 at EMNLP, introduces the second iteration of a cross-formalism shared task on discourse unit segmentation and connective detection, as well as the

3 PAPERS • NO BENCHMARKS YET

AVSpeech

…The segments are of varying length, between 3 and 10 seconds long, and in each clip the only visible face in the video and audible sound in the soundtrack belong to a single speaking person. In total, the dataset contains roughly 4700 hours of video segments with approximately 150,000 distinct speakers, spanning a wide variety of people, languages and face poses.

35 PAPERS • NO BENCHMARKS YET

WASABI

…lyrics encode an important part of the semantics of a song, the authors focus on the description of the methods they proposed to extract relevant information from the lyrics, such as their structure segmentation can be exploited by music search engines and music professionals (e.g. journalists, radio presenters) to better handle large collections of lyrics, allowing an intelligent browsing, categorization and segmentation

0 PAPER • NO BENCHMARKS YET

Tilde MODEL Corpus

Tilde MODEL Corpus (Tilde Multilingual Open Data for European Languages)

…It contains over 10M segments of multilingual open data. The data has been collected from sites allowing free use and reuse of its content, as well as from Public Sector web sites.

2 PAPERS • NO BENCHMARKS YET

Multilingual Dataset for Training and Evaluating Diacritics Restoration Systems

…Data are segmented into sentences which are further word tokenized.

2 PAPERS • 12 BENCHMARKS

GATITOS

GATITOS (Google's Additional Translations Into Tail-languages: Often Short)

…This dataset consists of 4,000 English segments (4,500 tokens) that have been translated into each of 26 low-resource languages, as well as three higher-resource pivot languages (es, fr, hi).

1 PAPER • NO BENCHMARKS YET

ICDAR 2021

…Géraud Official competition website: https://icdar21-mapseg.github.io/ This is the dataset of the ICDAR 2021 Competition on Historical Map Segmentation (“MapSeg”). The general pipeline involves multiple stages; we list some essential ones here: segment map content: locate the area of the image which contains map content; extract map objects from different layers Task 2: “Segment Map Area” This task is the equivalent of text area detection for OCR: given the image of a complete map sheet, you need to segment the area which contains map content. We decided to segment each of those regions as closely as possible. The expected output for this task is a binary mask indicating, for each pixel, whether it belongs to the map area or not.

1 PAPER • NO BENCHMARKS YET

Jamendo Corpus

…Segments of each song are annotated as “voice” (sung or spoken) or “no-voice”. The songs constitute a total of about 6 hours of music.

3 PAPERS • NO BENCHMARKS YET

BIMCV COVID-19

…In addition, 23 images were annotated by a team of expert radiologists to include semantic segmentation of radiographic findings.

8 PAPERS • NO BENCHMARKS YET

SIMARA (SIMARA: a database for key-value information extraction from full-page handwritten documents)

…The localization of each field is not available in such a way that this dataset encourages research on segmentation-free systems for information extraction.

1 PAPER • 2 BENCHMARKS
