Search Results for author: Shabnam Behzad

Found 9 papers, 6 papers with code

AMALGUM -- A Free, Balanced, Multilayer English Web Corpus

1 code implementation LREC 2020 Luke Gessler, Siyao Peng, Yang Liu, Yilun Zhu, Shabnam Behzad, Amir Zeldes

We present a freely available, genre-balanced English web corpus totaling 4M tokens and featuring a large number of high-quality automatic annotation layers, including dependency trees, non-named entity annotations, coreference resolution, and discourse trees in Rhetorical Structure Theory.

coreference-resolution

GENTLE: A Genre-Diverse Multilayer Challenge Set for English NLP and Linguistic Evaluation

1 code implementation 3 Jun 2023 Tatsuya Aoyama, Shabnam Behzad, Luke Gessler, Lauren Levine, Jessica Lin, Yang Janet Liu, Siyao Peng, Yilun Zhu, Amir Zeldes

We evaluate state-of-the-art NLP systems on GENTLE and find severe degradation for at least some genres in their performance on all tasks, which indicates GENTLE's utility as an evaluation dataset for NLP systems.

coreference-resolution Dependency Parsing

MultiMUC: Multilingual Template Filling on MUC-4

1 code implementation 29 Jan 2024 William Gantt, Shabnam Behzad, Hannah Youngeun An, Yunmo Chen, Aaron Steven White, Benjamin Van Durme, Mahsa Yarmohammadi

We introduce MultiMUC, the first multilingual parallel corpus for template filling, comprising translations of the classic MUC-4 template filling benchmark into five languages: Arabic, Chinese, Farsi, Korean, and Russian.

Machine Translation Translation

A Cross-Genre Ensemble Approach to Robust Reddit Part of Speech Tagging

1 code implementation LREC 2020 Shabnam Behzad, Amir Zeldes

However, when these models are applied to other corpora with different genres, and especially user-generated data from the Web, we see substantial drops in performance.

Part-Of-Speech Tagging
