no code implementations • ACL 2022 • Amit Seker, Elron Bandel, Dan Bareket, Idan Brusilovsky, Refael Greenfeld, Reut Tsarfaty
First, Hebrew resources for training large language models are so far not of the same magnitude as their English counterparts.
no code implementations • 28 Nov 2022 • Eylon Gueta, Avi Shmidman, Shaltiel Shmidman, Cheyn Shmuel Shmidman, Joshua Guedalia, Moshe Koppel, Dan Bareket, Amit Seker, Reut Tsarfaty
We perform a contrastive analysis of this model against all previous Hebrew PLMs (mBERT, heBERT, AlephBERT) and assess the effects of larger vocabularies on task performance.
Ranked #1 on Named Entity Recognition (NER) on NEMO-Corpus
2 code implementations • 8 Apr 2021 • Amit Seker, Elron Bandel, Dan Bareket, Idan Brusilovsky, Refael Shaked Greenfeld, Reut Tsarfaty
Second, there are no accepted tasks and benchmarks on which to evaluate the progress of Hebrew PLMs.
no code implementations • Findings of the Association for Computational Linguistics 2020 • Amit Seker, Reut Tsarfaty
Neural MD may be addressed as a simple pipeline, where segmentation is followed by sequence tagging, or as an end-to-end model, predicting morphemes from raw tokens.
no code implementations • ACL 2020 • Reut Tsarfaty, Dan Bareket, Stav Klein, Amit Seker
It has been exactly a decade since the first establishment of SPMRL, a research initiative unifying multiple research efforts to address the peculiar challenges of Statistical Parsing for Morphologically-Rich Languages (MRLs). Here we reflect on parsing MRLs in that decade, highlight the solutions and lessons learned for the architectural, modeling and lexical challenges in the pre-neural era, and argue that similar challenges re-emerge in neural architectures for MRLs.
no code implementations • IJCNLP 2019 • Reut Tsarfaty, Amit Seker, Shoval Sadde, Stav Klein
For languages with simple morphology, such as English, automatic annotation pipelines such as spaCy or Stanford's CoreNLP successfully serve projects in academia and industry.
no code implementations • TACL 2019 • Amir More, Amit Seker, Victoria Basmova, Reut Tsarfaty
In standard NLP pipelines, morphological analysis and disambiguation (MA&D) precedes syntactic and semantic downstream tasks.
no code implementations • WS 2018 • Shoval Sade, Amit Seker, Reut Tsarfaty
The Hebrew treebank (HTB), consisting of 6221 morpho-syntactically annotated newspaper sentences, has been the only resource for training and validating statistical parsers and taggers for Hebrew, for almost two decades now.
no code implementations • CONLL 2018 • Amit Seker, Amir More, Reut Tsarfaty
We present the contribution of the ONLP lab at the Open University of Israel to the UD shared task on multilingual parsing from raw text to Universal Dependencies.