no code implementations • EACL (Hackashop) 2021 • Senja Pollak, Marko Robnik-Šikonja, Matthew Purver, Michele Boggia, Ravi Shekhar, Marko Pranjić, Salla Salmela, Ivar Krustok, Tarmo Paju, Carl-Gustav Linden, Leo Leppänen, Elaine Zosa, Matej Ulčar, Linda Freienthal, Silver Traat, Luis Adrián Cabrera-Diego, Matej Martinc, Nada Lavrač, Blaž Škrlj, Martin Žnidaršič, Andraž Pelicon, Boshko Koloski, Vid Podpečan, Janez Kranjc, Shane Sheehan, Emanuela Boros, Jose G. Moreno, Antoine Doucet, Hannu Toivonen
This paper presents tools and data sources collected and released by the EMBEDDIA project, supported by the European Union’s Horizon 2020 research and innovation program.
no code implementations • JEP/TALN/RECITAL 2021 • Emanuela Boros, Romaric Besançon, Olivier Ferret, Brigitte Grau
This article addresses the task of event detection, which aims to identify and categorize event mentions in text.
1 code implementation • SemEval (NAACL) 2022 • Elaine Zosa, Emanuela Boros, Boshko Koloski, Lidia Pivovarova
In this paper, we present the participation of the EMBEDDIA team in the SemEval-2022 Task 8 (Multilingual News Article Similarity).
no code implementations • SemEval (NAACL) 2022 • Emanuela Boros, Carlos-Emiliano González-Gallardo, Jose Moreno, Antoine Doucet
Also, we consider that using additional contexts from the training set could improve the performance of a NER system on short texts.
no code implementations • JEP/TALN/RECITAL 2022 • Stephen Mutuvi, Emanuela Boros, Antoine Doucet, Adam Jatowt, Gaël Lejeune, Moses Odeo
In this article, we explore several hypotheses about the factors that could influence the performance of an epidemiological event extraction system in a low-resource multilingual scenario: the type of pre-trained model, the quality of the tokenizer, and the characteristics of the entities to be extracted.
no code implementations • JEP/TALN/RECITAL 2022 • Emanuela Boros, Jose Moreno, Antoine Doucet
In this article, we address a recent and under-studied paradigm for the task of event detection by framing it as a question-answering problem with the possibility of multiple answers and the support of entities.
1 code implementation • 25 Sep 2024 • Emanuela Boros, Maud Ehrmann
This paper investigates the presence of OCR-sensitive neurons within the Transformer architecture and their influence on named entity recognition (NER) performance on historical documents.
1 code implementation • 30 Mar 2023 • Carlos-Emiliano González-Gallardo, Emanuela Boros, Nancy Girdhar, Ahmed Hamdi, Jose G. Moreno, Antoine Doucet
Large language models (LLMs) have been leveraged for several years now, obtaining state-of-the-art performance in recognizing entities from modern documents.
1 code implementation • 20 Jan 2023 • Nhu Khoa Nguyen, Thierry Delahaut, Emanuela Boros, Antoine Doucet, Gaël Lejeune
Identifying and exploring emerging trends in the news is becoming more essential than ever with many changes occurring worldwide due to the global health crises.
1 code implementation • 14 Apr 2021 • Emanuela Boros, Jose G. Moreno, Antoine Doucet
In this paper, we propose a recent and under-researched paradigm for the task of event detection (ED) by casting it as a question-answering (QA) problem with the possibility of multiple answers and the support of entities.
no code implementations • 13 Apr 2021 • Emanuela Boros, Antoine Doucet
This paper summarizes the participation of the Laboratoire Informatique, Image et Interaction (L3i laboratory) of the University of La Rochelle in the Recognizing Ultra Fine-grained Entities (RUFES) track within the Text Analysis Conference (TAC) series of evaluation workshops.
no code implementations • COLING 2020 • Stephen Mutuvi, Emanuela Boros, Antoine Doucet, Adam Jatowt, Gaël Lejeune, Moses Odeo
We conduct a comparative study of different machine and deep learning text classification models using a dataset comprising news articles related to epidemic outbreaks from six languages, four low-resourced and two high-resourced, in order to analyze the influence of the nature of the language, the structure of the document, and the size of the data.
1 code implementation • CONLL 2020 • Emanuela Boros, Ahmed Hamdi, Elvys Linhares Pontes, Luis Adrián Cabrera-Diego, Jose G. Moreno, Nicolas Sidere, Antoine Doucet
This paper tackles the task of named entity recognition (NER) applied to digitized historical texts obtained from processing digital images of newspapers using optical character recognition (OCR) techniques.