1 code implementation • 1 Feb 2021 • Lijun Lyu, Maria Koutraki, Martin Krickl, Besnik Fetahu
Optical character recognition (OCR) is crucial for deeper access to historical collections.
Optical Character Recognition (OCR)
1 code implementation • 28 Feb 2019 • Miriam Redi, Besnik Fetahu, Jonathan Morgan, Dario Taraborelli
In this paper, we aim to provide an empirical characterization of the reasons why and how Wikipedia cites external sources to comply with its own verifiability guidelines.
no code implementations • 20 Apr 2018 • Besnik Fetahu
Even in cases where citations are provided, there are no explicit indicators for the span of a citation for a given piece of text.
no code implementations • EMNLP 2017 • Besnik Fetahu, Katja Markert, Avishek Anand
For a Wikipedia article, determining the citation span of a citation, i.e., what content is covered by a citation, is important as it helps decide for which content citations are still missing.
no code implementations • 30 Mar 2017 • Besnik Fetahu, Katja Markert, Wolfgang Nejdl, Avishek Anand
An important editing policy in Wikipedia is to provide citations for added statements in Wikipedia pages, where statements can be arbitrary pieces of text, ranging from a sentence to a paragraph.
no code implementations • 30 Mar 2017 • Besnik Fetahu, Katja Markert, Avishek Anand
We propose a two-stage supervised approach for suggesting news articles to entity pages for a given state of Wikipedia.
no code implementations • 14 Nov 2018 • Christoph Hube, Besnik Fetahu
Biased language is introduced through the presence of inflammatory words or phrases, or statements that may be incorrect or one-sided, thus violating such consensus.
no code implementations • 3 Feb 2020 • Vasileios Iosifidis, Besnik Fetahu, Eirini Ntoutsi
In the post-processing step, we tackle the problem of class overlapping by shifting the decision boundary in the direction of fairness.
no code implementations • ACL 2022 • Nachshon Cohen, Amit Portnoy, Besnik Fetahu, Amir Ingber
BERT-based ranking models have achieved superior performance on various information retrieval tasks.
no code implementations • SemEval (NAACL) 2022 • Shervin Malmasi, Anjie Fang, Besnik Fetahu, Sudipta Kar, Oleg Rokhlenko
Divided into 13 tracks, the task focused on methods to identify complex named entities (like names of movies, products and groups) in 11 languages in both monolingual and multilingual scenarios.
no code implementations • NAACL 2022 • Besnik Fetahu, Anjie Fang, Oleg Rokhlenko, Shervin Malmasi
Named entity recognition (NER) in a real-world setting remains challenging and is impacted by factors like text genre, corpus quality, and data availability.
Cross-Domain Named Entity Recognition, Cross-Lingual Transfer
no code implementations • COLING 2022 • Shervin Malmasi, Anjie Fang, Besnik Fetahu, Sudipta Kar, Oleg Rokhlenko
We present MultiCoNER, a large multilingual dataset for Named Entity Recognition that covers 3 domains (Wiki sentences, questions, and search queries) across 11 languages, as well as multilingual and code-mixing subsets.
no code implementations • 27 Oct 2022 • Zhiyu Chen, Jie Zhao, Anjie Fang, Besnik Fetahu, Oleg Rokhlenko, Shervin Malmasi
Furthermore, human evaluation shows that our method can generate more accurate and detailed rewrites when compared to human annotations.
no code implementations • 11 May 2023 • Besnik Fetahu, Sudipta Kar, Zhiyu Chen, Oleg Rokhlenko, Shervin Malmasi
The task highlights the need for future research on improving NER robustness on noisy data containing complex entities.
Multilingual Named Entity Recognition, named-entity-recognition
no code implementations • 27 May 2023 • Pedro Faustini, Zhiyu Chen, Besnik Fetahu, Oleg Rokhlenko, Shervin Malmasi
Spoken Question Answering (QA) is a key feature of voice assistants, usually backed by multiple QA systems.
no code implementations • 6 Jun 2023 • Zhiyu Chen, Jason Choi, Besnik Fetahu, Oleg Rokhlenko, Shervin Malmasi
We propose an intent-aware FAQ retrieval system consisting of (1) an intent classifier that predicts when a user's information need can be answered by an FAQ; and (2) a reformulation model that rewrites a query into a natural question.
no code implementations • 20 Oct 2023 • Besnik Fetahu, Zhiyu Chen, Sudipta Kar, Oleg Rokhlenko, Shervin Malmasi
We present MULTICONER V2, a dataset for fine-grained Named Entity Recognition covering 33 entity classes across 12 languages, in both monolingual and multilingual settings.
no code implementations • 25 Oct 2023 • Besnik Fetahu, Zhiyu Chen, Oleg Rokhlenko, Shervin Malmasi
E-commerce product catalogs contain billions of items.
no code implementations • 25 Oct 2023 • Besnik Fetahu, Pedro Faustini, Giuseppe Castellucci, Anjie Fang, Oleg Rokhlenko, Shervin Malmasi
Using a new dataset of 6681 input questions and human-written hints, we evaluated the models with automatic metrics and human evaluation.
no code implementations • 18 Jan 2024 • Besnik Fetahu, Tejas Mehta, Qun Song, Nikhita Vedula, Oleg Rokhlenko, Shervin Malmasi
E-commerce customers frequently seek detailed product information for purchase decisions, commonly contacting sellers directly with extended queries.
no code implementations • 18 Jan 2024 • Lingbo Mo, Besnik Fetahu, Oleg Rokhlenko, Shervin Malmasi
Yes/No or polar questions represent one of the main linguistic question categories.
no code implementations • 9 Apr 2024 • Besnik Fetahu, Nachshon Cohen, Elad Haramaty, Liane Lewin-Eytan, Oleg Rokhlenko, Shervin Malmasi
We focus on the domain of e-commerce, specifically on identifying Shopping Product Questions (SPQs), where the user asking a product-related question may have an underlying shopping need.