1 code implementation • sdp (COLING) 2022 • Óscar E. Mendoza, Wojciech Kusa, Alaa El-Ebshihy, Ronin Wu, David Pride, Petr Knoth, Drahomira Herrmannova, Florina Piroi, Gabriella Pasi, Allan Hanbury
We present a new gold-standard dataset and a benchmark for the Research Theme Identification task, a sub-task of the Scholarly Knowledge Graph Generation shared task, at the 3rd Workshop on Scholarly Document Processing.
no code implementations • WNUT (ACL) 2021 • Johannes Bogensperger, Sven Schlarb, Allan Hanbury, Gábor Recski
We present DreamDrug, a crowdsourced dataset for detecting mentions of drugs in noisy user-generated item listings from darknet markets.
1 code implementation • BioNLP (ACL) 2022 • Wojciech Kusa, Georgios Peikos, Óscar Espitia, Allan Hanbury, Gabriella Pasi
We propose two answer localization approaches that use only textual information extracted from the video.
no code implementations • 22 Nov 2024 • Moritz Staudinger, Wojciech Kusa, Florina Piroi, Aldo Lipani, Allan Hanbury
This paper presents an extensive study of Boolean query generation using LLMs for systematic reviews, reproducing and extending the work of Wang et al. and Alaniz et al. Our study investigates the replicability and reliability of results achieved using ChatGPT and compares its performance with open-source alternatives like Mistral and Zephyr to provide a more comprehensive analysis of LLMs for query generation.
no code implementations • 7 Nov 2024 • Varvara Arzt, Allan Hanbury
This paper investigates the transparency in the creation of benchmarks and the use of leaderboards for measuring progress in NLP, with a focus on the relation extraction (RE) task.
no code implementations • 12 Jun 2024 • Pia Pachinger, Janis Goldzycher, Anna Maria Planitzer, Wojciech Kusa, Allan Hanbury, Julia Neidhardt
Model interpretability in toxicity detection greatly profits from token-level annotations.
1 code implementation • NeurIPS 2023 • Wojciech Kusa, Oscar E. Mendoza, Matthias Samwald, Petr Knoth, Allan Hanbury
Systematic literature reviews (SLRs) play an essential role in summarising, synthesising and validating scientific evidence.
1 code implementation • 12 Sep 2023 • Sophia Althammer, Guido Zuccon, Sebastian Hofstätter, Suzan Verberne, Allan Hanbury
We further find that gains provided by AL strategies come at the expense of more assessments (thus higher annotation costs) and AL strategies underperform random selection when comparing effectiveness given a fixed annotation cost.
1 code implementation • 4 Sep 2023 • Wojciech Kusa, Petr Knoth, Allan Hanbury
To this end, we developed CRUISE-Screening, a web-based application for conducting living literature reviews - a type of literature review that is continuously updated to reflect the latest research in a particular field.
no code implementations • 1 Jul 2023 • Wojciech Kusa, Óscar E. Mendoza, Petr Knoth, Gabriella Pasi, Allan Hanbury
Our approach involves two key components in a pipeline-based model: (i) a data enrichment technique for enhancing both queries and documents during the first retrieval stage, and (ii) a novel re-ranking schema that uses a Transformer network in a setup adapted to this task by leveraging the structure of the CT documents.
no code implementations • 30 Jun 2023 • Wojciech Kusa, Guido Zuccon, Petr Knoth, Allan Hanbury
We find that accounting for the difference in review outcomes leads to a different assessment of the quality of a system than if traditional evaluation measures were used.
no code implementations • 17 Apr 2023 • Tobias Fink, Gabor Recski, Wojciech Kusa, Allan Hanbury
We discuss our experiments for COLIEE Task 1, a court case retrieval competition using cases from the Federal Court of Canada.
1 code implementation • 14 Aug 2022 • Sophia Althammer, Sebastian Hofstätter, Suzan Verberne, Allan Hanbury
Robust test collections are crucial for Information Retrieval research.
no code implementations • 26 Jun 2022 • Sebastian Hofstätter, Nick Craswell, Bhaskar Mitra, Hamed Zamani, Allan Hanbury
Recently, several dense retrieval (DR) models have demonstrated performance competitive with the term-based retrieval methods that are ubiquitous in search systems.
no code implementations • 24 Mar 2022 • Sebastian Hofstätter, Omar Khattab, Sophia Althammer, Mete Sertkan, Allan Hanbury
Recent progress in neural information retrieval has demonstrated large gains in effectiveness, while often sacrificing the efficiency and interpretability of the neural model compared to classical approaches.
no code implementations • 9 Mar 2022 • Georg Heiler, Thassilo Gadermaier, Thomas Haider, Allan Hanbury, Peter Filzmoser
Good quality network connectivity is ever more important.
1 code implementation • 19 Jan 2022 • Wojciech Kusa, Allan Hanbury, Petr Knoth
In this work, we conduct a replicability study of the first two deep learning papers for citation screening and evaluate their performance on 23 publicly available datasets.
1 code implementation • 5 Jan 2022 • Sophia Althammer, Sebastian Hofstätter, Mete Sertkan, Suzan Verberne, Allan Hanbury
However, in the web domain we are in a setting with large amounts of training data and a query-to-passage or a query-to-document retrieval task.
2 code implementations • 2 Jan 2022 • Sebastian Hofstätter, Sophia Althammer, Mete Sertkan, Allan Hanbury
We present strong Transformer-based re-ranking and dense retrieval baselines for the recently released TripClick health ad-hoc retrieval collection.
1 code implementation • 11 Oct 2021 • Sebastian Hofstätter, Sophia Althammer, Mete Sertkan, Allan Hanbury
We describe our workflow to create an engaging remote learning experience for a university course, while minimizing the post-production time of the educators.
1 code implementation • 9 Aug 2021 • Sophia Althammer, Arian Askari, Suzan Verberne, Allan Hanbury
We address this challenge by combining lexical and dense retrieval methods at the paragraph level of the cases for first-stage retrieval.
1 code implementation • 10 Jun 2021 • Sophia Althammer, Mark Buckley, Sebastian Hofstätter, Allan Hanbury
Domain-specific contextualized language models have demonstrated substantial effectiveness gains for domain-specific downstream tasks, like similarity matching, entity recognition or information retrieval.
1 code implementation • 20 May 2021 • Sebastian Hofstätter, Bhaskar Mitra, Hamed Zamani, Nick Craswell, Allan Hanbury
An emerging recipe for achieving state-of-the-art effectiveness in neural document re-ranking involves utilizing large pre-trained language models (e.g., BERT) to evaluate all individual passages in the document and then aggregating the outputs by pooling or additional Transformer layers.
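The passage-then-aggregate recipe described above can be sketched as follows. This is a minimal illustration, not the authors' model: `overlap_score` is a hypothetical stand-in for a neural passage scorer such as BERT, the fixed-length non-overlapping passages are an assumption, and only the pooling variant of aggregation is shown.

```python
def overlap_score(query_terms, passage_terms):
    """Hypothetical stand-in for a neural passage scorer (e.g., BERT).

    Returns the fraction of passage terms that also occur in the query.
    """
    query = set(query_terms)
    if not passage_terms:
        return 0.0
    return sum(1 for term in passage_terms if term in query) / len(passage_terms)


def mean_pool(scores):
    """Average the passage scores (assumes at least one score)."""
    scores = list(scores)
    return sum(scores) / len(scores)


def score_document(query_terms, doc_terms, passage_len, pool=max):
    """Score every passage independently, then pool into one document score."""
    if passage_len <= 0:
        raise ValueError("passage_len must be positive")
    # Assumption: non-overlapping passages of a fixed number of terms.
    passages = [
        doc_terms[start:start + passage_len]
        for start in range(0, len(doc_terms), passage_len)
    ]
    if not passages:
        return 0.0
    return pool(overlap_score(query_terms, passage) for passage in passages)
```

Max pooling rewards a document for its single best passage, while mean pooling rewards documents that are relevant throughout; which is preferable depends on the collection.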
4 code implementations • 14 Apr 2021 • Sebastian Hofstätter, Sheng-Chieh Lin, Jheng-Hong Yang, Jimmy Lin, Allan Hanbury
A vital step towards the widespread adoption of neural retrieval models is their resource efficiency throughout the training, indexing and query workflows.
Ranked #15 on Zero-shot Text Search on BEIR
1 code implementation • 18 Jan 2021 • Sebastian Hofstätter, Aldo Lipani, Sophia Althammer, Markus Zlabinger, Allan Hanbury
In this work we analyze position bias on datasets, the contextualized representations, and their effect on retrieval results.
1 code implementation • 21 Dec 2020 • Sophia Althammer, Sebastian Hofstätter, Allan Hanbury
For reproducibility and transparency as well as to benefit the community we make our source code and the trained models publicly available.
2 code implementations • Findings of the Association for Computational Linguistics 2020 • Markus Zlabinger, Marta Sabou, Sebastian Hofstätter, Allan Hanbury
Obtaining such a corpus from crowdworkers, however, has been shown to be ineffective since (i) workers usually lack domain-specific expertise to conduct the task with sufficient quality, and (ii) the standard approach of annotating entire abstracts of trial reports as one task-instance (i.e., HIT) leads to an uneven distribution in task effort.
1 code implementation • 6 Oct 2020 • Sebastian Hofstätter, Sophia Althammer, Michael Schröder, Mete Sertkan, Allan Hanbury
Based on this finding, we propose a cross-architecture training procedure with a margin focused loss (Margin-MSE), that adapts knowledge distillation to the varying score output distributions of different BERT and non-BERT passage ranking architectures.
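A margin-focused distillation loss of this kind can be sketched as below, assuming it is computed as the mean squared error between the student's and the teacher's score margins (positive passage score minus negative passage score). The function and parameter names are illustrative, and plain Python lists stand in for tensors.

```python
def margin_mse(student_pos, student_neg, teacher_pos, teacher_neg):
    """Mean squared error between student and teacher score margins.

    Each argument is a list of scores, one entry per training triple
    (query, positive passage, negative passage).
    """
    n = len(student_pos)
    if n == 0 or not (n == len(student_neg) == len(teacher_pos) == len(teacher_neg)):
        raise ValueError("all score lists must be non-empty and equally long")
    total = 0.0
    for s_pos, s_neg, t_pos, t_neg in zip(
        student_pos, student_neg, teacher_pos, teacher_neg
    ):
        student_margin = s_pos - s_neg
        teacher_margin = t_pos - t_neg
        total += (student_margin - teacher_margin) ** 2
    return total / n
```

Because only the margins are compared, a student whose scores live on a different scale than the teacher's is not penalised as long as it separates positive and negative passages by the same amount.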
1 code implementation • 12 Aug 2020 • Sebastian Hofstätter, Markus Zlabinger, Mete Sertkan, Michael Schröder, Allan Hanbury
We extend the ranked retrieval annotations of the Deep Learning track of TREC 2019 with passage and word level graded relevance annotations for all relevant documents.
1 code implementation • 17 May 2020 • Markus Zlabinger, Marta Sabou, Sebastian Hofstätter, Mete Sertkan, Allan Hanbury
of 0.68 to experts in DEXA vs. 0.40 in CONTROL); (ii) as few as three DEXA annotations, aggregated by majority voting, reach substantial agreement with experts of 0.78/0.75/0.69 for P/I/O (in CONTROL 0.73/0.58/0.46).
1 code implementation • 11 May 2020 • Sebastian Hofstätter, Hamed Zamani, Bhaskar Mitra, Nick Craswell, Allan Hanbury
In this work, we propose a local self-attention which considers a moving window over the document terms and for each term attends only to other terms in the same window.
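The windowed attention idea can be illustrated with an attention mask. This sketch assumes a symmetric window centred on each term with a hypothetical `half_width` parameter; the paper's actual windowing scheme may differ.

```python
import math


def local_attention_mask(num_terms, half_width):
    """mask[i][j] is True when term i may attend to term j."""
    return [
        [abs(i - j) <= half_width for j in range(num_terms)]
        for i in range(num_terms)
    ]


def masked_softmax(scores, mask_row):
    """Softmax over one term's raw attention scores, restricted to its window."""
    kept = [score for score, keep in zip(scores, mask_row) if keep]
    peak = max(kept)  # subtract the maximum for numerical stability
    exps = [
        math.exp(score - peak) if keep else 0.0
        for score, keep in zip(scores, mask_row)
    ]
    total = sum(exps)
    return [value / total for value in exps]
```

With a fixed window, each term attends to at most `2 * half_width + 1` positions, so the attention cost grows linearly with document length rather than quadratically.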
1 code implementation • 4 Feb 2020 • Sebastian Hofstätter, Markus Zlabinger, Allan Hanbury
In addition, to gain insight into TK, we perform a clustered query analysis of TK's results, highlighting its strengths and weaknesses on queries with different types of information need and we show how to interpret the cause of ranking differences of two documents by comparing their internal scores.
no code implementations • 15 Jan 2020 • Markus Zlabinger, Sebastian Hofstätter, Navid Rekabsaz, Allan Hanbury
While existing disease-symptom relationship extraction methods are used as the foundation for various medical tasks, no collection is available to systematically evaluate the performance of such methods.
1 code implementation • 10 Dec 2019 • Sebastian Hofstätter, Markus Zlabinger, Allan Hanbury
In this paper we look beyond metrics-based evaluation of Information Retrieval systems, to explore the reasons behind ranking results.
1 code implementation • 3 Dec 2019 • Sebastian Hofstätter, Markus Zlabinger, Allan Hanbury
The usage of neural network models puts multiple objectives in conflict with each other: Ideally we would like to create a neural model that is effective, efficient, and interpretable at the same time.
1 code implementation • 30 Jul 2019 • Florian Kromp, Lukas Fischer, Eva Bozsaky, Inge Ambros, Wolfgang Doerr, Sabine Taschner-Mandl, Peter Ambros, Allan Hanbury
In this work, we aim to evaluate the performance of state-of-the-art deep learning architectures to segment nuclei in fluorescence images of various tissue origins and sample preparation types without post-processing.
no code implementations • 10 Jul 2019 • Sebastian Hofstätter, Allan Hanbury
Establishing a docker-based replicability infrastructure offers the community a great opportunity: measuring the run time of information retrieval systems.
1 code implementation • 29 Apr 2019 • Sebastian Hofstätter, Navid Rekabsaz, Carsten Eickhoff, Allan Hanbury
Low-frequency terms are a recurring challenge for information retrieval models; neural IR frameworks in particular struggle to adequately capture infrequently observed words.
no code implementations • 13 Dec 2018 • Navid Rekabsaz, Robert West, James Henderson, Allan Hanbury
The common approach to measuring such biases using a corpus is by calculating the similarities between the embedding vector of a word (like nurse) and the vectors of the representative words of the concepts of interest (such as genders).
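The similarity-based measurement described above can be sketched as a difference of mean cosine similarities. This is an illustrative reading of the common approach, not the method proposed in the paper; the function names and the two-group setup are assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_similarity(word_vec, concept_vecs):
    """Average similarity of a word to a concept's representative words."""
    return sum(cosine(word_vec, vec) for vec in concept_vecs) / len(concept_vecs)


def bias_score(word_vec, concept_a_vecs, concept_b_vecs):
    """Positive values mean the word lies closer to concept A than to concept B."""
    return mean_similarity(word_vec, concept_a_vecs) - mean_similarity(
        word_vec, concept_b_vecs
    )
```

In practice `word_vec` would be the embedding of a word such as nurse, and the two concept lists would hold the embeddings of representative words for each gender.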
no code implementations • 6 Jun 2018 • Lena Maier-Hein, Matthias Eisenmann, Annika Reinke, Sinan Onogur, Marko Stankovic, Patrick Scholz, Tal Arbel, Hrvoje Bogunovic, Andrew P. Bradley, Aaron Carass, Carolin Feldmann, Alejandro F. Frangi, Peter M. Full, Bram van Ginneken, Allan Hanbury, Katrin Honauer, Michal Kozubek, Bennett A. Landman, Keno März, Oskar Maier, Klaus Maier-Hein, Bjoern H. Menze, Henning Müller, Peter F. Neher, Wiro Niessen, Nasir Rajpoot, Gregory C. Sharp, Korsuk Sirinukunwattana, Stefanie Speidel, Christian Stock, Danail Stoyanov, Abdel Aziz Taha, Fons van der Sommen, Ching-Wei Wang, Marc-André Weber, Guoyan Zheng, Pierre Jannin, Annette Kopp-Schneider
International challenges have become the standard for validation of biomedical image analysis methods.
no code implementations • 16 Nov 2017 • Navid Rekabsaz, Mihai Lupu, Allan Hanbury, Andres Duque
We explore the use of unsupervised methods in Cross-Lingual Word Sense Disambiguation (CL-WSD) with the application of English to Persian.
no code implementations • 20 Jul 2017 • Navid Rekabsaz, Bhaskar Mitra, Mihai Lupu, Allan Hanbury
As an alternative, explicit word representations propose vectors whose dimensions are easily interpretable, and recent methods show competitive performance to the dense vectors.
no code implementations • 20 Jun 2016 • Navid Rekabsaz, Mihai Lupu, Allan Hanbury
Word embedding, especially with its recent developments, promises a quantification of the similarity between terms.
1 code implementation • LREC 2016 • Navid Rekabsaz, Serwah Sabetghadam, Mihai Lupu, Linda Andersson, Allan Hanbury
In this paper, we address the shortage of evaluation benchmarks for the Persian (Farsi) language by creating and making available a new benchmark for English to Persian Cross-Lingual Word Sense Disambiguation (CL-WSD).