no code implementations • 8 Jan 2025 • Petr Knoth, Laurent Romary, Patrice Lopez, Roberto Di Cosmo, Pavel Smrz, Tomasz Umerle, Melissa Harrison, Alain Monteil, Matteo Cancellieri, David Pride
A key issue hindering discoverability, attribution and reusability of open research software is that its existence often remains hidden within the manuscript of research papers.
no code implementations • 23 Dec 2024 • Martin Fajcik, Martin Docekal, Jan Dolezal, Karel Ondrej, Karel Beneš, Jan Kapsa, Pavel Smrz, Alexander Polok, Michal Hradis, Zuzana Neverilova, Ales Horak, Radoslav Sabol, Michal Stefanik, Adam Jirkovsky, David Adamczyk, Petr Hyner, Jan Hula, Hynek Kydlicek
Furthermore, we collect and clean BUT-Large Czech Collection, the largest publicly available clean Czech language corpus, and use it for (i) contamination analysis and (ii) continuous pretraining of the first Czech-centric 7B language model with Czech-specific tokenization.
2 code implementations • 3 May 2024 • Martin Docekal, Martin Fajcik, Pavel Smrz
We show that the estimated upper bound for extractive summarization increases by 217% in the ROUGE-2 score when using full content instead of abstracts.
1 code implementation • 8 Sep 2022 • Sergio Burdisso, Juan Zuluaga-Gomez, Esau Villatoro-Tello, Martin Fajcik, Muskaan Singh, Pavel Smrz, Petr Motlicek
In this paper, we describe our participation in Subtask 1 of CASE-2022, Event Causality Identification with Causal News Corpus.
1 code implementation • 8 Sep 2022 • Martin Fajcik, Muskaan Singh, Juan Zuluaga-Gomez, Esaú Villatoro-Tello, Sergio Burdisso, Petr Motlicek, Pavel Smrz
In this paper, we describe our shared task submissions for Subtask 2 in CASE-2022, Event Causality Identification with Causal News Corpus.
1 code implementation • 28 Jul 2022 • Martin Fajcik, Petr Motlicek, Pavel Smrz
We propose to disentangle the per-evidence relevance probability and its contribution to the final veracity probability in an interpretable way -- the final veracity probability is proportional to a linear ensemble of per-evidence relevance probabilities.
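As a rough illustration of the idea stated above, the sketch below sums per-evidence relevance probabilities for each veracity class and normalizes the result. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `veracity_probs`, the optional `weights` argument, and the input layout `relevance[i][c]` are all hypothetical.

```python
def veracity_probs(relevance, weights=None):
    """Combine per-evidence relevance probabilities into class probabilities.

    relevance[i][c] is the (assumed) relevance probability of evidence i
    for veracity class c. weights[i] is an optional, hypothetical
    per-evidence weight; uniform weights are used when it is omitted.
    """
    n_evidence = len(relevance)
    n_classes = len(relevance[0])
    if weights is None:
        weights = [1.0] * n_evidence

    # Linear ensemble: weighted sum of relevance probabilities per class.
    totals = [
        sum(weights[i] * relevance[i][c] for i in range(n_evidence))
        for c in range(n_classes)
    ]

    # "Proportional to" is read here as normalizing the sums to add up to 1.
    z = sum(totals)
    return [t / z for t in totals]
```

Because the combination is linear, each evidence item's share of a class score, `weights[i] * relevance[i][c] / z`, can be read off directly, which is what makes this kind of ensemble easy to inspect.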
1 code implementation • 11 May 2022 • Martin Docekal, Pavel Smrz
Transformer-based architectures in natural language processing force input size limits that can be problematic when long documents need to be processed.
1 code implementation • Findings (EMNLP) 2021 • Martin Fajcik, Martin Docekal, Karel Ondrej, Pavel Smrz
This work presents a novel four-stage open-domain QA pipeline R2-D2 (Rank twice, reaD twice).
Ranked #2 on Open-Domain Question Answering on Natural Questions
2 code implementations • 21 Feb 2021 • Martin Fajcik, Martin Docekal, Karel Ondrej, Pavel Smrz
This work presents a novel pipeline that demonstrates what is achievable with a combined effort of state-of-the-art approaches.
no code implementations • 1 Jan 2021 • Sewon Min, Jordan Boyd-Graber, Chris Alberti, Danqi Chen, Eunsol Choi, Michael Collins, Kelvin Guu, Hannaneh Hajishirzi, Kenton Lee, Jennimaria Palomaki, Colin Raffel, Adam Roberts, Tom Kwiatkowski, Patrick Lewis, Yuxiang Wu, Heinrich Küttler, Linqing Liu, Pasquale Minervini, Pontus Stenetorp, Sebastian Riedel, Sohee Yang, Minjoon Seo, Gautier Izacard, Fabio Petroni, Lucas Hosseini, Nicola De Cao, Edouard Grave, Ikuya Yamada, Sonse Shimaoka, Masatoshi Suzuki, Shumpei Miyawaki, Shun Sato, Ryo Takahashi, Jun Suzuki, Martin Fajcik, Martin Docekal, Karel Ondrej, Pavel Smrz, Hao Cheng, Yelong Shen, Xiaodong Liu, Pengcheng He, Weizhu Chen, Jianfeng Gao, Barlas Oguz, Xilun Chen, Vladimir Karpukhin, Stan Peshterliev, Dmytro Okhonko, Michael Schlichtkrull, Sonal Gupta, Yashar Mehdad, Wen-tau Yih
We review the EfficientQA competition from NeurIPS 2020.
1 code implementation • EMNLP (MRQA) 2021 • Martin Fajcik, Josef Jon, Pavel Smrz
Therefore we propose multiple approaches to modelling joint probability $P(a_s, a_e)$ directly.
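To make the distinction concrete, the sketch below contrasts the common independence assumption, $P(a_s)P(a_e)$, with one possible way of modelling $P(a_s, a_e)$ directly: a single softmax over all valid spans. It is a minimal sketch, not necessarily one of the approaches proposed in the paper; the function names and the `pair_scores[s][e]` layout are assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a flat list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def independent_span_probs(start_logits, end_logits):
    """P(a_s) * P(a_e) for every (start, end) pair, valid or not."""
    p_start = softmax(start_logits)
    p_end = softmax(end_logits)
    return {
        (s, e): p_start[s] * p_end[e]
        for s in range(len(start_logits))
        for e in range(len(end_logits))
    }


def joint_span_probs(pair_scores):
    """P(a_s, a_e) from one softmax over valid spans only (start <= end).

    pair_scores[s][e] is an assumed score for the span from token s to e.
    """
    n = len(pair_scores)
    spans = [(s, e) for s in range(n) for e in range(s, n)]
    probs = softmax([pair_scores[s][e] for s, e in spans])
    return dict(zip(spans, probs))
```

Under the independent factorization, some probability mass generally falls on spans whose end precedes their start, whereas the joint distribution in this sketch is defined only over valid spans.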
no code implementations • SEMEVAL 2020 • Martin Docekal, Martin Fajcik, Josef Jon, Pavel Smrz
This paper describes our system designed for humor evaluation within SemEval-2020 Task 7.
1 code implementation • SEMEVAL 2020 • Martin Fajcik, Josef Jon, Martin Docekal, Pavel Smrz
This paper describes BUT-FIT's submission at SemEval-2020 Task 5: Modelling Causal Reasoning in Language: Detecting Counterfactuals.
1 code implementation • SEMEVAL 2019 • Martin Fajcik, Lukáš Burget, Pavel Smrz
This paper describes our system submitted to SemEval 2019 Task 7: RumourEval 2019: Determining Rumour Veracity and Support for Rumours, Subtask A (Gorrell et al., 2019).
no code implementations • LREC 2016 • Lubomir Otrusina, Pavel Smrz
This paper introduces the Web TextFull linkage to Linked Open Data (WTF-LOD) dataset intended for large-scale evaluation of named entity recognition (NER) systems.
no code implementations • LREC 2014 • Pavel Smrz, Jan Kouril
This paper deals with information retrieval on semantically enriched web-scale document collections.