1 code implementation • EMNLP (Eval4NLP) 2020 • Jesper Brink Andersen, Mikkel Bak Bertelsen, Mikkel Hørby Schou, Manuel R. Ciosici, Ira Assent
The data set is expanded to contain semantic and syntactic tests and is multilingual (English, German, and Italian).
no code implementations • 11 Oct 2024 • Rafael Pablos Sarabia, Joachim Nyborg, Morten Birk, Jeppe Liborius Sjørup, Anders Lillevang Vesterholt, Ira Assent
Precipitation nowcasting is crucial across various industries and plays a significant role in mitigating and adapting to climate change.
1 code implementation • 31 Jul 2024 • Anna Beer, Martin Heinrigs, Claudia Plant, Ira Assent
We introduce MOSCITO (MOlecular Dynamics Subspace Clustering with Temporal Observance), a subspace clustering method for molecular dynamics data.
no code implementations • 25 Jun 2024 • Amalie Brogaard Pauli, Isabelle Augenstein, Ira Assent
Unlike prior work, which focuses on particular domains or types of persuasion, we conduct a general study across various domains to measure and benchmark the degree to which LLMs produce persuasive language, both when explicitly instructed to rewrite text to be more or less persuasive and when only instructed to paraphrase.
1 code implementation • 12 Jun 2024 • Saeedeh Javadi, Atefeh Moradan, Mohammad Sorkhpar, Klim Zaporojets, Davide Mottin, Ira Assent
Additionally, WikES features a dataset generator to test entity summarization algorithms in different areas of the knowledge graph.
1 code implementation • 30 Nov 2023 • Rafael Pablos Sarabia, Joachim Nyborg, Morten Birk, Ira Assent
This paper presents a solution to the Weather4Cast 2023 competition, where the goal is to forecast high-resolution precipitation with an 8-hour lead time using lower-resolution satellite radiance images.
1 code implementation • 12 May 2023 • Andrew Draganov, Jakob Rødsgaard Jørgensen, Katrine Scheel Nellemann, Davide Mottin, Ira Assent, Tyrus Berry, Cigdem Aslay
tSNE and UMAP are popular dimensionality reduction algorithms due to their speed and interpretable low-dimensional embeddings.
1 code implementation • 20 Jun 2022 • Andrew Draganov, Tyrus Berry, Jakob Rødsgaard Jørgensen, Katrine Scheel Nellemann, Ira Assent, Davide Mottin
In this work, we show that this is indeed possible by combining the two approaches into a single method.
1 code implementation • 17 Mar 2022 • Joachim Nyborg, Charlotte Pelletier, Ira Assent
Unlike previous positional encoding based on calendar time (e.g., day-of-year), TPE is based on thermal time, which is obtained by accumulating daily average temperatures over the growing season.
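The accumulation of thermal time described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function name `thermal_time`, the `base_temp` parameter, and the clipping of negative contributions to zero are assumptions borrowed from the common growing-degree-day convention.

```python
from itertools import accumulate


def thermal_time(daily_avg_temps, base_temp=0.0):
    """Return the running sum of daily average temperatures.

    Assumption (not from the source): only the part of each day's
    temperature above `base_temp` contributes, and colder days add zero.
    """
    daily_contrib = [max(t - base_temp, 0.0) for t in daily_avg_temps]
    return list(accumulate(daily_contrib))


# Hypothetical daily average temperatures (degrees Celsius) for four days.
temps = [4.0, 6.0, -1.0, 10.0]
print(thermal_time(temps))  # [4.0, 10.0, 10.0, 20.0]
```

Each position in the resulting sequence could then stand in for the day-of-year index when computing a positional encoding, so that two regions at the same growth stage receive similar positions even if their calendar dates differ.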
1 code implementation • 23 Nov 2021 • Joachim Nyborg, Ira Assent
Convolutional neural networks (CNNs) have greatly advanced the state-of-the-art in the detection of clouds in satellite images, but existing CNN-based methods are costly as they require large amounts of training images with expensive pixel-level cloud labels.
1 code implementation • 4 Nov 2021 • Joachim Nyborg, Charlotte Pelletier, Sébastien Lefèvre, Ira Assent
However, when applied to target regions spatially different from the training region, these models perform poorly without any target labels due to the temporal shift of crop phenology between regions.
1 code implementation • 30 Oct 2021 • Frederik Hvilshøj, Alexandros Iosifidis, Ira Assent
As counterfactual examples become increasingly popular for explaining decisions of deep learning models, it is essential to understand what properties quantitative evaluation metrics capture and, equally important, what they do not capture.
no code implementations • 3 May 2021 • Simon Enni, Ira Assent
The influence of machine learning (ML) is quickly spreading, and a number of recent technological innovations have applied ML as a central technology.
1 code implementation • 25 Mar 2021 • Frederik Hvilshøj, Alexandros Iosifidis, Ira Assent
Counterfactual examples identify how inputs can be altered to change the predicted class of a classifier, thus opening up the black-box nature of, e.g., deep neural networks.
1 code implementation • EACL 2021 • Mads Toftrup, Søren Asger Sørensen, Manuel R. Ciosici, Ira Assent
Language Identification is the task of identifying a document's language.
Ranked #1 on Language Identification on OpenSubtitles
1 code implementation • LREC 2020 • Manuel R. Ciosici, Ira Assent, Leon Derczynski
We present efficient implementations of Brown clustering and the alternative Exchange clustering as well as a number of methods to accelerate the computation of both hierarchical and flat clusters.
no code implementations • LREC 2020 • Jan Neerbek, Morten Eskildsen, Peter Dolog, Ira Assent
In this work we present a corpus for the evaluation of sensitive information detection approaches that addresses the need for real world sensitive information for empirical studies.
no code implementations • 4 Dec 2019 • Holger Trittenbach, Klemens Böhm, Ira Assent
Existing methods to estimate hyperparameter values are purely heuristic, and the conditions under which they work well are unclear.
no code implementations • NAACL 2019 • Manuel R. Ciosici, Ira Assent
We present Abbreviation Explorer, a system that supports interactive exploration of abbreviations that are challenging for Unsupervised Abbreviation Disambiguation (UAD).
no code implementations • NAACL 2019 • Manuel R. Ciosici, Leon Derczynski, Ira Assent
We show that increases in Average Mutual Information, the clustering algorithms' optimization goal, are highly correlated with improvements in encoding of morphosyntactic information.
no code implementations • 1 Apr 2019 • Manuel Ciosici, Tobias Sommer, Ira Assent
In this paper, we present an entirely unsupervised abbreviation disambiguation method (called UAD) that picks up abbreviation definitions from unstructured text.
no code implementations • COLING 2018 • Manuel R. Ciosici, Ira Assent
Abbreviations and acronyms are a part of textual communication in most domains.
no code implementations • 29 Jul 2015 • Barbora Micenková, Brian McWilliams, Ira Assent
We demonstrate the good performance of BORE compared to a variety of competing methods in the non-budgeted and the budgeted outlier detection problem on 12 real-world datasets.