no code implementations • 25 Apr 2024 • Atsushi Miyauchi, Florian Adriaens, Francesco Bonchi, Nikolaj Tatti

In this paper, we establish Multilayer Correlation Clustering, a novel generalization of Correlation Clustering (Bansal et al., FOCS '02) to the multilayer setting.

no code implementations • 30 Aug 2023 • Chamalee Wickrama Arachchi, Nikolaj Tatti

Finding dense subgraphs is a core problem in graph mining with many applications in diverse domains.

no code implementations • 21 Jan 2023 • Nikolaj Tatti

We then show that for fixed Bernoulli parameters we can find the optimal change point in logarithmic time.

no code implementations • 19 May 2022 • Chamalee Wickrama Arachchi, Nikolaj Tatti

We extend this model to temporal networks by modelling the edges with a Poisson process.

1 code implementation • 8 Apr 2022 • Guangyi Zhang, Nikolaj Tatti, Aristides Gionis

In more detail, given a set of items and a set of non-decreasing submodular functions, where each function is associated with a budget, we aim to find a ranking of the set of items that maximizes the sum of values achieved by all functions under the budget constraints.
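
A minimal sketch of how such an objective might be evaluated, under assumptions that are not stated in the excerpt: each budget is taken to be the number of top-ranked items a function gets to see, and each function is a hypothetical coverage function (which is non-decreasing and submodular). The names `ITEM_TOPICS`, `FUNCTIONS`, `ranking_value`, and `best_ranking_brute_force` are illustrative only, and the brute-force search is just a reference for tiny inputs, not the paper's algorithm.

```python
from itertools import permutations

# Hypothetical data: the topics each item covers.
ITEM_TOPICS = {"a": {1, 3}, "b": {2}, "c": {1, 2}}

# Hypothetical functions: (wanted topics, budget). The budget is ASSUMED
# to be the length of the ranking prefix the function may use.
FUNCTIONS = [({1, 2}, 1), ({3}, 2)]


def coverage(wanted, items):
    """Coverage function: number of wanted topics covered by the items."""
    covered = set()
    for item in items:
        covered |= ITEM_TOPICS[item]
    return len(covered & wanted)


def ranking_value(ranking):
    """Sum over functions of the value achieved on the budgeted prefix."""
    return sum(coverage(wanted, ranking[:budget]) for wanted, budget in FUNCTIONS)


def best_ranking_brute_force():
    """Exhaustive search over all rankings; only feasible for tiny inputs."""
    return max(permutations(sorted(ITEM_TOPICS)), key=ranking_value)
```

Under these assumptions, placing item `c` first serves the function with budget 1, and placing `a` second serves the function with budget 2, which is why the order of the ranking matters and not only the set of items.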

no code implementations • 12 Dec 2021 • Nikolaj Tatti

If we have a constraint $k$ for which we cannot find an appropriate $\alpha$, we demonstrate a simple algorithm that yields an $O(\sqrt{n})$ approximation guarantee by connecting the problem to a minimum $k$-union problem.

no code implementations • 12 Dec 2021 • Nikolaj Tatti

The first algorithm maintains area under the ROC curve (AUC) under addition and deletion of data points in $O(\log n)$ time.
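
For reference, the quantity being maintained can be computed from scratch as follows. This is a static baseline that costs $O(n \log n)$ per call because of the sort, not the dynamic $O(\log n)$ algorithm from the paper; the function name `auc` and the convention that tied scores count as half a win are assumptions of this sketch.

```python
def auc(scores, labels):
    """Area under the ROC curve via pair counting (ties count as 1/2).

    scores: list of numeric scores, labels: list of 0/1 labels.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative")

    pairs = sorted(zip(scores, labels))
    wins = 0.0       # (positive, negative) pairs ranked correctly
    neg_below = 0    # negatives with a strictly smaller score
    i = 0
    while i < len(pairs):
        # Collect the group of points sharing the same score.
        j = i
        pos_here = neg_here = 0
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1] == 1:
                pos_here += 1
            else:
                neg_here += 1
            j += 1
        wins += pos_here * (neg_below + 0.5 * neg_here)
        neg_below += neg_here
        i = j
    return wins / (n_pos * n_neg)
```

Recomputing this after every insertion or deletion is what a dynamic data structure avoids.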

no code implementations • 16 Jun 2020 • Sami Hanhijärvi, Markus Ojala, Niko Vuokko, Kai Puolamäki, Nikolaj Tatti, Heikki Mannila

At each step in the data mining process, the randomization produces random samples from the set of data matrices satisfying the already discovered patterns or models.

no code implementations • 16 Jun 2020 • Nikolaj Tatti, Hannes Heikinheimo

Such itemset families define a probabilistic model for the data from which the original collection of itemsets has been derived.

no code implementations • 25 Apr 2019 • Michael Mampaey, Jilles Vreeken, Nikolaj Tatti

As we use the Maximum Entropy principle to obtain unbiased probabilistic models, and only include those itemsets that are most informative with regard to the current model, the summaries we construct are guaranteed to be both descriptive and non-redundant.

no code implementations • 24 Apr 2019 • Nikolaj Tatti

We say that the itemset is significant if we are surprised by its frequency when compared to the frequencies of its sub-itemsets.
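
As a rough illustration of comparing an observed frequency with what smaller itemsets predict, the sketch below uses the simplest possible baseline: an independence model built from single-item frequencies only. This is a swapped-in simplification, not the significance measure from the paper, and the names `frequency` and `independence_lift` are hypothetical.

```python
def frequency(transactions, itemset):
    """Fraction of transactions that contain every item of the itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def independence_lift(transactions, itemset):
    """Observed frequency divided by the frequency expected if the
    items occurred independently (ASSUMED baseline, singletons only)."""
    expected = 1.0
    for item in itemset:
        expected *= frequency(transactions, [item])
    if expected == 0:
        raise ValueError("some item never occurs; the ratio is undefined")
    return frequency(transactions, itemset) / expected
```

A ratio far from 1 suggests the itemset's frequency is not explained by its single items alone; a more faithful baseline would also account for the larger sub-itemsets.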

no code implementations • 16 Apr 2019 • Nikolaj Tatti, Boris Cule

Episodes are sequential patterns describing events that often occur in the vicinity of each other.

no code implementations • 15 Apr 2019 • Nikolaj Tatti

Discovering the most interesting patterns is the key problem in the field of pattern mining.

no code implementations • 14 Apr 2019 • Nikolaj Tatti, Boris Cule

Adapting existing approaches for discovering traditional patterns, such as closed itemsets, to episodes is not straightforward.

no code implementations • 9 Apr 2019 • Nikolaj Tatti

To this end, we formulate an optimization problem: given a graph and an integer $K$, we want to order graph vertices and partition the ordered adjacency matrix into $K$ bands such that bands closer to the diagonal are more dense.

no code implementations • 18 Feb 2019 • Nikolaj Tatti, Jilles Vreeken

Our approach provides a means to study and tell differences between results of different exploratory data mining methods.

no code implementations • 18 Feb 2019 • Nikolaj Tatti, Fabian Moerchen, Toon Calders

We do this by measuring the robustness of a property of an itemset such as closedness or non-derivability.

no code implementations • 8 Feb 2019 • Nikolaj Tatti, Michael Mampaey

We demonstrate that these statistics describe forms of data that occur in practice and have been studied in data mining.

no code implementations • 7 Feb 2019 • Nikolaj Tatti, Jilles Vreeken

An ideal outcome of pattern mining is a small set of informative patterns, containing no redundancy or noise, that identifies the key structure of the data at hand.

no code implementations • 4 Feb 2019 • Nikolaj Tatti, Taneli Mielikainen, Aristides Gionis, Heikki Mannila

Defining the effective dimensionality of such a dataset is a nontrivial problem.

no code implementations • 2 Feb 2019 • Nikolaj Tatti

More specifically, we propose an algorithm that, given $\epsilon$, estimates AUC within $\epsilon / 2$, and can maintain this estimate in $O((\log k) / \epsilon)$ time, per update, as the window slides.

no code implementations • 17 Jan 2019 • Nikolaj Tatti, Pauli Miettinen

In this paper, we study a problem of Boolean matrix factorization where we additionally require that the factor matrices have the consecutive ones property (OBMF).

no code implementations • 28 May 2018 • Nikolaj Tatti

In addition, we consider a cumulative version of Seg, where we are asked to discover the optimal segmentation for each prefix of the input sequence.

no code implementations • NeurIPS 2017 • Kiran Garimella, Aristides Gionis, Nikos Parotsidis, Nikolaj Tatti

Our goal is to find two sets of nodes to employ in the respective campaigns, so that the overall information exposure for the two campaigns is balanced.

no code implementations • 5 May 2016 • Indre Zliobaite, Nikolaj Tatti

We show how to adjust the coefficient of determination ($R^2$) when used for measuring predictive accuracy via leave-one-out cross-validation.
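
A small sketch of why an adjustment is needed. The leave-one-out prediction of the mean baseline for point $i$ is the mean of the other $n-1$ points, so its residual is $\frac{n}{n-1}(y_i - \bar{y})$ and the naive $R^2$ of that baseline is $1 - \left(\frac{n}{n-1}\right)^2 < 0$ rather than 0. The second function below compares a model against the leave-one-out baseline instead; that is one plausible correction assumed here for illustration, not necessarily the formula derived in the paper. All function names are hypothetical.

```python
def loo_mean_predictions(y):
    """Leave-one-out prediction of the mean baseline for each point."""
    n = len(y)
    total = sum(y)
    return [(total - yi) / (n - 1) for yi in y]


def naive_r2(y, preds):
    """Standard R^2: residuals compared against the full-sample mean."""
    mean = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, preds))
    ss_tot = sum((yi - mean) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def loo_baseline_r2(y, preds):
    """R^2 against the leave-one-out mean baseline (ASSUMED correction)."""
    base = loo_mean_predictions(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, preds))
    ss_base = sum((yi - bi) ** 2 for yi, bi in zip(y, base))
    return 1 - ss_res / ss_base
```

With the second definition, the leave-one-out mean predictor scores exactly 0, which matches the usual reading of $R^2$ as improvement over predicting the mean.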

1 code implementation • 26 Jun 2015 • Francois Petitjean, Tao Li, Nikolaj Tatti, Geoffrey I. Webb

It combines (1) a novel definition of the expected support for a sequential pattern - a concept on which most interestingness measures directly rely - with (2) SkOPUS: a new branch-and-bound algorithm for the exact discovery of top-k sequential patterns under a given measure of interest.
