Search Results for author: Nikolaj Tatti

Found 25 papers, 2 papers with code

Skopus: Mining top-k sequential patterns under leverage

1 code implementation • 26 Jun 2015 • Francois Petitjean, Tao Li, Nikolaj Tatti, Geoffrey I. Webb

It combines (1) a novel definition of the expected support for a sequential pattern - a concept on which most interestingness measures directly rely - with (2) SkOPUS: a new branch-and-bound algorithm for the exact discovery of top-k sequential patterns under a given measure of interest.

A note on adjusting $R^2$ for using with cross-validation

no code implementations • 5 May 2016 • Indre Zliobaite, Nikolaj Tatti

We show how to adjust the coefficient of determination ($R^2$) when used for measuring predictive accuracy via leave-one-out cross-validation.

Balancing information exposure in social networks

no code implementations • NeurIPS 2017 • Kiran Garimella, Aristides Gionis, Nikos Parotsidis, Nikolaj Tatti

Our goal is to find two sets of nodes to employ in the respective campaigns, so that the overall information exposure for the two campaigns is balanced.

Strongly polynomial efficient approximation scheme for segmentation

no code implementations • 28 May 2018 • Nikolaj Tatti

In addition, we consider a cumulative version of Seg, where we are asked to discover the optimal segmentation for each prefix of the input sequence.

Segmentation

Boolean matrix factorization meets consecutive ones property

no code implementations • 17 Jan 2019 • Nikolaj Tatti, Pauli Miettinen

In this paper, we study a problem of Boolean matrix factorization where we additionally require that the factor matrices have the consecutive ones property (OBMF).

Efficient estimation of AUC in a sliding window

no code implementations • 2 Feb 2019 • Nikolaj Tatti

More specifically, we propose an algorithm that, given $\epsilon$, estimates AUC within $\epsilon / 2$, and can maintain this estimate in $O((\log k) / \epsilon)$ time, per update, as the window slides.

What is the dimension of your binary data?

no code implementations • 4 Feb 2019 • Nikolaj Tatti, Taneli Mielikainen, Aristides Gionis, Heikki Mannila

Defining the effective dimensionality of such a dataset is a nontrivial problem.

Clustering

The Long and the Short of It: Summarising Event Sequences with Serial Episodes

no code implementations • 7 Feb 2019 • Nikolaj Tatti, Jilles Vreeken

An ideal outcome of pattern mining is a small set of informative patterns, containing no redundancy or noise, that identifies the key structure of the data at hand.

Using Background Knowledge to Rank Itemsets

no code implementations • 8 Feb 2019 • Nikolaj Tatti, Michael Mampaey

We demonstrate that these statistics describe forms of data that occur in practice and have been studied in data mining.

Comparing Apples and Oranges: Measuring Differences between Exploratory Data Mining Results

no code implementations • 18 Feb 2019 • Nikolaj Tatti, Jilles Vreeken

Our approach provides a means to study and tell differences between results of different exploratory data mining methods.

Finding Robust Itemsets Under Subsampling

no code implementations • 18 Feb 2019 • Nikolaj Tatti, Fabian Moerchen, Toon Calders

We do this by measuring the robustness of a property of an itemset such as closedness or non-derivability.

Discovering Bands from Graphs

no code implementations • 9 Apr 2019 • Nikolaj Tatti

To this end, we formulate an optimization problem: given a graph and an integer $K$, we want to order graph vertices and partition the ordered adjacency matrix into $K$ bands such that bands closer to the diagonal are more dense.

Graph Mining

Mining Closed Strict Episodes

no code implementations • 14 Apr 2019 • Nikolaj Tatti, Boris Cule

Adapting existing approaches for discovering traditional patterns, such as closed itemsets, to episodes is not straightforward.

Discovering Episodes with Compact Minimal Windows

no code implementations • 15 Apr 2019 • Nikolaj Tatti

Discovering the most interesting patterns is the key problem in the field of pattern mining.

Mining Closed Episodes with Simultaneous Events

no code implementations • 16 Apr 2019 • Nikolaj Tatti, Boris Cule

Episodes are sequential patterns describing events that often occur in the vicinity of each other.

Maximum Entropy Based Significance of Itemsets

no code implementations • 24 Apr 2019 • Nikolaj Tatti

We say that the itemset is significant if we are surprised by its frequency when compared to the frequencies of its sub-itemsets.

Summarizing Data Succinctly with the Most Informative Itemsets

no code implementations • 25 Apr 2019 • Michael Mampaey, Jilles Vreeken, Nikolaj Tatti

As we use the Maximum Entropy principle to obtain unbiased probabilistic models, and only include those itemsets that are most informative with regard to the current model, the summaries we construct are guaranteed to be both descriptive and non-redundant.

Descriptive

Decomposable Families of Itemsets

no code implementations • 16 Jun 2020 • Nikolaj Tatti, Hannes Heikinheimo

Such itemset families define a probabilistic model for the data from which the original collection of itemsets has been derived.

Tell Me Something I Don't Know: Randomization Strategies for Iterative Data Mining

no code implementations • 16 Jun 2020 • Sami Hanhijärvi, Markus Ojala, Niko Vuokko, Kai Puolamäki, Nikolaj Tatti, Heikki Mannila

At each step in the data mining process, the randomization produces random samples from the set of data matrices satisfying the already discovered patterns or models.

Clustering

Approximation algorithms for confidence bands for time series

no code implementations • 12 Dec 2021 • Nikolaj Tatti

If we have a constraint $k$ for which we cannot find appropriate $\alpha$, we demonstrate a simple algorithm that yields $O(\sqrt{n})$ approximation guarantee by connecting the problem to a minimum $k$-union problem.

Time Series, Time Series Analysis

Maintaining AUC and $H$-measure over time

no code implementations • 12 Dec 2021 • Nikolaj Tatti

The first algorithm maintains area under the ROC curve (AUC) under addition and deletion of data points in $O(\log n)$ time.

Ranking with submodular functions on a budget

1 code implementation • 8 Apr 2022 • Guangyi Zhang, Nikolaj Tatti, Aristides Gionis

In more detail, given a set of items and a set of non-decreasing submodular functions, where each function is associated with a budget, we aim to find a ranking of the set of items that maximizes the sum of values achieved by all functions under the budget constraints.

Marketing

Recurrent segmentation meets block models in temporal networks

no code implementations • 19 May 2022 • Chamalee Wickrama Arachchi, Nikolaj Tatti

We extend this model to temporal networks by modelling the edges with a Poisson process.

Stochastic Block Model

Fast likelihood-based change point detection

no code implementations • 21 Jan 2023 • Nikolaj Tatti

We then show that for fixed Bernoulli parameters we can find the optimal change point in logarithmic time.

Change Detection, Change Point Detection

Jaccard-constrained dense subgraph discovery

no code implementations • 30 Aug 2023 • Chamalee Wickrama Arachchi, Nikolaj Tatti

Finding dense subgraphs is a core problem in graph mining with many applications in diverse domains.

Graph Mining
