Designing molecular structures with desired chemical properties is an essential task in drug discovery and material design.
This kernel leads to the remarkable finding that only the number of leaves at each depth is relevant for the tree architecture in ensemble learning with an infinite number of trees.
We propose a fast non-gradient-based method of rank-1 non-negative matrix factorization (NMF) for missing data, called A1GM, that minimizes the KL divergence from an input matrix to the reconstructed rank-1 matrix.
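For a fully observed non-negative matrix, the rank-1 factorization minimizing the generalized KL divergence has a well-known closed form: the outer product of the row sums and column sums divided by the grand total. A minimal sketch of this complete-data baseline (the missing-data handling that A1GM adds is not shown):

```python
import numpy as np

def rank1_kl(X):
    """Best rank-1 approximation of a non-negative matrix X
    under the generalized KL divergence (complete data only)."""
    # Outer product of row and column sums, normalized by the total.
    return np.outer(X.sum(axis=1), X.sum(axis=0)) / X.sum()

X = np.array([[1.0, 2.0], [3.0, 4.0]])
R = rank1_kl(X)
# R has rank 1 and preserves the row and column sums of X.
```

The closed form requires no iteration, which is what makes a non-gradient-based approach possible in the first place.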
We present the Additive Poisson Process (APP), a novel framework that can model the higher-order interaction effects of the intensity functions in Poisson processes using projections into lower-dimensional space.
A multiplicative constant scaling factor is often applied to the model output to adjust the dynamics of neural network parameters.
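To see why such a scaling factor changes training dynamics, note that multiplying the model output by a constant alpha rescales the gradient of the loss with respect to every parameter via the chain rule. A toy sketch with a one-parameter linear model and squared loss (both illustrative choices, not from the original):

```python
import numpy as np

def grad_wrt_w(alpha, w, x, y):
    """Gradient of the squared loss 0.5*(alpha*w*x - y)**2 w.r.t. w."""
    return alpha * x * (alpha * w * x - y)

w, x, y = 0.5, 2.0, 0.0
g1 = grad_wrt_w(1.0, w, x, y)  # alpha = 1
g2 = grad_wrt_w(2.0, w, x, y)  # alpha = 2
# With y = 0 the gradient scales as alpha**2: one factor of alpha
# from the residual and one from the chain rule.
```

Hence the scaling factor acts like a per-layer adjustment of the effective learning rate.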
We present a lightweight variant of Boltzmann machines obtained via sample space truncation, called the truncated Boltzmann machine (TBM), which has not been investigated before but arises naturally from the log-linear model viewpoint.
We present a novel perspective on deep learning architectures using a partial order structure, which is naturally incorporated into the information geometric formulation of the log-linear model.
Learning of the model is achieved via convex optimization, thanks to the dually flat statistical manifold generated by the log-linear model.
We propose an efficient matrix rank reduction method for non-negative matrices, whose time complexity is quadratic in the number of rows or columns of a matrix.
The appearance of the double-descent risk phenomenon has received growing interest in the machine learning and statistics community, as it challenges well-understood notions behind the U-shaped train-test curves.
We present a novel blind source separation (BSS) method, called information geometric blind source separation (IGBSS).
However, it is well known that increasing the number of parameters also increases the complexity of the model, which leads to a bias-variance trade-off.
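The trade-off is easy to reproduce with polynomial regression: raising the degree drives the training error down monotonically, while the test error typically bottoms out at a moderate degree. A hedged sketch with a fixed seed (the sine target, noise level, and degrees are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
x_tr = np.linspace(0.0, 1.0, 20)
x_te = np.linspace(0.025, 0.975, 20)
y_tr = np.sin(2 * np.pi * x_tr) + rng.normal(0, 0.3, x_tr.size)
y_te = np.sin(2 * np.pi * x_te) + rng.normal(0, 0.3, x_te.size)

def errors(deg):
    """Train/test mean squared error of a degree-`deg` polynomial fit."""
    coef = np.polyfit(x_tr, y_tr, deg)
    mse = lambda x, y: np.mean((np.polyval(coef, x) - y) ** 2)
    return mse(x_tr, y_tr), mse(x_te, y_te)

train_err = {d: errors(d)[0] for d in (1, 3, 9)}
test_err = {d: errors(d)[1] for d in (1, 3, 9)}
# Training error can only decrease as the degree grows (nested models),
# while test error usually favors an intermediate degree.
```

The nesting of the polynomial families is what guarantees the monotone drop in training error; the test error behavior is the variance side of the trade-off.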
We present a novel nonnegative tensor decomposition method, called Legendre decomposition, which factorizes an input tensor into a multiplicative combination of parameters.
The search for higher-order feature interactions that are statistically significantly associated with a class variable is of high relevance in fields such as genetics and healthcare, but the combinatorial explosion of the candidate space makes this problem extremely challenging in terms of both computational efficiency and proper correction for multiple testing.
To theoretically prove the correctness of the algorithm, we model tensors as probability distributions in a statistical manifold and realize tensor balancing as projection onto a submanifold.
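The matrix case of this projection is classical Sinkhorn balancing: alternately normalizing the rows and columns of a positive matrix converges to a doubly stochastic matrix. A minimal sketch of that special case (the tensor generalization and the manifold formulation are not shown):

```python
import numpy as np

def sinkhorn_balance(A, iters=200):
    """Alternately normalize rows and columns of a positive matrix
    until all row sums and column sums converge to 1."""
    A = A.astype(float).copy()
    for _ in range(iters):
        A /= A.sum(axis=1, keepdims=True)  # row normalization
        A /= A.sum(axis=0, keepdims=True)  # column normalization
    return A

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = sinkhorn_balance(A)
# B is (numerically) doubly stochastic.
```

Viewing each normalization step as an e-projection onto a submanifold is what allows a convergence proof in the information geometric framework.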
The main obstacle of this problem is the difficulty of taking into account the selection bias, i.e., the bias arising from the fact that patterns are selected from an extremely large number of candidates in databases.
Westfall-Young light opens the door to significant pattern mining on large datasets that previously led to prohibitive runtime or memory costs.
Finding statistically significant interactions between binary variables is computationally and statistically challenging in high-dimensional settings, due to the combinatorial explosion in the number of hypotheses.
An open question, however, is whether this strategy of excluding untestable hypotheses also leads to greater statistical power in subgraph mining, in which the number of hypotheses is much larger than in itemset mining.
Distance-based approaches to outlier detection are popular in data mining, as they do not require modeling the underlying probability distribution, which is particularly challenging for high-dimensional data.
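A common instance of this approach scores each point by the distance to its k-th nearest neighbor, with no density model at all. A minimal NumPy sketch (the value of k and the toy data are illustrative):

```python
import numpy as np

def knn_outlier_scores(X, k=2):
    """Score each point by the distance to its k-th nearest neighbor."""
    # Pairwise Euclidean distance matrix via broadcasting.
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # Column 0 of each sorted row is the zero self-distance,
    # so column k holds the k-th nearest neighbor distance.
    return np.sort(D, axis=1)[:, k]

X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [0.1, 0.1], [5.0, 5.0]])
scores = knn_outlier_scores(X)
# The isolated point (5, 5) receives by far the largest score.
```

Points far from all neighbors get large scores regardless of the shape of the data distribution, which is the appeal in high dimensions.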
We present SConES, a new efficient method to discover sets of genetic loci that are maximally associated with a phenotype, while being connected in an underlying network.