An artificial neural network to find correlation patterns in an arbitrary number of variables

21 Jun 2016  ·  Alessandro Fontana ·

Methods to find correlation among variables are of interest to many disciplines, including statistics, machine learning, (big) data mining and neurosciences. Parameters that measure correlation between two variables are of limited utility when used with multiple variables. In this work, I propose a simple criterion to measure correlation among an arbitrary number of variables, based on a data set. The central idea is to i) design a function of the variables that can take different forms depending on a set of parameters, ii) calculate the difference between a statistics associated to the function computed on the data set and the same statistics computed on a randomised version of the data set, called "scrambled" data set, and iii) optimise the parameters to maximise this difference. Many such functions can be organised in layers, which can in turn be stacked one on top of the other, forming a neural network. The function parameters are searched with an enhanced genetic algortihm called POET and the resulting method is tested on a cancer gene data set. The method may have potential implications for some issues that affect the field of neural networks, such as overfitting, the need to process huge amounts of data for training and the presence of "adversarial examples".

PDF Abstract
No code implementations yet. Submit your code now

Tasks


Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here