Genetic Programming has been very successful in solving a wide range of
problems, but its use as a machine learning algorithm has been limited so far.
One of the reasons is the problem of overfitting, which cannot be solved or
suppressed as easily as in more traditional approaches...
Another problem, closely
related to overfitting, is the selection of the final model from the
population. In this article we present our research that addresses both problems:
overfitting and model selection. We compare several ways of dealing with
overfitting, based on the Random Sampling Technique (RST) and on using a validation
set, all with an emphasis on model selection. We test each approach
thoroughly on artificial and real-world datasets and compare it with
the standard approach, which uses the full training data, as a baseline.