Feature Selection for Gender Classification in TUIK Life Satisfaction Survey

12 Jul 2018  ·  Adil Çoban, Ilhan Tarımer ·

As known, attribute selection is a method that is used before the classification of data mining. In this study, a new data set has been created by using attributes expressing overall satisfaction in Turkey Statistical Institute (TSI) Life Satisfaction Survey dataset. Attributes are sorted by Ranking search method using attribute selection algorithms in a data mining application. These selected attributes were subjected to a classification test with Naive Bayes and Random Forest from machine learning algorithms. The feature selection algorithms are compared according to the number of attributes selected and the classification accuracy rates achievable with them. In this study, which is aimed at reducing the dataset volume, the best classification result comes up with 3 attributes selected by the Chi2 algorithm. The best classification rate was 73% with the Random Forest classification algorithm.

PDF Abstract

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here