A Novel Approach for Single Gene Selection Using Clustering and Dimensionality Reduction

10 Jun 2013  ·  E. N. Sathishkumar, K. Thangavel, T. Chandrasekhar

We extend the standard rough set-based approach to deal with a huge number of numeric attributes versus a small number of available objects. A novel approach that combines clustering with dimensionality reduction, the hybrid Fuzzy C Means-Quick Reduct (FCMQR) algorithm, is proposed for single gene selection. Gene selection is the process of selecting the most informative genes, and it is one of the important steps in knowledge discovery. The problem is that not all genes in gene expression data are important: some may be redundant, and others may be irrelevant or noisy. In this study, the entire dataset is divided into groups of similar genes by applying the Fuzzy C Means (FCM) algorithm. Highly class-discriminating genes are then selected based on their degree of dependence by applying the Quick Reduct algorithm, which is based on Rough Set Theory, to each of the resulting clusters. The Average Correlation Value (ACV) is calculated for these highly class-discriminating genes. Clusters whose ACV is 1 are deemed significant; their classification accuracy is equal to or higher than that of the entire dataset. The proposed algorithm is evaluated and compared using WEKA classifiers. Finally, experimental results on leukemia cancer data confirm that our approach is quite promising, though it requires further research.
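The Quick Reduct step mentioned in the abstract can be illustrated with a minimal sketch of the textbook greedy procedure: add, one at a time, the attribute (gene) that most increases the rough-set degree of dependence of the class label, until it matches the dependence obtained from all attributes. This is not the authors' implementation. It assumes expression values have already been discretized, it omits the FCM clustering and ACV steps, and the function names (`dependency`, `quick_reduct`) and the toy table are hypothetical.

```python
from collections import defaultdict


def dependency(rows, attrs, decision):
    """Degree of dependence: fraction of objects in the positive region."""
    # Group objects into equivalence classes by their values on `attrs`.
    classes = defaultdict(list)
    for row, label in zip(rows, decision):
        classes[tuple(row[a] for a in attrs)].append(label)
    # A class belongs to the positive region when all its labels agree.
    positive = sum(len(labels) for labels in classes.values()
                   if len(set(labels)) == 1)
    return positive / len(rows)


def quick_reduct(rows, decision):
    """Greedy Quick Reduct: returns a list of selected attribute indices."""
    n_attrs = len(rows[0])
    target = dependency(rows, range(n_attrs), decision)
    reduct = []
    best = dependency(rows, reduct, decision)
    while best < target:
        candidates = [a for a in range(n_attrs) if a not in reduct]
        scores = {a: dependency(rows, reduct + [a], decision)
                  for a in candidates}
        chosen = max(candidates, key=lambda a: scores[a])
        reduct.append(chosen)
        best = scores[chosen]
    return reduct


# Hypothetical toy table: rows are samples, columns are discretized genes.
rows = [
    [0, 1, 0],
    [0, 0, 1],
    [1, 1, 1],
    [1, 0, 1],
]
decision = ["ALL", "AML", "ALL", "AML"]

selected = quick_reduct(rows, decision)
print(selected)
```

In this toy table the gene at index 1 alone separates the two classes, so the greedy loop stops after selecting a single gene, which mirrors the single gene selection setting described above.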
