Feature selection is an effective way to reduce computational complexity and information redundancy in the classification and clustering of high-dimensional data. Selecting the best features is much harder in unsupervised learning than in supervised learning, because there are no data labels to guide the selection algorithm in removing irrelevant and redundant features. In this paper, we propose a new approach to unsupervised feature selection that uses a genetic algorithm as a heuristic search method and combines it with the fuzzy C-means algorithm. We propose a dual, multi-objective fitness function based on the Davies-Bouldin (DB) and Calinski-Harabasz (CH) indices. We show that these indices do not necessarily behave similarly. Thus, rather than simply taking their weighted average as a new fitness function, we propose a new way to aggregate them based on their tradeoffs. A comparison with popular feature selection algorithms across different datasets indicates that the proposed approach outperforms them.
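To make the two objectives concrete, the following is a minimal sketch of how a candidate feature subset could be scored by both indices. It rests on several assumptions that are not taken from the paper: the cluster labels are passed in directly (in the proposed approach they would come from running fuzzy C-means on the selected features and hardening the memberships), the function names are hypothetical, and Pareto dominance is used only as a stand-in for a tradeoff-aware comparison, since the abstract does not state the actual aggregation rule.

```python
import math

def project(X, mask):
    """Keep only the columns whose mask bit is 1 (at least one bit must be set)."""
    return [[v for v, keep in zip(row, mask) if keep] for row in X]

def centroid(points):
    n = len(points)
    return [sum(col) / n for col in zip(*points)]

def groups(X, labels):
    """Group rows by their (hard) cluster label."""
    out = {}
    for row, lab in zip(X, labels):
        out.setdefault(lab, []).append(row)
    return list(out.values())

def davies_bouldin(X, labels):
    """DB index: lower is better. Assumes >= 2 clusters with distinct centroids."""
    clusters = groups(X, labels)
    cents = [centroid(c) for c in clusters]
    scatter = [sum(math.dist(p, m) for p in c) / len(c)
               for c, m in zip(clusters, cents)]
    k = len(clusters)
    worst = [max((scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
                 for j in range(k) if j != i)
             for i in range(k)]
    return sum(worst) / k

def calinski_harabasz(X, labels):
    """CH index: higher is better. Assumes nonzero within-cluster dispersion."""
    clusters = groups(X, labels)
    n, k = len(X), len(clusters)
    overall = centroid(X)
    between = sum(len(c) * math.dist(centroid(c), overall) ** 2 for c in clusters)
    within = sum(math.dist(p, centroid(c)) ** 2 for c in clusters for p in c)
    return (between / (k - 1)) / (within / (n - k))

def fitness(X, labels, mask):
    """Return the (DB, CH) pair for the feature subset encoded by mask."""
    Z = project(X, mask)
    return davies_bouldin(Z, labels), calinski_harabasz(Z, labels)

def dominates(a, b):
    """Pareto dominance on (DB, CH): a is no worse on both and better on one.
    This is an illustrative stand-in, not the paper's aggregation rule."""
    no_worse = a[0] <= b[0] and a[1] >= b[1]
    better = a[0] < b[0] or a[1] > b[1]
    return no_worse and better

# Toy data: feature 0 separates the two groups, feature 1 is noise.
X = [[0.0, 5.0], [0.2, 1.0], [0.1, 9.0],
     [5.0, 4.0], [5.2, 8.0], [5.1, 2.0]]
labels = [0, 0, 0, 1, 1, 1]  # assumed given; stands in for hardened FCM output

f_informative = fitness(X, labels, [1, 0])
f_all = fitness(X, labels, [1, 1])
print(f_informative, f_all, dominates(f_informative, f_all))
```

In a genetic algorithm, each chromosome would be a bit mask like `[1, 0]`, and a comparison such as `dominates` could drive selection. Keeping the two scores as a pair, instead of collapsing them into a weighted average, preserves the cases where DB and CH disagree.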
Amiri Souri, Elmira, Azadeh Mohebi, and Abbas Ahmadi. 2017. Genetic algorithm and fuzzy C-means for feature selection: Based on a dual fitness function. Article presented at the International Symposium on Artificial Intelligence and Signal Processing (AISP), Shiraz.