The Influence of Optimization of the k-Means Algorithm with Genetic Algorithm on the Results of High Dimension Data Clustering

Yulinda Ramadhana; Muhammad Ihsan Jambak

doi:10.33022/ijcs.v13i1.3634

Authors

Yulinda Ramadhana, Universitas Sriwijaya
Muhammad Ihsan Jambak, Universitas Sriwijaya

DOI:

https://doi.org/10.33022/ijcs.v13i1.3634

Keywords:

Dimensionality Reduction, Feature Selection, Singular Value Decomposition, k-Means, Genetic Algorithm

Abstract

k-means clustering begins by choosing the initial centroids at random. Randomly generated initial centroids often cause k-means to become trapped in a local optimum, which results in poor clustering quality. This study therefore examines the effect of using a genetic algorithm to determine the initial centroids for k-means. k-means clustering with random initial centroids and k-means clustering with initial centroids computed by a genetic algorithm were each tested on the data both with and without dimensionality reduction. With initial centroids obtained from the genetic algorithm, the quality of the clustering results increased by 54.9% on the high-dimensional data and by 52.4% on the dimension-reduced data. These results show that k-means clustering with initial centroids obtained from a genetic algorithm produced the best clustering solutions, with a significant improvement over random initialization.
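The idea of seeding k-means from a genetic algorithm can be illustrated with a minimal sketch. The abstract does not state the chromosome encoding, fitness function, or genetic operators used in the study, so everything below is an assumption made for illustration: a chromosome is a list of k data-point indices, fitness is the sum of squared errors (SSE) to the nearest centroid, selection keeps the better half of the population, and crossover is single-point. The function names (`ga_initial_centroids`, `kmeans`, `sse`) and the toy data are hypothetical.

```python
import random

def sq_dist(a, b):
    # Squared Euclidean distance between two points.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def sse(points, centroids):
    # Sum of squared distances from each point to its nearest centroid.
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)

def kmeans(points, centroids, iters=50):
    # Plain Lloyd iterations starting from the given centroids.
    centroids = [list(c) for c in centroids]
    for _ in range(iters):
        clusters = [[] for _ in centroids]
        for p in points:
            j = min(range(len(centroids)), key=lambda i: sq_dist(p, centroids[i]))
            clusters[j].append(p)
        # An empty cluster keeps its previous centroid.
        new = [
            [sum(col) / len(cl) for col in zip(*cl)] if cl else c
            for cl, c in zip(clusters, centroids)
        ]
        if new == centroids:
            break
        centroids = new
    return centroids

def ga_initial_centroids(points, k, pop_size=20, generations=30,
                         mutation_rate=0.1, seed=0):
    # Assumed encoding: a chromosome is a list of k distinct point indices.
    rng = random.Random(seed)
    n = len(points)
    pop = [rng.sample(range(n), k) for _ in range(pop_size)]

    def fitness(chrom):
        # Assumed fitness: lower SSE is better.
        return sse(points, [points[i] for i in chrom])

    for _ in range(generations):
        pop.sort(key=fitness)
        parents = pop[: pop_size // 2]  # keep the better half (elitist)
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, k) if k > 1 else 0  # single-point crossover
            child = a[:cut] + b[cut:]
            # Mutation: replace a gene with a random point index.
            child = [rng.randrange(n) if rng.random() < mutation_rate else g
                     for g in child]
            if len(set(child)) < k:  # repair duplicates by resampling
                child = rng.sample(range(n), k)
            children.append(child)
        pop = parents + children
    best = min(pop, key=fitness)
    return [list(points[i]) for i in best]

# Toy data (hypothetical): three Gaussian blobs in two dimensions.
data_rng = random.Random(1)
points = [[cx + data_rng.gauss(0, 0.5), cy + data_rng.gauss(0, 0.5)]
          for cx, cy in [(0, 0), (5, 5), (0, 5)] for _ in range(20)]

init = ga_initial_centroids(points, k=3)
final = kmeans(points, init)
print("SSE at GA-selected centroids:", sse(points, init))
print("SSE after k-means refinement:", sse(points, final))
```

The genetic algorithm here only chooses the starting centroids; k-means then refines them as usual. To mirror the comparison described in the abstract, the same `kmeans` function could be run from `random.sample(points, 3)` and the two SSE values compared, although the quality measure used in the study may differ from SSE.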