Optimizing Clustering Models Using Principal Component Analysis for Car Customers
Agnes Riska Savira(1*)
(1) University of Buana Perjuangan Karawang, Indonesia
(*) Corresponding Author
Abstract
In a competitive business environment, companies use customer data strategically to achieve their goals, which requires a thorough understanding of customer traits, behaviors, and needs. Customer segmentation, an important strategy, groups individuals by shared characteristics. The K-Means algorithm is widely used to cluster customer data because it is easy to implement. However, high-dimensional data poses challenges for clustering, which motivates dimensionality reduction. Principal Component Analysis (PCA) is an effective method for reducing data dimensionality while minimizing information loss. Previous research reports that PCA improves analysis and clustering efficiency. This research contributes by integrating PCA into K-Means clustering to analyze customer segments at a car company. The resulting segments can help the company attract new customers, implement targeted marketing, understand customer-company relationships, and increase expected profitability. The data are first normalized, then reduced with PCA to 3 principal components that preserve 75% of the variance, and finally clustered with K-Means. Evaluation using the Elbow method and the Silhouette Score identified eight as the optimal number of clusters. The post-PCA K-Means model with this number of clusters achieves a Silhouette Score of 0.7789.
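The pipeline described in the abstract (normalization, PCA to 3 components, then K-Means evaluated with the Elbow method and the Silhouette Score) can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the author's implementation. The synthetic data from `make_blobs`, the choice of `StandardScaler`, the scanned range of k, and every variable name are assumptions, since the abstract does not specify the dataset, the scaler, or the search range. The numbers it prints will not match the 75% variance or the 0.7789 score reported above.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# ASSUMPTION: synthetic stand-in for the car-customer table, which the
# abstract does not provide (600 rows, 8 numeric features).
X, _ = make_blobs(n_samples=600, n_features=8, centers=5, random_state=0)

# Step 1: normalize the features (the scaler type is an assumption).
X_scaled = StandardScaler().fit_transform(X)

# Step 2: reduce to 3 principal components, as in the abstract.
pca = PCA(n_components=3, random_state=0)
X_pca = pca.fit_transform(X_scaled)
explained = pca.explained_variance_ratio_.sum()

# Step 3: fit K-Means for a range of k and record both criteria.
inertias = {}     # within-cluster sum of squares, plotted for the Elbow method
silhouettes = {}  # mean Silhouette Score per k
for k in range(2, 11):  # the range 2..10 is an assumption
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X_pca)
    inertias[k] = model.inertia_
    silhouettes[k] = silhouette_score(X_pca, model.labels_)

# Pick the k with the highest Silhouette Score.
best_k = max(silhouettes, key=silhouettes.get)

print(f"variance kept by 3 components: {explained:.2%}")
print(f"best k by silhouette: {best_k} (score {silhouettes[best_k]:.4f})")
```

In practice the `inertias` values would be plotted against k to look for the elbow, and the result compared with the silhouette-based choice before settling on a cluster count.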
DOI: https://doi.org/10.22146/ijccs.94744
Copyright (c) 2024 IJCCS (Indonesian Journal of Computing and Cybernetics Systems)
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.