Comparison of the Performance of Random Forest and K-Nearest Neighbor in Classifying Leukemia Using Principal Component Analysis
DOI:
https://doi.org/10.32736/sisfokom.v13i2.2165
Keywords:
Leukemia, Microarray Data, PCA, Random Forest, KNN
Abstract
Leukemia is the most common blood cancer in Asia, including Indonesia. It can affect blood cells, bone marrow, lymph nodes, and other parts of the lymphatic system. One way to detect leukemia is to analyze gene expression data obtained with microarray technology. Microarray data contain a very large number of genes, so the number of genes must be reduced to eliminate irrelevant features and improve classification accuracy. In this study, feature (gene) reduction was carried out using Principal Component Analysis (PCA), and classification was carried out using Random Forest (RF) and K-Nearest Neighbor (KNN). RF with 100 n_estimators reached an accuracy of 78.57%. KNN reached 78.57% with K=1, 85.71% with K=3 and K=5, and 71.42% with K=7 and K=9. The best accuracy was therefore obtained by KNN with K=3 and K=5.
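The workflow described in the abstract (PCA for gene reduction, followed by RF with 100 estimators and KNN with several values of K) can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes scikit-learn, which the abstract's use of the parameter name n_estimators suggests but does not confirm. The synthetic dataset, the 80/20 split, the standardization step, and the choice of 10 principal components are all assumptions made for the example, so the accuracies it prints will not match those reported above.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a microarray matrix: few samples, many "genes".
# All sizes here are assumptions for illustration, not the paper's data.
X, y = make_classification(
    n_samples=60, n_features=2000, n_informative=20,
    n_classes=2, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)


def evaluate(classifier, n_components=10):
    """Fit scaler -> PCA -> classifier on the training split; return test accuracy."""
    model = make_pipeline(
        StandardScaler(),
        PCA(n_components=n_components, random_state=0),
        classifier,
    )
    model.fit(X_train, y_train)
    return model.score(X_test, y_test)


# Random Forest with 100 trees, as stated in the abstract.
results = {
    "RF (n_estimators=100)": evaluate(
        RandomForestClassifier(n_estimators=100, random_state=0)
    )
}

# KNN with the odd K values listed in the abstract.
for k in (1, 3, 5, 7, 9):
    results[f"KNN (K={k})"] = evaluate(KNeighborsClassifier(n_neighbors=k))

for name, accuracy in results.items():
    print(f"{name}: {accuracy:.4f}")
```

Wrapping the scaler and PCA in a pipeline means both are fitted on the training split only, so no information from the test samples leaks into the reduced features.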
License
The copyright of an article accepted for publication shall be assigned to Jurnal Sisfokom (Sistem Informasi dan Komputer) and LPPM ISB Atma Luhur as the publisher of the journal. Copyright includes the right to reproduce and deliver the article in all forms and media, including reprints, photographs, microfilms, and any other similar reproductions, as well as translations.
Jurnal Sisfokom (Sistem Informasi dan Komputer), LPPM ISB Atma Luhur, and the Editors make every effort to ensure that no wrong or misleading data, opinions, or statements are published in the journal. In any case, the contents of the articles and advertisements published in Jurnal Sisfokom (Sistem Informasi dan Komputer) are the sole and exclusive responsibility of their respective authors.
Jurnal Sisfokom (Sistem Informasi dan Komputer) has full publishing rights to the published articles. Authors are allowed to distribute published articles by sharing the link or DOI of the article. Authors are allowed to use their articles for legal purposes deemed necessary without the written permission of the journal, with notification of the initial publication in Jurnal Sisfokom (Sistem Informasi dan Komputer).
The Copyright Transfer Form can be downloaded: Copyright Transfer Form Jurnal Sisfokom (Sistem Informasi dan Komputer).
This agreement is to be signed by at least one of the authors, who must have obtained the assent of the co-author(s). After this agreement, signed by the corresponding author, has been submitted, changes of authorship or changes in the order of the authors listed will not be accepted. The copyright form should bear an original signature and be sent to the editorial office as a scanned document at sisfokom@atmaluhur.ac.id.