Comparative Analysis of Random Forest and Logistic Regression Methods in Predicting Leukemia Blood Cancer Using Microscopic Blood Cell Images
DOI:
https://doi.org/10.32736/sisfokom.v14i3.2393
Keywords:
Random Forest, Logistic Regression, Prediction, Leukemia, Google Colab
Abstract
Leukemia is one of the deadliest blood cancers, and early detection is essential for effective treatment. However, conventional diagnostic methods are often subjective, time-consuming, and expensive, which poses challenges especially in resource-constrained areas. This study presents a comparative analysis of two widely used machine learning algorithms, Random Forest (RF) and Logistic Regression (LR), for leukemia prediction using an open-access dataset of 10,661 preprocessed microscopic blood cell images from Kaggle. The dataset was partitioned into training (80%) and testing (20%) sets, with preprocessing that included image normalization and feature extraction. Performance was evaluated using accuracy, sensitivity, specificity, and AUC. RF performed better overall, with a classification accuracy of 85.23%, specificity of 0.9351, sensitivity of 0.6774, and AUC of 0.8881, compared with LR's accuracy of 78.11%, specificity of 0.8363, sensitivity of 0.6742, and AUC of 0.8120. Sensitivity was nearly identical for the two models, so RF's advantage lies mainly in specificity. These findings suggest that ensemble methods like RF are well suited to leukemia detection because of their ability to handle complex feature interactions in medical imaging data. While both algorithms have potential as clinical decision support tools, future research can test deep learning techniques and larger datasets to improve the accuracy and reliability of the model.
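The four metrics named in the abstract can be illustrated with a minimal sketch. It assumes a binary labeling (1 = leukemia, 0 = normal) and a 0.5 decision threshold, neither of which is stated in the abstract. The labels and scores are toy values made up for illustration, not data from the study, and the function names are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fn, tn, fp) for binary labels, assuming 1 = leukemia and 0 = normal."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp, fn, tn, fp


def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity (true positive rate), and specificity (true negative rate).

    Assumes both classes occur in y_true; otherwise a denominator is zero.
    """
    tp, fn, tn, fp = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs in which the positive
    sample receives the higher score; ties count as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (illustrative values only, not results from the study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.35, 0.7, 0.3, 0.2, 0.2, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
metrics["auc"] = auc(y_true, scores)
print(metrics)
```

Accuracy, sensitivity, and specificity depend on the chosen threshold, while AUC depends only on how the scores rank the samples. This is one way two models can show similar sensitivity at a given threshold yet differ in AUC.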
License
Copyright (c) 2025 Jepri Banjarnahor, Galuh Wira Relungwangi

This work is licensed under a Creative Commons Attribution 4.0 International License.
The copyright of an article accepted for publication shall be assigned to Jurnal Sisfokom (Sistem Informasi dan Komputer) and LPPM ISB Atma Luhur as the publisher of the journal. Copyright includes the right to reproduce and deliver the article in all forms and media, including reprints, photographs, microfilms, and any other similar reproductions, as well as translations.
Jurnal Sisfokom (Sistem Informasi dan Komputer), LPPM ISB Atma Luhur, and the Editors make every effort to ensure that no wrong or misleading data, opinions, or statements are published in the journal. In any case, the contents of the articles and advertisements published in Jurnal Sisfokom (Sistem Informasi dan Komputer) are the sole and exclusive responsibility of their respective authors.
Jurnal Sisfokom (Sistem Informasi dan Komputer) has full publishing rights to the published articles. Authors are allowed to distribute published articles by sharing the link or DOI of the article. Authors are allowed to use their articles for any legal purpose deemed necessary without the written permission of the journal, provided that the initial publication in Jurnal Sisfokom (Sistem Informasi dan Komputer) is acknowledged.
The Copyright Transfer Form can be downloaded: [Copyright Transfer Form Jurnal Sisfokom (Sistem Informasi dan Komputer)].
This agreement is to be signed by at least one of the authors, who must have obtained the assent of the co-author(s). After the agreement signed by the corresponding author has been submitted, changes of authorship or in the order of the authors listed will not be accepted. The copyright form should carry an original signature and be sent to the editorial office as a scanned document at sisfokom@atmaluhur.ac.id.