Classification Comparison Performance of Supervised Machine Learning Random Forest and Decision Tree Algorithms Using Confusion Matrix
DOI:
https://doi.org/10.32736/sisfokom.v13i1.1985
Keywords:
Data Mining, Classification, Random Forest, Decision Tree, Confusion Matrix
Abstract
Classification is a data mining technique used to predict outcomes for existing problems and for future cases. Classification methods operate on supervised (labeled) data. The random forest method builds several decision trees and combines them to obtain better and more precise predictions, while a decision tree converts a body of data into a tree that represents the rules of a decision. This study compares the prediction accuracy of the two methods on the same dataset, an employee dataset from India, to predict whether employees leave their jobs or remain at their company. The dataset contains 4654 records, of which 90% were used as training data and 10% as test data. In testing, the random forest method achieved an accuracy of 86.45%, while the decision tree method achieved 84.30%. The confusion matrix was then used to show the distribution of correct and incorrect predictions and to calculate precision, recall (sensitivity), specificity, and F1-score. The random forest algorithm obtained a precision of 96.7%, sensitivity of 84.7%, specificity of 91.4%, and F1-score of 90.2%. The decision tree algorithm obtained a precision of 95.7%, sensitivity of 82.9%, specificity of 88.4%, and F1-score of 88.8%.
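The metrics named in the abstract (accuracy, precision, sensitivity, specificity, and F1-score) can all be derived from the four cells of a binary confusion matrix. The following is a minimal sketch of those standard formulas, not the authors' implementation. The function names and the example counts are hypothetical and are not taken from the paper, which does not list its confusion-matrix cells here.

```python
def _ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (sensitivity)."""
    return _ratio(2 * precision * recall, precision + recall)


def confusion_metrics(tp, fp, fn, tn):
    """Compute standard metrics from binary confusion-matrix counts.

    tp: true positives, fp: false positives,
    fn: false negatives, tn: true negatives.
    """
    precision = _ratio(tp, tp + fp)
    sensitivity = _ratio(tp, tp + fn)  # also called recall
    specificity = _ratio(tn, tn + fp)
    return {
        "accuracy": _ratio(tp + tn, tp + fp + fn + tn),
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1_score(precision, sensitivity),
    }


# Hypothetical counts for illustration only (not the paper's results).
example = confusion_metrics(tp=80, fp=10, fn=20, tn=90)
for name, value in example.items():
    print(f"{name}: {value:.3f}")

# F1 recomputed from the rounded precision and sensitivity in the abstract.
rf_f1 = f1_score(0.967, 0.847)  # random forest
dt_f1 = f1_score(0.957, 0.829)  # decision tree
print(f"random forest F1: {rf_f1:.3f}, decision tree F1: {dt_f1:.3f}")
```

Recomputing F1 from the rounded precision and sensitivity values gives about 0.903 for random forest and about 0.888 for decision tree, close to the reported 90.2% and 88.8%. The small difference for random forest is presumably a rounding effect, since the inputs are already rounded to one decimal place.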
License
The copyright of an article accepted for publication shall be assigned to Jurnal Sisfokom (Sistem Informasi dan Komputer) and LPPM ISB Atma Luhur as the publisher of the journal. Copyright includes the right to reproduce and deliver the article in all forms and media, including reprints, photographs, microfilms, and any other similar reproductions, as well as translations.
Jurnal Sisfokom (Sistem Informasi dan Komputer), LPPM ISB Atma Luhur, and the Editors make every effort to ensure that no wrong or misleading data, opinions, or statements are published in the journal. In any case, the contents of the articles and advertisements published in Jurnal Sisfokom (Sistem Informasi dan Komputer) are the sole and exclusive responsibility of their respective authors.
Jurnal Sisfokom (Sistem Informasi dan Komputer) has full publishing rights to the published articles. Authors are allowed to distribute published articles by sharing the link or DOI of the article. Authors are allowed to use their articles for any legal purpose deemed necessary without the written permission of the journal, with an acknowledgment of initial publication in Jurnal Sisfokom (Sistem Informasi dan Komputer).
The Copyright Transfer Form can be downloaded here: [Copyright Transfer Form Jurnal Sisfokom (Sistem Informasi dan Komputer)].
This agreement is to be signed by at least one of the authors, who must have obtained the assent of the co-author(s). After this agreement, signed by the corresponding author, has been submitted, changes of authorship or in the order of the authors listed will not be accepted. The copyright form should carry an original signature and be sent to the editorial office as a scanned document at sisfokom@atmaluhur.ac.id.