Implementation of Data Mining to Predict Student Study Period with Decision Tree Algorithm (C4.5)

Authors

  • Kirana Alyssa Putri, Informatics Engineering, FTII Prof. DR. Hamka Muhammadiyah University
  • Dimas Febriawan, Informatics Engineering, FTII Prof. DR. Hamka Muhammadiyah University
  • Firman Noor Hasan, Informatics Engineering, FTII Prof. DR. Hamka Muhammadiyah University, https://orcid.org/0000-0002-1246-3462

DOI:

https://doi.org/10.32736/sisfokom.v13i1.1943

Keywords:

Decision Tree, C4.5 Algorithm, Prediction, Study Period, RapidMiner

Abstract

Graduating on time is a goal of every college student, and students of Prof. Dr. Hamka Muhammadiyah University are no exception. In the Tracer Study data for 2020 graduates, 60% of respondents said the university had a sufficiently high impact on improving their competence. This figure indicates that the university needs to evaluate how to improve its academic quality. Students often have difficulty finding information about the factors that support timely graduation, so a predictive analysis is needed to provide information about students' study periods. For this analysis, data mining is implemented using the classification function of the decision tree (C4.5) algorithm in RapidMiner. The methodology follows the stages of Knowledge Discovery in Databases (KDD): data collection, preprocessing, transformation, data mining, and evaluation. The findings consist of a decision tree visualization and decision rules, which reveal GPA as the most influential factor in determining a student's study period. In the data, 170 students (54.5%) graduated on time (in 4 years or less) and 142 students (45.6%) did not graduate on time (more than 4 years). Performance testing of the decision tree (C4.5) with a confusion matrix in RapidMiner yielded an accuracy of 83.87%, a precision of 87.50%, and a recall of 91.18%. These results provide evidence that the decision tree (C4.5) algorithm performs well enough to provide valuable information for predicting student graduation, with the aim of increasing the number of students who finish within the expected study period.
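
The accuracy, precision, and recall reported above are all derived from the four cells of a binary confusion matrix. The following is a minimal Python sketch of those standard formulas, not the RapidMiner process used in the study. The class name, its methods, and the example counts are hypothetical; the counts are illustrative placeholders and are not the study's confusion matrix. The sketch also assumes that "graduated on time" is the positive class, which the abstract does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Cell counts of a binary confusion matrix.

    Assumption: the positive class is "graduated on time".
    """
    tp: int  # predicted on time, actually on time
    fp: int  # predicted on time, actually late
    fn: int  # predicted late, actually on time
    tn: int  # predicted late, actually late

    def accuracy(self) -> float:
        # Share of all predictions that were correct.
        total = self.tp + self.fp + self.fn + self.tn
        return (self.tp + self.tn) / total

    def precision(self) -> float:
        # Share of predicted positives that were truly positive.
        return self.tp / (self.tp + self.fp)

    def recall(self) -> float:
        # Share of true positives that the model found.
        return self.tp / (self.tp + self.fn)


if __name__ == "__main__":
    # Illustrative counts only; NOT taken from the paper.
    cm = ConfusionMatrix(tp=40, fp=10, fn=5, tn=45)
    print(f"accuracy:  {cm.accuracy():.2%}")
    print(f"precision: {cm.precision():.2%}")
    print(f"recall:    {cm.recall():.2%}")
```

The sketch does not guard against empty denominators (for example, a model that never predicts the positive class), which a fuller implementation would need to handle.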

Author Biographies

Kirana Alyssa Putri, Informatics Engineering, FTII Prof. DR. Hamka Muhammadiyah University

Informatics Engineering Study Program

Dimas Febriawan, Informatics Engineering, FTII Prof. DR. Hamka Muhammadiyah University

Informatics Engineering Study Program

Firman Noor Hasan, Informatics Engineering, FTII Prof. DR. Hamka Muhammadiyah University

Informatics Engineering Study Program

Published

2024-02-12

Section

Articles