Comparison of the performance of the C4.5 algorithm with Naive Bayes in analyzing book borrowing at the Pringsewu Muhammadiyah University Library

Authors

  • Dani Wilian IIB Darmajaya
  • Sriyanto Sriyanto IIB Darmajaya

Keywords:

Book Borrowing Patterns, C4.5, Naive Bayes, Data Mining

Abstract

This research compares the effectiveness of the Naïve Bayes and C4.5 algorithms in analyzing book borrowing patterns at the Pringsewu Muhammadiyah University Library. As libraries evolve into important educational resource centers, understanding user borrowing behavior becomes critical for effective collection management and service improvement. The study follows the Cross-Industry Standard Process for Data Mining (CRISP-DM), whose stages are business understanding, data understanding, data preparation, modeling, evaluation, and deployment. The dataset consists of 5,586 book lending records with ten attributes and was cleaned and preprocessed before modeling. Both algorithms were evaluated using k-fold cross-validation: C4.5 reached an accuracy of 96.26%, compared to 91.44% for Naïve Bayes. These results suggest that C4.5 captures the relationships among attributes in this dataset better than Naïve Bayes, which assumes the attributes are conditionally independent, and that it can provide useful insight into user preferences for improving library services. The research highlights the potential of data mining techniques to improve library management and suggests directions for future work, including the exploration of more advanced machine learning algorithms and the expansion of the dataset to cover more libraries.
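The evaluation procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy records, the attribute meanings (member type, genre), the class labels, and every function and class name below are hypothetical. The tree side is reduced to a one-level stump chosen by gain ratio, the split criterion C4.5 uses, rather than a full C4.5 tree, and the value of k is an assumption because the abstract does not state it.

```python
import math
import random
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def gain_ratio(rows, labels, attr):
    """Information gain of `attr` divided by its split information."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[row[attr]].append(label)
    total = len(labels)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    split_info = -sum(len(g) / total * math.log2(len(g) / total) for g in groups.values())
    return (entropy(labels) - remainder) / split_info if split_info > 0 else 0.0


class GainRatioStump:
    """One-level tree on the attribute with the highest gain ratio (not full C4.5)."""

    def fit(self, rows, labels):
        self.default = Counter(labels).most_common(1)[0][0]
        self.attr = max(range(len(rows[0])), key=lambda a: gain_ratio(rows, labels, a))
        groups = defaultdict(list)
        for row, label in zip(rows, labels):
            groups[row[self.attr]].append(label)
        self.leaf = {v: Counter(g).most_common(1)[0][0] for v, g in groups.items()}
        return self

    def predict(self, row):
        # Fall back to the majority class for attribute values unseen in training.
        return self.leaf.get(row[self.attr], self.default)


class CategoricalNaiveBayes:
    """Naive Bayes for categorical attributes with add-one (Laplace) smoothing."""

    def fit(self, rows, labels):
        self.class_counts = Counter(labels)
        self.n = len(labels)
        self.value_counts = defaultdict(Counter)  # (attribute, class) -> value counts
        self.values = defaultdict(set)            # attribute -> distinct values seen
        for row, label in zip(rows, labels):
            for a, v in enumerate(row):
                self.value_counts[(a, label)][v] += 1
                self.values[a].add(v)
        return self

    def predict(self, row):
        def log_posterior(c):
            score = math.log(self.class_counts[c] / self.n)
            for a, v in enumerate(row):
                num = self.value_counts[(a, c)][v] + 1
                den = self.class_counts[c] + len(self.values[a])
                score += math.log(num / den)
            return score

        return max(self.class_counts, key=log_posterior)


def k_fold_accuracy(make_model, rows, labels, k=10, seed=0):
    """Overall accuracy when each record is predicted by a model that never saw it."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    correct = 0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = make_model().fit([rows[i] for i in train], [labels[i] for i in train])
        correct += sum(model.predict(rows[i]) == labels[i] for i in fold)
    return correct / len(rows)


# Hypothetical toy records: (member type, genre) -> loan duration class.
rows = [(m, g) for m in ("student", "lecturer")
        for g in ("science", "fiction", "religion")] * 10
labels = ["long" if m == "lecturer" else "short" for m, _ in rows]

stump_acc = k_fold_accuracy(GainRatioStump, rows, labels, k=10)
nb_acc = k_fold_accuracy(CategoricalNaiveBayes, rows, labels, k=10)
print(f"gain-ratio stump: {stump_acc:.2%}, naive Bayes: {nb_acc:.2%}")
```

Because the toy label depends on a single attribute, both models separate the classes easily here. The accuracies reported in the abstract come from the real 5,586-record dataset and are not reproduced by this sketch.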

Author Biographies

Dani Wilian, IIB Darmajaya

Faculty of Computer Science

Sriyanto Sriyanto, IIB Darmajaya

Faculty of Computer Science

Published

2024-12-13

Section

Articles