Comparison of the Performance of the C4.5 Algorithm with Naive Bayes in Analyzing Book Borrowing at the Pringsewu Muhammadiyah University Library

Authors

  • Dani Wilian, Department of Computer Science, Institut Informatika dan Bisnis Darmajaya, Lampung
  • Sriyanto Sriyanto, Department of Computer Science, Institut Informatika dan Bisnis Darmajaya, Lampung

DOI:

https://doi.org/10.32736/sisfokom.v14i1.2300

Keywords:

Book Borrowing Patterns, C4.5, Naive Bayes, Data Mining

Abstract

This study compares the effectiveness of the Naïve Bayes and C4.5 algorithms in analyzing book borrowing patterns at the Pringsewu Muhammadiyah University Library. As libraries increasingly serve as vital educational hubs, understanding user borrowing behavior is essential for effective collection management and service enhancement. The research follows the Cross-Industry Standard Process for Data Mining (CRISP-DM), whose stages are business understanding, data understanding, data preparation, modeling, evaluation, and deployment. The dataset consists of 5,586 records with ten attributes related to book lending, and was cleaned and preprocessed before modeling. Both algorithms were assessed using k-fold cross-validation, which yielded an accuracy of 96.26% for C4.5 compared with 91.44% for Naïve Bayes. These results indicate that C4.5 is better at capturing complex relationships within the data, providing deeper insight into user preferences that can inform library services. The research underscores the potential of data mining techniques to optimize library management and proposes directions for future work, such as exploring more advanced machine learning algorithms and expanding the dataset for use in broader library contexts.
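
The evaluation procedure named in the abstract can be illustrated with a minimal sketch of k-fold cross-validation around a categorical Naïve Bayes classifier. This is not the authors' implementation: the abstract does not state the value of k, the tooling, or the attribute names, so the fold count, the function names, and the toy records below (member type, book category, loan length) are all hypothetical and chosen only to keep the example self-contained.

```python
import math
from collections import Counter, defaultdict

def k_fold_indices(n, k):
    """Yield (train_idx, test_idx) for k contiguous folds over range(n)."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size

def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class attribute value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (attribute index, label) -> value counts
    domains = defaultdict(set)           # attribute index -> values seen in training
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            value_counts[(j, label)][value] += 1
            domains[j].add(value)
    return class_counts, value_counts, domains

def predict_naive_bayes(model, row):
    """Return the label with the highest log-posterior under the naive assumption."""
    class_counts, value_counts, domains = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior
        for j, value in enumerate(row):
            # Laplace (add-one) smoothing avoids zero probabilities.
            numer = value_counts[(j, label)][value] + 1
            denom = count + len(domains[j])
            score += math.log(numer / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def cross_validated_accuracy(rows, labels, k):
    """Train on k-1 folds, test on the held-out fold, and pool the hits."""
    correct = 0
    for train_idx, test_idx in k_fold_indices(len(rows), k):
        model = train_naive_bayes([rows[i] for i in train_idx],
                                  [labels[i] for i in train_idx])
        correct += sum(predict_naive_bayes(model, rows[i]) == labels[i]
                       for i in test_idx)
    return correct / len(rows)

# Hypothetical toy records, NOT taken from the study's 5,586-row dataset.
toy_rows = [("student", "fiction"), ("lecturer", "science")] * 10
toy_labels = ["short_loan", "long_loan"] * 10

if __name__ == "__main__":
    print(cross_validated_accuracy(toy_rows, toy_labels, k=5))
```

Because every record is held out exactly once, the pooled accuracy is an estimate of performance on unseen data rather than on the training set. Comparing two classifiers, as the study does, would mean running the same folds with a second train/predict pair in place of the Naïve Bayes functions.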

Author Biographies

Dani Wilian, Department of Computer Science, Institut Informatika dan Bisnis Darmajaya, Lampung

Faculty of Computer Science

Sriyanto Sriyanto, Department of Computer Science, Institut Informatika dan Bisnis Darmajaya, Lampung

Faculty of Computer Science

Published

2025-01-31

Section

Articles