Optimizing Procurement Efficiency by Implementing K-Means and Random Forest in Kopegtel Samarinda’s Warehouse System

Fernando Nikolas R; Islamiyah Islamiyah; Vina Zahrotun Kamila

doi:10.32736/sisfokom.v13i3.2288

Authors

Fernando Nikolas R, Department of Information Systems, Department of Engineering Faculty, Mulawarman University
Islamiyah Islamiyah, Department of Information Systems, Department of Engineering Faculty, Mulawarman University
Vina Zahrotun Kamila, Department of Information Systems, Department of Engineering Faculty, Mulawarman University

Keywords:

CRISP-DM, K-Means, Procurement, Random Forest, Warehouse

Abstract

Procurement is the activity by which a company purchases the goods or equipment it needs for its operations. Companies often use a procurement management system to support this process; CV Indocitra Multi Artha, for example, uses the “Sistem Warehouse Kopegtel Samarinda.” The system helps the company considerably, but a large volume of requests can overwhelm the manager and cause information overload. This research addresses the problem by implementing data mining algorithms as a procurement recommendation system, using K-Means and Random Forest. The method runs in two steps: K-Means first clusters the data, and Random Forest then predicts whether an item should be purchased. Hyperparameter tuning was performed to optimize the model’s performance, yielding an F1-Score of 86.95%, which reflects the balance between precision and recall, and an ROC AUC of 82.34%. These results indicate that the model can provide practical recommendations.
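The two-step procedure described in the abstract can be illustrated with a minimal sketch, assuming scikit-learn and NumPy are available. It is not the authors' implementation: the data are synthetic, the feature names (requested quantity, current stock, unit price), the number of clusters, and the parameter grid are all hypothetical, and feeding the K-Means cluster label to Random Forest as an extra feature is only one possible reading of how the two algorithms are combined.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score, roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for warehouse requests (hypothetical features):
# requested quantity, current stock, unit price.
rng = np.random.default_rng(42)
n = 400
X = np.column_stack([
    rng.integers(1, 100, n),
    rng.integers(0, 200, n),
    rng.uniform(1.0, 500.0, n),
]).astype(float)

# Synthetic label (assumption): buy when the request is large relative
# to stock, with 10% of the labels flipped as noise.
y = ((X[:, 0] > 0.5 * X[:, 1]) ^ (rng.random(n) < 0.1)).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Step 1: K-Means on the scaled training features only, to avoid leakage.
scaler = StandardScaler().fit(X_train)
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
kmeans.fit(scaler.transform(X_train))


def add_cluster(X_part):
    """Append the K-Means cluster label as an extra feature column."""
    labels = kmeans.predict(scaler.transform(X_part))
    return np.column_stack([X_part, labels])


# Step 2: Random Forest with a small hyperparameter grid, tuned on F1.
grid = GridSearchCV(
    RandomForestClassifier(random_state=42),
    param_grid={"n_estimators": [50, 100], "max_depth": [None, 5]},
    scoring="f1",
    cv=3,
)
grid.fit(add_cluster(X_train), y_train)

# Evaluate on the held-out split with the two metrics named in the abstract.
X_test_aug = add_cluster(X_test)
pred = grid.predict(X_test_aug)
proba = grid.predict_proba(X_test_aug)[:, 1]
f1 = f1_score(y_test, pred)
auc = roc_auc_score(y_test, proba)
print(f"best params: {grid.best_params_}")
print(f"F1: {f1:.4f}  ROC AUC: {auc:.4f}")
```

The scores printed by this sketch come from synthetic data and are not comparable to the 86.95% F1-Score and 82.34% ROC AUC reported above. The scaler and K-Means are fitted on the training split only, so the held-out evaluation does not use information from the test rows.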
