Discovering User Sentiment Patterns in Libraries with a Hybrid Machine Learning and Lexicon-Based Approach

Dini Nurmalasari; Dini Hidayatul Qudsi; Nessa Chairani; Heri R Yuliantoro

doi:10.32736/sisfokom.v13i3.2217

Authors

Dini Nurmalasari Department of Information Technology, Computer Engineering Technology, Caltex Riau Polytechnic
Dini Hidayatul Qudsi Department of Information Technology, Computer Engineering Technology, Caltex Riau Polytechnic
Nessa Chairani Department of Information Technology, Computer Engineering Technology, Caltex Riau Polytechnic
Heri R Yuliantoro Department of Tax Accounting, Caltex Riau Polytechnic

Keywords:

Sentiment Analysis, VADER Lexicon, Random Forest, Naïve Bayes, Library Opinion

Abstract

This study addresses the need to improve library services through data-driven decision-making based on user feedback. It analyzes free-text responses from library user surveys conducted at Politeknik Caltex Riau (PCR) to categorize sentiment and identify areas for improvement. The data consist of biannual student and lecturer feedback collected from 2018 to 2023 through the institution's official survey system (survey.pcr.ac.id), giving a picture of user needs across five years. The VADER lexicon-based method is used to classify user comments as positive or negative. Text preprocessing (stemming, tokenizing, and filtering) is applied before classification. Three machine learning algorithms, Naïve Bayes, Support Vector Machine (SVM), and Random Forest, are then evaluated on sentiment classification accuracy. Both SVM and Random Forest achieve 99% accuracy, with reported precision, recall, and F1-score of 100%. Naïve Bayes shows slightly lower accuracy at 98% but maintains 100% recall, so that all negative feedback is captured. The contribution of this research is the combination of user sentiment analysis with a five-year dataset, which enables a deeper understanding of evolving user needs and priorities. The high accuracy of the evaluated algorithms highlights the potential of this methodology for libraries that want to use user feedback for evidence-based service improvement and increased user satisfaction.
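The pipeline the abstract describes (lexicon-based labeling, preprocessing, then a supervised classifier scored with precision, recall, and F1) can be sketched in a few lines. What follows is a minimal illustration under stated assumptions, not the authors' implementation: a small hypothetical lexicon (`TOY_LEXICON`) stands in for VADER, the sample comments are invented, stemming is omitted, ties are labeled positive, and a hand-written multinomial Naïve Bayes stands in for the classifiers the study evaluates.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical toy lexicon standing in for VADER (words and scores are invented).
TOY_LEXICON = {"good": 2.0, "helpful": 1.5, "comfortable": 1.5, "complete": 1.0,
               "slow": -1.5, "noisy": -1.5, "broken": -2.0, "outdated": -1.5}
STOPWORDS = {"the", "is", "are", "and", "a", "very", "in", "of"}

def preprocess(text):
    """Lowercase, tokenize on letters, and filter stopwords (stemming omitted)."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]

def lexicon_label(tokens):
    """Label a comment by the sign of its summed lexicon scores (ties -> positive)."""
    score = sum(TOY_LEXICON.get(t, 0.0) for t in tokens)
    return "positive" if score >= 0 else "negative"

class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {t for counts in self.word_counts.values() for t in counts}
        return self

    def predict(self, tokens):
        total = sum(self.class_counts.values())
        best_label, best_logp = None, -math.inf
        for label, n in self.class_counts.items():
            logp = math.log(n / total)  # class prior
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for t in tokens:
                if t in self.vocab:  # ignore words never seen in training
                    logp += math.log((self.word_counts[label][t] + 1) / denom)
            if logp > best_logp:
                best_label, best_logp = label, logp
        return best_label

def precision_recall_f1(y_true, y_pred, target):
    """Precision, recall, and F1 for one target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == target and p == target for t, p in pairs)
    fp = sum(t != target and p == target for t, p in pairs)
    fn = sum(t == target and p != target for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Invented example comments; the study's survey data is not reproduced here.
train_comments = [
    "The staff are helpful and the room is comfortable",
    "The collection is complete and good",
    "The wifi is slow and the room is noisy",
    "The computers are broken and the books are outdated",
]
test_comments = ["Helpful staff and good collection", "Noisy room and slow wifi"]

# Step 1: preprocess, then let the lexicon assign the training labels.
train_docs = [preprocess(c) for c in train_comments]
train_labels = [lexicon_label(d) for d in train_docs]

# Step 2: train the classifier on the lexicon-labeled comments.
model = NaiveBayes().fit(train_docs, train_labels)

# Step 3: compare classifier predictions with lexicon labels on held-out comments.
test_docs = [preprocess(c) for c in test_comments]
expected = [lexicon_label(d) for d in test_docs]
predicted = [model.predict(d) for d in test_docs]
metrics = precision_recall_f1(expected, predicted, target="negative")
print(predicted, metrics)
```

In this sketch the reference labels come from the lexicon itself, so the metrics describe how well the classifier reproduces the lexicon's labels, not agreement with human annotation.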
