Clustering OKU Timur Script Images Using VGG Feature Extraction and K-Means
DOI:
https://doi.org/10.32736/sisfokom.v14i1.2292
Keywords:
OKU Timur Script, VGG16 Model, Clustering, K-Means, Manuscript Images
Abstract
This study uses a clustering model to group manuscript images from the OKU Timur region according to their characteristics. OKU Timur is rich in cultural heritage, including a unique writing system known as the OKU Timur script. Intelligent systems technology can be used to recognize this script, but doing so requires a dataset of OKU Timur script images that can later be used to classify script images. One challenge in preparing the dataset is sorting a large number of script image samples into groups that correspond to the characters. This research proposes grouping the script images automatically with the K-Means algorithm. The dataset comprises 2,280 images representing 19 characters and 228 variations with different diacritics. Features are extracted with the VGG16 model and then clustered with K-Means. Clustering performance is evaluated as the percentage of correctly grouped characters. With 19 groups (one per character), the model achieves an accuracy of 82.6%. With 228 groups (variations and diacritics), it correctly groups 48.16% of characters. Despite these challenges, the results show that the model has potential for further refinement. The contribution of this study is an efficient clustering approach for cultural manuscripts that supports digital preservation and advances automatic recognition of the OKU Timur script, with the aim of preserving the script for future generations.
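The grouping and evaluation steps described in the abstract can be illustrated with a minimal sketch. Several things here are assumptions, not the authors' implementation: the short 2-D tuples stand in for VGG16 feature vectors, the names char_a, char_b, and char_c are hypothetical character labels, kmeans is a plain Lloyd-style K-Means seeded with the first k points, and grouping_accuracy scores a clustering by counting samples that carry the majority label of their cluster, which is one possible reading of "percentage of correctly grouped characters".

```python
import math
from collections import Counter


def kmeans(points, k, iters=100):
    """Plain Lloyd-style K-Means. Seeding with the first k points is an
    illustrative choice so the example stays deterministic."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments are stable, so stop
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def grouping_accuracy(cluster_ids, true_labels):
    """Percentage of samples whose label matches the majority label of
    their cluster (an assumed scoring rule, often called purity)."""
    by_cluster = {}
    for cid, lab in zip(cluster_ids, true_labels):
        by_cluster.setdefault(cid, Counter())[lab] += 1
    correct = sum(c.most_common(1)[0][1] for c in by_cluster.values())
    return 100.0 * correct / len(true_labels)


# Stand-in feature vectors; real VGG16 features would be much longer.
features = [
    (0.0, 0.1), (5.0, 5.1), (9.0, 0.2),
    (0.2, 0.0), (5.2, 4.9), (9.1, 0.0),
    (0.1, 0.3), (4.9, 5.0), (8.8, 0.1),
    (4.8, 4.7),  # a char_a sample that looks like char_b
]
true_labels = [
    "char_a", "char_b", "char_c",
    "char_a", "char_b", "char_c",
    "char_a", "char_b", "char_c",
    "char_a",
]

cluster_ids = kmeans(features, k=3)
accuracy = grouping_accuracy(cluster_ids, true_labels)
print(f"{accuracy:.1f}% correctly grouped")  # prints "90.0% correctly grouped"
```

In this toy run, nine of the ten samples fall in a cluster dominated by their own label, and the one char_a sample that resembles char_b is counted as incorrectly grouped. Under this scoring rule, setting k to the number of characters or to the number of diacritic variations would correspond to the two granularities reported in the abstract.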
Copyright (c) 2024 Jurnal Sisfokom (Sistem Informasi dan Komputer)

This work is licensed under a Creative Commons Attribution 4.0 International License.
The copyright of an article that is accepted for publication shall be assigned to Jurnal Sisfokom (Sistem Informasi dan Komputer) and LPPM ISB Atma Luhur as the publisher of the journal. Copyright includes the right to reproduce and deliver the article in all forms and media, including reprints, photographs, microfilms, and any other similar reproductions, as well as translations.
Jurnal Sisfokom (Sistem Informasi dan Komputer), LPPM ISB Atma Luhur, and the Editors make every effort to ensure that no wrong or misleading data, opinions, or statements are published in the journal. In any case, the contents of the articles and advertisements published in Jurnal Sisfokom (Sistem Informasi dan Komputer) are the sole and exclusive responsibility of their respective authors.
Jurnal Sisfokom (Sistem Informasi dan Komputer) has full publishing rights to the published articles. Authors are allowed to distribute articles that have been published by sharing the link or DOI of the article. Authors are allowed to use their articles for legal purposes deemed necessary without the written permission of the journal, provided that they acknowledge the initial publication in Jurnal Sisfokom (Sistem Informasi dan Komputer).