Qur’an Search System for Handling Cross Verse Based on Phonetic Similarity

Intan Khairunnisa Fitriani(1*), Moch Arif Bijaksana(2), Kemas Muslim Lhaksmana(3)

(1) Telkom University
(2) Telkom University
(3) Telkom University
(*) Corresponding Author

Abstract


The Qur'an contains a large number of verses, so searching it manually is difficult and time consuming. A search system for Qur'anic verses that accepts queries in the Indonesian Arabic-Latin transliteration would be very helpful for the Muslim community in Indonesia, especially for those who are not familiar with Arabic script. This study builds a verse search system for the Qur'an based on phonetic similarity, focusing in particular on the handling of queries that cross verse boundaries. The system uses the Jaro-Winkler algorithm to calculate similarity values and the N-Grams algorithm to rank documents. A similar study was carried out previously under the name Lafzi+, achieving a MAP of 90% and a recall of 93%. That earlier system could not handle cases such as nun wiqoyah at the end of a verse, so it could not support search across the entire Qur'an. In addition, its application of the Jaro-Winkler method for calculating similarity values was not fully implemented. To complete the previous research, this study adds rules to the pre-existing ones so that nun wiqoyah at the end of a verse can be handled. By applying the Jaro-Winkler method to calculate similarity values, N-Grams to rank documents, and the added nun wiqoyah rules, the system achieves a MAP of 94% and a recall of 92%. The increase in MAP indicates that the system improves on the accuracy of the previously built system.
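
As a rough illustration of the two techniques named above, the following Python sketch computes a standard Jaro-Winkler similarity and a simple trigram overlap score between two strings. It is a minimal sketch, not the system's actual implementation: the function names, the scaling factor p = 0.1, the prefix limit of 4, the choice of trigrams, and the toy input strings are all assumptions made for this example.

```python
def jaro(s1: str, s2: str) -> float:
    """Standard Jaro similarity in the range [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters match only if they lie within this distance of each other.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order / 2
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


def jaro_winkler(s1: str, s2: str, p: float = 0.1, max_prefix: int = 4) -> float:
    """Jaro similarity boosted by the length of the common prefix.

    p and max_prefix are the conventional defaults, assumed here.
    """
    j = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return j + prefix * p * (1 - j)


def trigrams(s: str) -> list:
    """All overlapping 3-character substrings of s."""
    return [s[i:i + 3] for i in range(len(s) - 2)]


def trigram_overlap(query: str, document: str) -> float:
    """Fraction of the query's trigrams that also occur in the document."""
    q = trigrams(query)
    if not q:
        return 0.0
    d = set(trigrams(document))
    return sum(1 for g in q if g in d) / len(q)


if __name__ == "__main__":
    # Toy strings for illustration only, not real phonetic codes.
    query = "bismilah"
    candidates = ["bismillah", "alhamdulillah"]
    ranked = sorted(candidates, key=lambda c: trigram_overlap(query, c), reverse=True)
    for c in ranked:
        print(c, round(trigram_overlap(query, c), 3), round(jaro_winkler(query, c), 3))
```

In this sketch the trigram score orders the candidates and the Jaro-Winkler value is reported alongside it; how the two scores are actually combined in the system is not specified in the abstract.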

Keywords


al-quran; cross-verses; phonetic; jaro-winkler; n-gram


DOI: https://doi.org/10.32736/sisfokom.v10i1.986

Creative Commons License
Jurnal Sisfokom (Sistem Informasi dan Komputer), ISSN 2301-7988 and e-ISSN 2581-0588, is published by Lembaga Penelitian dan Pengabdian Masyarakat (LPPM) ISB Atma Luhur under a Creative Commons Attribution-ShareAlike 4.0 International License.