Qur’an Search System for Handling Cross Verse Based on Phonetic Similarity

Authors

  • Intan Khairunnisa Fitriani Telkom University
  • Moch Arif Bijaksana Telkom University
  • Kemas Muslim Lhaksmana Telkom University

DOI:

https://doi.org/10.32736/sisfokom.v10i1.986

Keywords:

al-quran, cross-verses, phonetic, jaro-winkler, n-gram

Abstract

The Qur'an contains a large number of verses, so searching it manually is difficult and time consuming. A verse search system based on the Indonesian Arabic-to-Latin transliteration would be very helpful for the Muslim community in Indonesia, especially for those who are not familiar with Arabic script. This study builds a Qur'an verse search system based on phonetic similarity, with a focus on handling queries that cross verse boundaries. The system uses the Jaro-Winkler algorithm to calculate similarity values and the N-gram algorithm to rank documents. A similar system, Lafzi+, was built in an earlier study and achieved a MAP of 90% and a recall of 93%. That system could not handle cases such as nun wiqoyah at the end of a verse, so it could not support searching across the entire Qur'an. In addition, it did not fully implement the Jaro-Winkler method for calculating similarity values. To complete the previous research, this study adds rules to the existing rule set so that nun wiqoyah at the end of a verse can be handled. By applying the Jaro-Winkler method for similarity, N-grams for document ranking, and the added nun wiqoyah rules, the system achieves a MAP of 94% and a recall of 92%. The increase in MAP, from 90% to 94%, shows that this system improves on the accuracy of the previously built system.
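The two techniques named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the two scores are combined, so the combination below (rank by shared trigrams, break ties with Jaro-Winkler) is an assumption, as are the function names, the prefix weight p = 0.1, the trigram size, and the toy Latin strings, which only stand in for the paper's phonetic encoding.

```python
from collections import Counter


def jaro(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters match only if they are equal and within this distance.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Transpositions: matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order / 2
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


def jaro_winkler(s1: str, s2: str, p: float = 0.1, max_prefix: int = 4) -> float:
    """Jaro similarity boosted by the length of the common prefix."""
    j = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return j + prefix * p * (1 - j)


def ngrams(text: str, n: int = 3) -> Counter:
    """Multiset of character n-grams in text."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def rank(query: str, verses: dict, n: int = 3) -> list:
    """Rank verses by shared n-grams; ties broken by Jaro-Winkler (assumed scheme)."""
    q = ngrams(query, n)
    if not q:
        return []
    scored = []
    for verse_id, code in verses.items():
        shared = sum((q & ngrams(code, n)).values())
        if shared:
            scored.append((verse_id, shared, jaro_winkler(query, code)))
    return sorted(scored, key=lambda r: (-r[1], -r[2]))


# Hypothetical toy index: verse id -> simplified Latin string (illustrative only).
VERSES = {
    "1:1": "bismilahirahmanirahim",
    "1:2": "alhamdulilahirabilalamin",
    "1:3": "arahmanirahim",
}

if __name__ == "__main__":
    for verse_id, shared, sim in rank("bismilah", VERSES):
        print(verse_id, shared, round(sim, 3))
```

In this sketch the trigram overlap acts as a cheap filter that drops verses with nothing in common with the query, while Jaro-Winkler supplies a graded similarity value that favours strings sharing a common prefix.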

References

D. N. Lapedes, Dictionary of Scientific and Technical Terms. New York: McGraw-Hill, 1974.

A. Syarifuddin, Mendidik anak: membaca, menulis dan mencintai Al-Quran. Jakarta: Gema Insani, 2004.

E. Rifaldi, M. A. Bijaksana, and K. M. Lhaksmana, “Sistem Pencarian Lintas Ayat Al-Qur’an Berdasarkan Kesamaan Fonetis,” Indones. J. Comput., vol. 4, no. 2, pp. 177–188, 2019.

M. A. Istiadi, “Sistem Pencarian Ayat Al-Qur’an Berbasis Kemiripan Fonetis,” Skripsi Program Sarjana, Institut Pertanian Bogor, Bogor, 2012.

W. B. Cavnar and J. M. Trenkle, “N-Gram-Based Text Categorization,” Proc. SDAIR-94, 3rd Annu. Symp. Doc. Anal. Inf. Retr., 1994, CiteSeerX: 10.1.1.53.9367.

L. Bergroth, H. Hakonen, and T. Raita, “A survey of longest common subsequence algorithms,” 2000, doi: 10.1109/SPIRE.2000.878178.

D. S. Hirschberg, “Algorithms for the longest common subsequence problem,” J. ACM, vol. 24, no. 4, pp. 664–675, 1977.

M. Rajabzadeh, S. Tabibian, A. Akbari, and B. Nasersharif, “Improved dynamic match phone lattice search using Viterbi scores and Jaro Winkler distance for keyword spotting system,” in The 16th CSI International Symposium on Artificial Intelligence and Signal Processing (AISP 2012), 2012, pp. 423–427.

F. Friendly, “Perbaikan Metode Jaro-Winkler Distance untuk Approximate String Search Menggunakan Data Terindeks Aplikasi Multi User,” J. Teknovasi J. Tek. dan Inov., vol. 4, no. 2, pp. 69–78, 2018.

A. Nwesri, “Effective retrieval techniques for Arabic text,” 2008.

Published

2021-02-02

Section

Articles