Improving Machine Learning Classification Performance Through Comparison of Machine Learning Methods and Dataset Improvement
DOI:
https://doi.org/10.32736/sisfokom.v11i1.1337
Keywords:
Classification, StringToWordVector, Machine Learning, Exam Classification
Abstract
Machine learning is widely used to classify data, and many classification algorithms are available. To get the best classification performance, however, the classifier must suit the type of dataset. The quality and quantity of the data in a dataset also influence classification performance. This study made several attempts to improve classification performance on a dataset of elementary school Indonesian language exam questions categorized by difficulty level. The efforts consisted of improving the quality of the dataset, using the StringToWordVector filter to process textual data, and applying several classification algorithms: Naïve Bayes, Random Forest, and REPTree. Classification was performed with WEKA. The experiments showed a performance increase of up to 15% after improving the quality of the dataset and using the right classification method.
References
B. Mahesh, “Machine Learning Algorithms-A Review,” International Journal of Science and Research (IJSR), vol. 9, pp. 381–386, 2020.
T. P. Carvalho, F. A. Soares, R. Vita, R. da P. Francisco, J. P. Basto, and S. G. S. Alcalá, “A systematic literature review of machine learning methods applied to predictive maintenance,” Computers & Industrial Engineering, vol. 137, p. 106024, 2019.
A. Althnian et al., “Impact of Dataset Size on Classification Performance: An Empirical Evaluation in the Medical Domain,” Applied Sciences, vol. 11, no. 2, p. 796, 2021.
G. Zayaraz, “Concept relation extraction using Naïve Bayes classifier for ontology-based question answering systems,” Journal of King Saud University-Computer and Information Sciences, vol. 27, no. 1, pp. 13–24, 2015.
T. GopalaKrishnan and P. Sengottuvelan, “A hybrid PSO with Naïve Bayes classifier for disengagement detection in online learning,” Program, 2016.
W. G. Touw et al., “Data mining in the Life Sciences with Random Forest: a walk in the park or lost in the jungle?,” Briefings in bioinformatics, vol. 14, no. 3, pp. 315–326, 2013.
A. Verikas, A. Gelzinis, and M. Bacauskiene, “Mining data with random forests: A survey and results of new tests,” Pattern recognition, vol. 44, no. 2, pp. 330–349, 2011.
D. Denisko and M. M. Hoffman, “Classification and interaction in random forests,” Proceedings of the National Academy of Sciences, vol. 115, no. 8, pp. 1690–1692, 2018.
“How To Increase Accuracy Of Machine Learning Model.” https://www.analyticsvidhya.com/blog/2015/12/improve-machine-learning-results/ (accessed Dec. 19, 2021).
M. Mohamad, A. Selamat, I. M. Subroto, and O. Krejcar, “Improving the classification performance on imbalanced data sets via new hybrid parameterisation model,” Journal of King Saud University-Computer and Information Sciences, vol. 33, no. 7, pp. 787–797, 2021.
J. Gong and H. Kim, “RHSBoost: Improving classification performance in imbalance data,” Computational Statistics & Data Analysis, vol. 111, pp. 1–13, 2017.
P. Pujari and J. B. Gupta, “Improving classification accuracy by using feature selection and ensemble model,” International Journal of Soft Computing and Engineering (IJSCE), vol. 2, no. 2, pp. 380–386, 2012.
J.-S. Chou, M.-Y. Cheng, and Y.-W. Wu, “Improving classification accuracy of project dispute resolution using hybrid artificial intelligence and support vector machine models,” Expert Systems with Applications, vol. 40, no. 6, pp. 2263–2274, 2013.
W.-W. Wu, “Improving classification accuracy and causal knowledge for better credit decisions,” International Journal of Neural Systems, vol. 21, no. 04, pp. 297–309, 2011.
M. Mowafy, A. Rezk, and H. El-Bakry, “An efficient classification model for unstructured text document,” American Journal of Computer Science and Information Technology, vol. 6, no. 1, p. 16, 2018.
S. K. Trivedi and P. K. Panigrahi, “Spam classification: a comparative analysis of different boosted decision tree approaches,” Journal of Systems and Information Technology, 2018.
H. Naji and W. Ashour, “Text Classification for Arabic Words Using Rep-Tree,” International Journal of Computer Science & Information Technology (IJCSIT), vol. 8, 2016.
F. Baharuddin and A. Tjahyanto, “Dataset Soal Ujian Bahasa Indonesia Tingkat Sekolah Dasar,” Dec. 2021, doi: 10.5281/ZENODO.5793377.
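The pipeline described in the abstract (turning question text into word vectors, then training a classifier on difficulty labels) can be illustrated with a minimal sketch. This is not the WEKA implementation used in the study: it is a plain-Python analogy of what a string-to-word-vector step followed by a multinomial Naïve Bayes classifier might look like. The function names, the class name, and the toy questions and labels are all hypothetical and invented for illustration; they are not drawn from the published dataset.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a string and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def to_word_vector(text, vocabulary):
    """Map a string to word counts over a fixed vocabulary (bag of words)."""
    counts = Counter(tokenize(text))
    return [counts[word] for word in vocabulary]


class MultinomialNaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, vectors, labels):
        self.classes = sorted(set(labels))
        n_features = len(vectors[0])
        class_totals = Counter(labels)
        # Sum the word counts of all training vectors, per class.
        word_counts = defaultdict(lambda: [0] * n_features)
        for vector, label in zip(vectors, labels):
            for i, count in enumerate(vector):
                word_counts[label][i] += count
        self.log_prior = {
            c: math.log(class_totals[c] / len(labels)) for c in self.classes
        }
        self.log_likelihood = {}
        for c in self.classes:
            total = sum(word_counts[c]) + n_features  # add-one smoothing
            self.log_likelihood[c] = [
                math.log((n + 1) / total) for n in word_counts[c]
            ]
        return self

    def predict(self, vector):
        def score(c):
            return self.log_prior[c] + sum(
                count * ll for count, ll in zip(vector, self.log_likelihood[c])
            )
        return max(self.classes, key=score)


# Hypothetical toy data, invented for illustration only.
train_texts = [
    "choose the correct word",
    "choose the correct letter",
    "analyze the main idea of the paragraph",
    "analyze the moral message of the story",
]
train_labels = ["easy", "easy", "hard", "hard"]

# Build the vocabulary from the training texts, then vectorize them.
vocabulary = sorted({w for t in train_texts for w in tokenize(t)})
train_vectors = [to_word_vector(t, vocabulary) for t in train_texts]
model = MultinomialNaiveBayes().fit(train_vectors, train_labels)

prediction = model.predict(to_word_vector("choose the correct answer", vocabulary))
print(prediction)  # prints "easy"
```

Words that are absent from the training vocabulary (such as "answer" above) are simply ignored at prediction time, which is one reason the quality and coverage of the training data affect the result.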
License
The copyright of an article accepted for publication shall be assigned to Jurnal Sisfokom (Sistem Informasi dan Komputer) and LPPM ISB Atma Luhur as the publisher of the journal. Copyright includes the right to reproduce and deliver the article in all forms and media, including reprints, photographs, microfilms, and any other similar reproductions, as well as translations.
Jurnal Sisfokom (Sistem Informasi dan Komputer), LPPM ISB Atma Luhur, and the Editors make every effort to ensure that no wrong or misleading data, opinions, or statements are published in the journal. In any case, the contents of the articles and advertisements published in Jurnal Sisfokom (Sistem Informasi dan Komputer) are the sole and exclusive responsibility of their respective authors.
Jurnal Sisfokom (Sistem Informasi dan Komputer) has full publishing rights to the published articles. Authors are allowed to distribute published articles by sharing the link or DOI of the article. Authors may use their articles for any legal purpose they deem necessary without written permission from the journal, provided they acknowledge the initial publication in Jurnal Sisfokom (Sistem Informasi dan Komputer).
The Copyright Transfer Form can be downloaded: [Copyright Transfer Form Jurnal Sisfokom (Sistem Informasi dan Komputer)].
This agreement is to be signed by at least one of the authors, who must have obtained the assent of the co-author(s). After the corresponding author has submitted the signed agreement, changes of authorship or in the order of the authors listed will not be accepted. The copyright form should carry an original signature and be sent to the Editorial office as a scanned document at sisfokom@atmaluhur.ac.id.