A Decision Tree-Based Balanced Bagging Model for Imbalanced-Class Datasets
DOI:
https://doi.org/10.32736/sisfokom.v12i1.1399
Keywords:
Bagging, Balanced-Bagging, Imbalanced Class, Decision Tree, Classification
Abstract
Classification algorithms are widely used to meet practical needs, but previous studies frequently report obstacles when applying them. One of the most common is the imbalanced dataset problem. This study therefore proposes an ensemble method to address it; one well-known ensemble method is bagging. Balanced bagging is implemented to improve on the capability of the bagging algorithm. The study compares three classification models on five datasets with different imbalance ratios (IR). The models are evaluated on balanced accuracy, geometric mean, and area under the curve (AUC). The first model classifies with a Decision Tree (without bagging), the second with a Decision Tree with bagging, and the third with a Decision Tree with balanced bagging. Applying bagging and balanced bagging to the Decision Tree classifier improves balanced accuracy, geometric mean, and AUC. In general, the Decision Tree + Balanced Bagging model achieves the best performance on all of the datasets used.
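The quantities named in the abstract can be illustrated with a short standard-library Python sketch. It is not the authors' implementation: the function names are hypothetical, and the resampling shown (drawing every class down to the size of the smallest class for each bag) is one common form of balanced bagging, assumed here because the abstract does not state the exact scheme. The base learner is omitted; in the study it is a Decision Tree trained on each bag.

```python
import math
import random
from collections import Counter


def imbalance_ratio(labels):
    """IR = size of the largest class / size of the smallest class."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def balanced_bootstrap(labels, rng):
    """Return row indices for one balanced bag (assumed scheme).

    Every class is sampled with replacement to the size of the
    smallest class, so each bag has an equal class distribution.
    """
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    n_min = min(len(rows) for rows in by_class.values())
    bag = []
    for rows in by_class.values():
        bag.extend(rng.choices(rows, k=n_min))
    return bag


def binary_metrics(y_true, y_pred, positive=1):
    """Balanced accuracy and geometric mean for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "g_mean": math.sqrt(sensitivity * specificity),
    }


def auc(y_true, scores, positive=1):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (made up for illustration): 90 majority rows, 10 minority rows.
labels = [0] * 90 + [1] * 10
ir = imbalance_ratio(labels)

# One balanced bag: 10 indices per class, drawn with replacement.
bag = balanced_bootstrap(labels, random.Random(0))
bag_counts = Counter(labels[i] for i in bag)

# A classifier that always predicts the majority class has 90% plain
# accuracy, but the imbalance-aware metrics expose it.
always_majority = binary_metrics(labels, [0] * len(labels))
```

On this toy data the always-majority predictor scores 0.5 balanced accuracy and a geometric mean of 0, which suggests why the study reports these metrics rather than plain accuracy.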
License
Copyright of an article accepted for publication shall be assigned to Jurnal Sisfokom (Sistem Informasi dan Komputer) and to LPPM ISB Atma Luhur as the publisher of the journal. Copyright includes the right to reproduce and deliver the article in all forms and media, including reprints, photographs, microfilms, and any other similar reproductions, as well as translations.
Jurnal Sisfokom (Sistem Informasi dan Komputer), LPPM ISB Atma Luhur, and the Editors make every effort to ensure that no wrong or misleading data, opinions, or statements are published in the journal. In any case, the contents of the articles and advertisements published in Jurnal Sisfokom (Sistem Informasi dan Komputer) are the sole and exclusive responsibility of their respective authors.
Jurnal Sisfokom (Sistem Informasi dan Komputer) has full publishing rights to the published articles. Authors are allowed to distribute published articles by sharing the link or DOI of the article. Authors are allowed to use their articles for legal purposes deemed necessary without the written permission of the journal, provided that the initial publication in Jurnal Sisfokom (Sistem Informasi dan Komputer) is acknowledged.
The Copyright Transfer Form for Jurnal Sisfokom (Sistem Informasi dan Komputer) is available for download.
This agreement is to be signed by at least one of the authors, who must have obtained the assent of the co-author(s). After submission of this agreement signed by the corresponding author, changes to authorship or to the order of the authors listed will not be accepted. The copyright form must carry an original signature and be sent to the Editorial office as a scanned document at sisfokom@atmaluhur.ac.id.