Comparative Analysis: Machine Learning Algorithms for TOC Prediction in Pharmaceutical Water Treatment Systems
DOI:
https://doi.org/10.32736/sisfokom.v13i2.2148
Keywords:
Machine Learning, Total Organic Carbon, Pharmaceutical Water Treatment Systems, Algorithm Comparison, Water Quality Assessment
Abstract
Water quality is crucial in pharmaceutical production, where water serves as both a solvent and a raw material. Contamination with organic compounds poses a risk to product integrity and safety. Total Organic Carbon (TOC) is a key indicator of organic pollution levels in water, and an increase in TOC signals potential issues with the water treatment system. Machine learning prediction of TOC values is therefore essential for preemptive monitoring and maintenance. This study compared three machine learning algorithms, Linear Regression (RL), Random Forest (RF), and Multilayer Perceptron (MLP), for predicting TOC in pharmaceutical water treatment systems. The analysis used a dataset covering various operational conditions of pharmaceutical water treatment systems. Each algorithm was evaluated with performance metrics such as the coefficient of determination (R-squared) and prediction accuracy to assess its effectiveness in predicting TOC concentrations. A correlation coefficient approaching 1 (100%) signifies a strong relationship between model predictions and actual target values (prediction accuracy), while a smaller Mean Absolute Error (MAE) indicates higher accuracy in predicting target values. Ranked from highest to lowest correlation coefficient, the models are RF, MLP, and RL, with values of 95.04%, 93.11%, and 80.27%, respectively. The additional evaluation metrics, Root Mean Squared Error (RMSE), MAE, Relative Absolute Error (RAE), and Root Relative Squared Error (RRSE), rank the models in the same order, with RF lowest and RL highest. RF achieved the highest TOC prediction accuracy (95%) and the lowest MAE (3.9).
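The evaluation metrics named above can be illustrated with a minimal sketch. It assumes the common textbook definitions, in which RAE and RRSE compare the model's errors against a baseline that always predicts the mean of the actual values; the evaluation tool used in the study may differ in details. The function name `regression_metrics` and the sample values are hypothetical and are not taken from the study's dataset.

```python
import math


def regression_metrics(actual, predicted):
    """Return common regression metrics for paired actual/predicted values."""
    if len(actual) != len(predicted) or len(actual) < 2:
        raise ValueError("need two equally long sequences with at least 2 values")

    n = len(actual)
    mean_actual = sum(actual) / n
    mean_predicted = sum(predicted) / n

    abs_errors = [abs(p - a) for a, p in zip(actual, predicted)]
    sq_errors = [(p - a) ** 2 for a, p in zip(actual, predicted)]

    # Baseline errors: always predicting the mean of the actual values.
    baseline_abs = sum(abs(a - mean_actual) for a in actual)
    baseline_sq = sum((a - mean_actual) ** 2 for a in actual)
    predicted_sq = sum((p - mean_predicted) ** 2 for p in predicted)
    if baseline_sq == 0 or predicted_sq == 0:
        raise ValueError("metrics are undefined for constant sequences")

    covariance = sum(
        (a - mean_actual) * (p - mean_predicted)
        for a, p in zip(actual, predicted)
    )

    return {
        "mae": sum(abs_errors) / n,
        "rmse": math.sqrt(sum(sq_errors) / n),
        # RAE and RRSE are expressed as percentages of the baseline error.
        "rae_percent": 100.0 * sum(abs_errors) / baseline_abs,
        "rrse_percent": 100.0 * math.sqrt(sum(sq_errors) / baseline_sq),
        # Pearson correlation between actual and predicted values.
        "correlation": covariance / math.sqrt(baseline_sq * predicted_sq),
    }


# Hypothetical TOC readings and model predictions (illustrative only).
actual_toc = [10.0, 20.0, 30.0, 40.0]
predicted_toc = [12.0, 18.0, 33.0, 39.0]

metrics = regression_metrics(actual_toc, predicted_toc)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Under these definitions, lower MAE, RMSE, RAE, and RRSE together with a correlation closer to 1 indicate a better model, which is the basis for the ranking reported in the abstract.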
This research offers valuable insights into utilizing machine learning algorithms for TOC prediction within pharmaceutical water treatment to make informed decisions, improving water treatment systems and overall product quality.
License
The copyright of an article accepted for publication shall be assigned to Jurnal Sisfokom (Sistem Informasi dan Komputer) and LPPM ISB Atma Luhur as the publisher of the journal. Copyright includes the right to reproduce and deliver the article in all forms and media, including reprints, photographs, microfilms, and any other similar reproductions, as well as translations.
Jurnal Sisfokom (Sistem Informasi dan Komputer), LPPM ISB Atma Luhur, and the Editors make every effort to ensure that no wrong or misleading data, opinions, or statements are published in the journal. In any case, the contents of the articles and advertisements published in Jurnal Sisfokom (Sistem Informasi dan Komputer) are the sole and exclusive responsibility of their respective authors.
Jurnal Sisfokom (Sistem Informasi dan Komputer) has full publishing rights to the published articles. Authors are allowed to distribute published articles by sharing the link or DOI of the article. Authors may use their articles for any legal purpose they deem necessary without written permission from the journal, provided that initial publication in Jurnal Sisfokom (Sistem Informasi dan Komputer) is acknowledged.
The Copyright Transfer Form can be downloaded here: [Copyright Transfer Form Jurnal Sisfokom (Sistem Informasi dan Komputer)].
This agreement is to be signed by at least one of the authors, who must have obtained the assent of the co-author(s). After the agreement signed by the corresponding author has been submitted, changes of authorship or in the order of the authors listed will not be accepted. The copyright form should bear an original signature and be sent to the editorial office as a scanned document at sisfokom@atmaluhur.ac.id.