RANDOM FOREST ALGORITHM FOR LOAN STATUS PREDICTION BASED ON CREDIT SCORE
Keywords:
Class Imbalance, One-Hot Encoding, Loan Prediction, Random Forest
Abstract
The rapid development of financial technology has encouraged financial institutions to adopt data-driven credit scoring systems to minimize the risk of default. However, many loan eligibility prediction models still struggle with class imbalance and with the limited ability of traditional models to capture non-linear relationships among variables. This study develops a loan status prediction model that combines the Random Forest algorithm with the Synthetic Minority Oversampling Technique (SMOTE) and One-Hot Encoding (OHE) to improve accuracy and generalization. The study uses secondary data obtained from the public Kaggle platform, consisting of 45,000 records with 14 demographic and financial attributes. The method follows a supervised learning approach in three stages: data acquisition and preprocessing (data cleaning, normalization, encoding, and data balancing), Random Forest model training, and performance evaluation using accuracy, precision, recall, F1-score, and AUC. The combination of Random Forest, SMOTE, and OHE achieves an accuracy of 94.8%, precision of 95.6%, recall of 93.7%, F1-score of 94.6%, and an AUC of 0.972. The most influential variables in loan status prediction are credit_score, person_income, and loan_amnt. These results indicate that the approach is effective in addressing class imbalance and in improving classification accuracy when distinguishing creditworthy from non-creditworthy borrowers.
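The preprocessing and evaluation steps named in the abstract can be illustrated with a minimal sketch in standard-library Python. It is not the authors' implementation: the abstract does not say which libraries were used, and Random Forest training itself is omitted here (in practice one would likely use library implementations such as scikit-learn's `RandomForestClassifier` and imbalanced-learn's `SMOTE`). The function names, the `home_ownership` column, and all sample values below are hypothetical and chosen only for illustration.

```python
import random
from math import dist


def one_hot(values):
    """One-Hot Encoding: one 0/1 column per distinct category."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows


def smote(minority, n_new, k=2, seed=0):
    """SMOTE-style oversampling (simplified sketch).

    Each synthetic sample lies on the line segment between a randomly
    chosen minority point and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical categorical column (not taken from the study's data).
home_ownership = ["RENT", "OWN", "RENT", "MORTGAGE"]
categories, encoded = one_hot(home_ownership)

# Hypothetical minority-class samples after normalization to [0, 1].
minority_samples = [[0.10, 0.20], [0.30, 0.25], [0.20, 0.60]]
synthetic_samples = smote(minority_samples, n_new=4)

# Hypothetical true labels and predictions for a tiny evaluation set.
scores = binary_metrics([1, 1, 0, 0], [1, 0, 0, 1])
```

Because every synthetic sample is an interpolation between two existing minority points, oversampling of this kind adds no values outside the range already observed for the minority class. It should be applied to the training split only, so that the evaluation data contain no synthetic records.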
License
Copyright (c) 2026 Hadit Attaufiqqurrohman, Ade Irma Purnamasari, Denni Pratama, Nining Rahaningsih, Willy Prihartono

This work is licensed under a Creative Commons Attribution 4.0 International License.








