Explainable Machine Learning for Credit Card Default Prediction Using Web-Scraped Financial Data: A Case Study in the Peruvian Banking Sector (#451)
Date of Conference
December 1-3, 2025
Published In
"Entrepreneurship with Purpose: Social and Technological Innovation in the Age of AI"
Location of Conference
Cartagena
Authors
Aradiel Castañeda, Hilario
Mas Azahuanche, Guillermo Antonio
Mendoza Arenas, Ruben Dario
Castillo Paredes, Omar Tupac Amaru
Reinoso Palacios, Artemio Ruben
Delgado Baltazar, Marisol Paola
Mendoza Delgado, Raphael Santiago
Abstract
Credit card default is a critical challenge for Peruvian banking because it directly affects the profitability and sustainability of financial institutions. This study aimed to develop an explainable machine learning model to anticipate credit default risk using financial data obtained through web scraping from the official portals of institutions such as BBVA, BCP, Interbank, and Scotiabank. The methodology involved the automated collection of monthly interest rate data by credit type and the processing of key credit variables, including credit line utilization, payment history, monthly income, and card usage frequency. Several machine learning models were trained and evaluated; LightGBM performed best, with an accuracy of 89.4%, a recall of 86.7%, and an area under the ROC curve (AUC) of 0.94. To ensure model interpretability, SHAP (SHapley Additive exPlanations) was applied, identifying high credit usage and accumulated delinquency as the most impactful predictors. The findings suggest that integrating explainable models can significantly enhance decision-making in credit risk management. Their adoption is recommended as a strategic support tool for real-time financial profile evaluation.
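As an illustration of the three evaluation metrics reported in the abstract (accuracy, recall, and area under the ROC curve), the following is a minimal sketch using only the Python standard library. The labels and scores are made-up toy values, not the study's data, and the 0.5 decision threshold and all function names are assumptions chosen for the example; the abstract does not describe how the authors computed their metrics.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of actual defaulters (label 1) that were predicted as defaulters."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the rank (Mann-Whitney) formulation:
    the probability that a random defaulter receives a higher score than
    a random non-defaulter, with ties counted as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values): 1 = default, 0 = no default.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.7, 0.4, 0.3, 0.6, 0.8, 0.1]  # predicted default probabilities

# Assumed decision threshold of 0.5 for turning scores into class labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(f"accuracy = {accuracy(y_true, y_pred):.4f}")
print(f"recall   = {recall(y_true, y_pred):.4f}")
print(f"ROC AUC  = {roc_auc(y_true, scores):.4f}")
```

Accuracy and recall depend on the chosen threshold, whereas the AUC is computed from the raw scores and summarizes ranking quality across all thresholds. In default prediction, recall is often watched closely because a missed defaulter (false negative) is typically costlier than a false alarm.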