Diagnosing the performance of machine learning models for phishing website detection: A literature review. (#274)
Date of Conference
July 16-18, 2025
Published In
"Engineering, Artificial Intelligence, and Sustainable Technologies in service of society"
Location of Conference
Mexico
Authors
Santa Cruz-Rufasto, Frank Luis
Dios-Castillo, Christian Abraham
Abstract
Detecting phishing websites using Machine Learning (ML) techniques is a key approach in modern cybersecurity, with models such as Random Forest reaching accuracy levels close to 99%, followed by Support Vector Machine, Decision Tree and Logistic Regression. This review asks what level of accuracy ML techniques achieve in this task and which key factors affect their accuracy and effectiveness. The results highlight that the quality and diversity of the training data, together with evaluation metrics such as Accuracy, Precision and Recall, are determining factors in the performance of the models. In addition, the ability of algorithms to adapt to dynamic attack patterns is crucial. This study, a systematic review following the PRISMA statement, analyzed 43 articles selected from more than 4,600 initial records, revealing the importance of developing computationally efficient methods that maintain high levels of accuracy to address growing digital threats.
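As an illustration of the three metrics named in the abstract, the following minimal Python sketch computes Accuracy, Precision and Recall from a confusion matrix. It is not taken from the reviewed studies: the function names and the ten toy labels are invented for this example, and the label 1 is assumed to mean "phishing".

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1  # phishing site correctly flagged
        elif pred == positive:
            fp += 1  # legitimate site wrongly flagged
        elif truth == positive:
            fn += 1  # phishing site missed
        else:
            tn += 1  # legitimate site correctly passed
    return tp, fp, fn, tn


def evaluate(y_true, y_pred, positive=1):
    """Return Accuracy, Precision and Recall as a dict of floats."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Toy, made-up labels (assumption): 3 phishing sites among 10 URLs.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]

scores = evaluate(y_true, y_pred)
print(scores)
```

On this toy input the classifier is right on 8 of 10 URLs, yet it catches only 2 of the 3 phishing sites. This is why Accuracy alone can look better than Recall when legitimate sites outnumber phishing ones.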