Diagnosing the performance of machine learning models for phishing website detection: A literature review.

Santa Cruz-Rufasto, Frank Luis; Dios-Castillo, Christian Abraham

Diagnosing the performance of machine learning models for phishing website detection: A literature review. (#274)

Date of Conference

July 16-18, 2025

Published In

"Engineering, Artificial Intelligence, and Sustainable Technologies in service of society"

Location of Conference

Mexico

Authors

Santa Cruz-Rufasto, Frank Luis

Dios-Castillo, Christian Abraham

Abstract

Detecting phishing websites with Machine Learning (ML) techniques is a key approach in modern cybersecurity. Models such as Random Forest reach accuracy levels close to 99%, followed by Support Vector Machine, Decision Tree, and Logistic Regression. This review asks two questions: how accurate are ML techniques at this task, and which factors most affect their accuracy and effectiveness? The results highlight that the quality and diversity of the training data, together with metrics such as Accuracy, Precision, and Recall, are decisive for model performance. In addition, the ability of algorithms to adapt to dynamic attack patterns is crucial. This study, a systematic review following the PRISMA statement, analyzed 43 articles selected from an initial pool of more than 4,600, and points to the importance of developing computationally efficient methods that maintain high accuracy in the face of growing digital threats.
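As a brief illustration of the three metrics named in the abstract, the following minimal Python sketch computes Accuracy, Precision, and Recall from confusion-matrix counts. The function name and the counts are hypothetical and chosen only for illustration; they are not taken from any of the reviewed studies.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return accuracy, precision, and recall from confusion-matrix counts.

    tp: phishing sites correctly flagged      fp: legitimate sites wrongly flagged
    tn: legitimate sites correctly passed     fn: phishing sites missed
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Hypothetical, imbalanced sample: 1,000 URLs, of which 50 are phishing.
metrics = classification_metrics(tp=30, fp=5, tn=945, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these assumed counts, accuracy is 0.975 while recall is only 0.600, which suggests why accuracy is usually read alongside precision and recall when phishing sites are a small share of the data.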
