Predictive Accuracy in the Detection of Breast Cancer through Machine Learning (#2051)
Date of Conference
July 16-18, 2025
Published In
"Engineering, Artificial Intelligence, and Sustainable Technologies in service of society"
Location of Conference
Mexico
Authors
Patiño-Pérez, Darwin
Burgos-Robalino, Freddy
Reyes-Sánchez, Zynnia
Ramírez-Hecksher, Ana
Munive-Mora, Celia
Abstract
Breast cancer is one of the most significant health problems worldwide, and early detection is crucial to improving clinical outcomes for patients. In this context, machine learning models have become valuable tools for predicting the presence of the disease with greater precision. This study presents a comparative analysis of three machine learning models (logistic regression, random forest, and support vector machines (SVM)) using the Wisconsin Breast Cancer Diagnosis dataset. The dataset comprises features derived from fine-needle aspiration images of breast masses and contains 357 benign and 212 malignant cases. The results show that the random forest model outperforms the other two in predictive accuracy. Using the top five predictors ("concave point mean", "area mean", "radius mean", "perimeter mean", and "concavity mean"), this model achieves an accuracy of about 94.15% and a cross-validation score of about 95.61% on the test dataset. These findings highlight the effectiveness of random forest in identifying complex patterns in the data, making it a promising tool for breast cancer prediction. In conclusion, the study demonstrates the potential of machine learning models, particularly random forest, to improve the early detection of breast cancer. Such advances could have a significant impact on clinical practice by enabling more accurate and timely diagnoses, which in turn could improve patient outcomes and reduce the mortality associated with the disease.
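The comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline. It assumes scikit-learn's bundled copy of the Wisconsin diagnostic data (`load_breast_cancer`), maps the five named predictors to that library's column names (for example "concave point mean" to `mean concave points`), and picks a 70/30 stratified split, a fixed seed, 5-fold cross-validation, feature scaling for the linear and SVM models, and default hyperparameters. None of these choices is stated in the abstract, so the scores it produces need not match the reported figures.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# The five predictors named in the abstract, written with scikit-learn's
# column names (assumed mapping, e.g. "area mean" -> "mean area").
TOP_FEATURES = [
    "mean concave points",
    "mean area",
    "mean radius",
    "mean perimeter",
    "mean concavity",
]


def load_top_features():
    """Return the five selected feature columns and the class labels."""
    data = load_breast_cancer()
    names = list(data.feature_names)
    columns = [names.index(feature) for feature in TOP_FEATURES]
    X = data.data[:, columns]
    y = data.target  # in scikit-learn's copy: 0 = malignant, 1 = benign
    return X, y


def compare_models(test_size=0.3, seed=42, folds=5):
    """Fit the three classifiers and report held-out and cross-validated accuracy.

    test_size, seed, and folds are illustrative assumptions, not values
    taken from the paper.
    """
    X, y = load_top_features()
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, stratify=y, random_state=seed
    )

    models = {
        # Scaling helps logistic regression and SVM; tree ensembles do not need it.
        "logistic_regression": make_pipeline(
            StandardScaler(), LogisticRegression(max_iter=1000)
        ),
        "random_forest": RandomForestClassifier(random_state=seed),
        "svm": make_pipeline(StandardScaler(), SVC()),
    }

    results = {}
    for name, model in models.items():
        # Cross-validate on the training portion only, then score once on the
        # held-out test portion.
        cv_accuracy = cross_val_score(model, X_train, y_train, cv=folds).mean()
        model.fit(X_train, y_train)
        results[name] = {
            "cv_accuracy": float(cv_accuracy),
            "test_accuracy": float(model.score(X_test, y_test)),
        }
    return results
```

Calling `compare_models()` returns a dictionary keyed by model name, with a cross-validated accuracy and a held-out accuracy for each. Keeping cross-validation inside the training split means the test portion is used only once per model, which is one common way to keep the two scores separate.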