Exploring Scientific Discourses on Women in Engineering and Sustainability through Web Scraping and LDA Analysis with R (2020–2025) (#844)

Date of Conference

December 1-3, 2025

Published In

"Entrepreneurship with Purpose: Social and Technological Innovation in the Age of AI"

Location of Conference

Cartagena

Authors

Murcia Zorrilla, Claudia Patricia

Sinisterra Diaz, María Mercedes

Abstract

This study analyzes recent scientific discourses on women’s participation in engineering and sustainability by applying text mining to open-access publications. A total of 766 scientific articles published between 2020 and 2025 were collected from the PLOS ONE database using web scraping techniques in R. Latent Dirichlet Allocation (LDA) topic modeling of their abstracts identified five dominant discursive axes: reproductive health, gender-based violence, STEM education, maternal health, and community participation. The results reveal narrative patterns linking gender challenges to sustainability in scientific, educational, and social contexts. This work provides evidence to inform the design of inclusive policies and encourages debate on gender equality in disciplines that are strategic for sustainable development.
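
For readers unfamiliar with LDA, the topic-modeling step described in the abstract can be illustrated with a minimal sketch. It is written in Python rather than the R used in the study, and it uses a small collapsed Gibbs sampler, which is only one common way to fit LDA and may differ from the authors' estimation method. The function names (`fit_lda`, `top_terms`), the hyperparameters `alpha` and `beta`, and the toy token lists are all hypothetical; none of them come from the paper or its corpus.

```python
import random


def fit_lda(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling over tokenized documents.

    Returns the vocabulary, per-document topic counts, and per-topic word counts.
    alpha and beta are assumed symmetric Dirichlet priors (illustrative values).
    """
    rng = random.Random(seed)
    vocab = sorted({word for doc in docs for word in doc})
    word_id = {word: i for i, word in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]                # tokens of doc d in topic k
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # uses of word w in topic k
    topic_total = [0] * num_topics                              # tokens assigned to topic k
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        doc_assignments = []
        for word in doc:
            k = rng.randrange(num_topics)
            doc_assignments.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[word]] += 1
            topic_total[k] += 1
        assignments.append(doc_assignments)

    # Resample each token's topic conditioned on all other assignments.
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, word in enumerate(doc):
                w = word_id[word]
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1

                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]

                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    return vocab, doc_topic, topic_word


def top_terms(vocab, topic_word, n=3):
    """Return the n highest-count words for each topic."""
    return [
        [vocab[i] for i in sorted(range(len(vocab)), key=lambda i: -row[i])[:n]]
        for row in topic_word
    ]


# Invented toy "abstracts", already tokenized; NOT data from the study.
toy_docs = [
    ["maternal", "health", "clinic", "health", "care"],
    ["stem", "education", "students", "engineering", "education"],
    ["maternal", "care", "clinic", "pregnancy", "health"],
    ["engineering", "students", "stem", "university", "education"],
]

if __name__ == "__main__":
    vocab, doc_topic, topic_word = fit_lda(toy_docs, num_topics=2)
    for k, terms in enumerate(top_terms(vocab, topic_word)):
        print(f"Topic {k}: {', '.join(terms)}")
```

In a workflow like the one the abstract describes, the token lists would come from the scraped abstracts after cleaning (lowercasing, stop-word removal), and the number of topics would be set to the value under study, which the abstract reports as five.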
