Exploring Scientific Discourses on Women in Engineering and Sustainability through Web Scraping and LDA Analysis with R (2020–2025) (#844)
Date of Conference
December 1-3, 2025
Published In
"Entrepreneurship with Purpose: Social and Technological Innovation in the Age of AI"
Location of Conference
Cartagena
Authors
Murcia Zorrilla, Claudia Patricia
Sinisterra Diaz, María Mercedes
Abstract
This study analyzes recent scientific discourses on women’s participation in engineering and sustainability using a text mining approach applied to open-access publications. A total of 766 scientific articles published between 2020 and 2025 were collected from the PLOS ONE database using web scraping techniques in R. Latent Dirichlet Allocation (LDA) topic modeling was then applied to their abstracts, identifying five dominant discursive axes: reproductive health, gender-based violence, STEM education, maternal health, and community participation. The results reveal narrative patterns that link gender challenges to sustainability in scientific, educational, and social contexts. This work provides evidence to inform the design of inclusive policies and encourages debate on gender equality in disciplines that are strategic for sustainable development.
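To illustrate the topic modeling step described in the abstract, the following is a minimal sketch of LDA fitted by collapsed Gibbs sampling. It is not the authors' code: the study was carried out in R, while this sketch uses only the Python standard library. The toy abstracts, the stopword list, the function names (`tokenize`, `fit_lda`, `top_terms`), the hyperparameters `alpha` and `beta`, and the choice of two topics are all assumptions made for illustration; the study reports five topics over 766 abstracts.

```python
import random

# Hypothetical toy abstracts, NOT taken from the study's corpus.
TOY_ABSTRACTS = [
    "Women engineering students face barriers in STEM education.",
    "STEM education programs support women in engineering careers.",
    "Maternal health services improve outcomes for women and children.",
    "Community participation strengthens maternal health programs.",
]
STOPWORDS = {"in", "for", "and", "the", "of", "a", "to"}  # assumed minimal list


def tokenize(text, stopwords):
    """Lowercase, strip basic punctuation, and drop stopwords."""
    words = [w.strip(".,;:()").lower() for w in text.split()]
    return [w for w in words if w and w not in stopwords]


def fit_lda(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA with collapsed Gibbs sampling on tokenized documents."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)
    doc_topic = [[0] * num_topics for _ in docs]            # tokens per (doc, topic)
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # tokens per (topic, word)
    topic_total = [0] * num_topics                          # tokens per topic
    assignments = []
    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)
    # Resample each token's topic conditioned on all other assignments.
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1
    # Smoothed per-topic word distributions (each row sums to 1).
    phi = [
        [(topic_word[t][v] + beta) / (topic_total[t] + vocab_size * beta)
         for v in range(vocab_size)]
        for t in range(num_topics)
    ]
    return vocab, phi, doc_topic


def top_terms(vocab, phi, n=3):
    """Return the n highest-probability words for each topic."""
    return [
        [vocab[v] for v in sorted(range(len(vocab)), key=lambda v: -row[v])[:n]]
        for row in phi
    ]


docs = [tokenize(text, STOPWORDS) for text in TOY_ABSTRACTS]
vocab, phi, doc_topic = fit_lda(docs, num_topics=2)
for topic_index, terms in enumerate(top_terms(vocab, phi)):
    print(topic_index, terms)
```

On a real corpus, the highest-probability terms of each topic are what an analyst would read and label by hand, which is presumably how axes such as "STEM education" or "maternal health" were named. With a corpus this small the topics are not meaningful, so the printed terms should be read only as a demonstration of the mechanics.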