News Categorisation Based on Pre-Trained Transformer Models (#1076)
Date of Conference
July 19-21, 2023
Published In
"Leadership in Education and Innovation in Engineering in the Framework of Global Transformations: Integration and Alliances for Integral Development"
Location of Conference
Buenos Aires
Authors
Espin-Riofrio, César
Murillo-Cepeda, Vanessa
García-Zambrano, David
Mendoza Morán, Verónica
Montejo-Ráez, Arturo
Zumba Gamboa, Johanna
Abstract
The rise of digital journalism, the volume of news, and the growing number of people accessing this content give third parties, through web platforms and social networks, the opportunity to persuade readers with content that alters their opinion or behaviour on a topic; it is therefore necessary to classify news using Natural Language Processing (NLP) techniques. This work experiments with pre-trained Transformer models, using transfer learning and fine-tuning, to obtain a model capable of determining whether a news item is satire, opinion, or information. To do so, we use a labelled dataset of English-language news presented for the SemEval 2023 campaign, translating it into Spanish to experiment in that language as well. We apply pre-trained Transformer models to text classification tasks in both languages and compare several models and their predictions using evaluation metrics. The results give indications of the goodness of the models on subjective news, in the case of satire and opinion, and objective news in the case of information, contributing to future research on text classification, specifically news categorisation.
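The abstract mentions comparing model predictions with evaluation metrics over the three classes (satire, opinion, information). As a minimal illustrative sketch, not the authors' actual evaluation code, the snippet below computes per-class and macro-averaged F1 for a three-way news classifier; the gold labels and predictions are made up for demonstration.

```python
LABELS = ["satire", "opinion", "information"]  # label set described in the abstract

def macro_f1(y_true, y_pred):
    """Compute per-class F1 and the macro-averaged F1 for the 3-way task."""
    f1s = {}
    for label in LABELS:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if p == label and t != label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1s[label] = (2 * precision * recall / (precision + recall)
                      if precision + recall else 0.0)
    return f1s, sum(f1s.values()) / len(LABELS)

# Toy example: hypothetical gold labels vs. one model's predictions.
gold = ["satire", "opinion", "information", "information", "opinion", "satire"]
pred = ["satire", "information", "information", "information", "opinion", "opinion"]
per_class, macro = macro_f1(gold, pred)
```

Macro-averaging weights each class equally, which matters here because satire is typically much rarer than informative news, so a plain accuracy score would hide poor performance on the minority class.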