News Categorisation Based on Pre-Trained Transformer Models (#1076)
Date of Conference
July 19-21, 2023
Published In
"Leadership in Education and Innovation in Engineering in the Framework of Global Transformations: Integration and Alliances for Integral Development"
Location of Conference
Buenos Aires
Authors
Espin-Riofrio, César
Murillo-Cepeda, Vanessa
García-Zambrano, David
Mendoza Morán, Verónica
Montejo-Ráez, Arturo
Zumba Gamboa, Johanna
Abstract
The rise of digital journalism, the volume of news, and the growing number of people accessing this content give third parties, through web platforms and social networks, the opportunity to persuade readers with content that alters their opinion or behaviour on a topic; it is therefore necessary to classify news using Natural Language Processing (NLP) techniques. This work experiments with pre-trained Transformer models, using transfer learning and fine-tuning, to obtain a model capable of determining whether a news item is satire, opinion, or information. To do so, we use a labelled dataset of English-language news presented for the SemEval 2023 campaign, translating it into Spanish to experiment in that language as well. We apply pre-trained Transformer models to text classification tasks in both languages and compare several models and their predictions using evaluation metrics. The results give indications of the goodness of the models on subjective news, in the case of satire and opinion, and objective news in the case of information, contributing to future research on text classification, specifically news categorisation.
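The abstract mentions comparing model predictions with evaluation metrics over the three classes (satire, opinion, information). As a minimal illustrative sketch, not the authors' actual evaluation code, the snippet below computes per-class and macro-averaged F1 for a three-way news classifier; the gold labels and predictions are made up for demonstration.

```python
LABELS = ["satire", "opinion", "information"]  # label set described in the abstract

def macro_f1(y_true, y_pred):
    """Compute per-class F1 and the macro-averaged F1 for the 3-way task."""
    f1s = {}
    for label in LABELS:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if p == label and t != label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1s[label] = (2 * precision * recall / (precision + recall)
                      if precision + recall else 0.0)
    return f1s, sum(f1s.values()) / len(LABELS)

# Toy example: hypothetical gold labels vs. one model's predictions.
gold = ["satire", "opinion", "information", "information", "opinion", "satire"]
pred = ["satire", "information", "information", "information", "opinion", "opinion"]
per_class, macro = macro_f1(gold, pred)
```

Macro-averaging weights each class equally, which matters here because satire is typically much rarer than informative news, so a plain accuracy score would hide poor performance on the minority class.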