Comparative Evaluation of Gemini and Copilot Performance in University Entrance Exams: A Systematic Analysis Based on Multiple-Choice Questions and Images

Medina Llerena, Diego Alonso; Velarde Lam, Diego Manuel

Comparative Evaluation of Gemini and Copilot Performance in University Entrance Exams: A Systematic Analysis Based on Multiple-Choice Questions and Images (#431)

Date of Conference

December 1-3, 2025

Published In

"Entrepreneurship with Purpose: Social and Technological Innovation in the Age of AI"

Location of Conference

Cartagena

Authors

Medina Llerena, Diego Alonso

Velarde Lam, Diego Manuel

Abstract

This study compared the performance of Gemini and Copilot in solving multiple-choice questions that required interpreting text and images, drawn from the entrance exams of a prestigious Peruvian university across its various faculties over the past three years. The study covered 838 questions, 83 of which were analyzed as images. Overall, Copilot achieved a higher proportion of correct answers, at 75% (627/838) versus 67% (561/838) for Gemini. Both AIs performed significantly worse in image analysis, with 36.1% (30/83) correct answers for Gemini and 39.8% (33/83) for Copilot. These findings highlight the need to improve accuracy in image processing, as well as the importance of understanding the current limitations of these tools in order to optimize their performance and integration into the academic field.
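
The reported proportions can be re-derived from the counts given in the abstract. The short Python sketch below is illustrative only and is not the authors' analysis code: the names `overall`, `images`, `pct`, and `non_image` are hypothetical, and the non-image figures are not reported in the abstract but are obtained by subtraction, assuming the 83 image questions are a subset of the 838.

```python
# Counts as reported in the abstract: (correct answers, total questions).
overall = {"Gemini": (561, 838), "Copilot": (627, 838)}
images = {"Gemini": (30, 83), "Copilot": (33, 83)}


def pct(correct, total):
    """Return the share of correct answers as a percentage, to one decimal."""
    return round(100 * correct / total, 1)


def non_image(model):
    """Derived, not reported: counts for the questions not analyzed as images.

    Assumes the image questions are a subset of the overall question set.
    """
    correct_all, total_all = overall[model]
    correct_img, total_img = images[model]
    return correct_all - correct_img, total_all - total_img


for model in overall:
    print(
        model,
        "overall:", pct(*overall[model]),
        "images:", pct(*images[model]),
        "non-image (derived):", pct(*non_image(model)),
    )
```

To one decimal place the overall figures are 66.9% for Gemini and 74.8% for Copilot, consistent with the rounded 67% and 75% in the abstract. Under the subset assumption, the derived non-image accuracy is roughly 70.3% (531/755) for Gemini and 78.7% (594/755) for Copilot.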
