Dry Bean Classification. Analysis from Data Mining (#108)
Date of Conference
July 19-21, 2023
Published In
"Leadership in Education and Innovation in Engineering in the Framework of Global Transformations: Integration and Alliances for Integral Development"
Location of Conference
Buenos Aires
Authors
Castrillón, Omar Danilo
Giraldo, Jaime Alberto
Arango, Jaime Antero
Abstract
Seed classification is fundamental to agricultural processes, especially when the goal is to maximize the profitability of the harvested product and to avoid cumbersome manual sorting. This work starts from a data mining analysis to establish the most influential independent variables in the classification of a seed, in this case dry beans. The independent variables analyzed are: area, perimeter, longest axis length, shortest axis length, aspect ratio, eccentricity, convex area, equivalent diameter, extent, solidity, roundness, chromaticity, and form factors 1 to 4. A dependent variable, class, takes one of seven values: Barbunya, Bombay, Cali, Dermason, Horoz, Seker, and Sira. The J48 algorithm of the Weka machine learning and data mining platform is used to identify the class to which a seed belongs and the independent variables that most influence that decision. With an effectiveness greater than 93%, the most influential variables are found to be perimeter, longest axis length, and shortest axis length.
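To illustrate how a decision tree such as J48 (Weka's implementation of C4.5) can rank variables by influence, the following is a minimal sketch of the gain ratio criterion that C4.5 uses to choose splits. It is not the authors' Weka workflow: the helper names (entropy, gain_ratio, best_split) are hypothetical, the threshold search is simplified, and the toy measurements are invented for illustration and are not taken from the dataset used in the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def gain_ratio(values, labels, threshold):
    """Gain ratio of the binary split: values <= threshold vs. values > threshold."""
    left = [c for v, c in zip(values, labels) if v <= threshold]
    right = [c for v, c in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0  # a split that leaves one side empty carries no information
    total = len(labels)
    weights = [len(left) / total, len(right) / total]
    info_gain = entropy(labels) - (weights[0] * entropy(left) + weights[1] * entropy(right))
    split_info = -sum(w * math.log2(w) for w in weights)
    return info_gain / split_info


def best_split(values, labels):
    """Return (gain_ratio, threshold) for the best midpoint between distinct values."""
    points = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(points, points[1:])]
    return max((gain_ratio(values, labels, t), t) for t in candidates)


# Hypothetical toy measurements, NOT taken from the paper's dataset.
toy = {
    "perimeter": [520, 540, 560, 900, 920, 940],
    "solidity": [0.98, 0.99, 0.97, 0.99, 0.98, 0.97],
}
classes = ["Dermason", "Dermason", "Dermason", "Bombay", "Bombay", "Bombay"]

# Rank attributes by the best gain ratio each one can achieve on its own.
ranking = sorted(toy, key=lambda name: best_split(toy[name], classes)[0], reverse=True)
print(ranking)
```

In this toy setup the perimeter values separate the two classes cleanly while the solidity values do not, so perimeter ranks first. A full tree learner applies the same comparison recursively at every node, which is why variables chosen near the root are usually read as the most influential.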