Assessment of clustering techniques to support the analyses of soybean seed vigor

PLoS One. 2023 Aug 25;18(8):e0285566. doi: 10.1371/journal.pone.0285566. eCollection 2023.

Abstract

Soy is the main product of Brazilian agriculture and the fourth most cultivated bean globally. Because soy cultivation tends to increase and the market is large, guaranteeing product quality is indispensable for enterprises to stay competitive. Industries perform vigor tests to acquire information and evaluate the quality of soy planting. The tetrazolium test, for example, provides information about moisture damage, bedbugs, or mechanical damage. However, the cause of the damage and its severity are verified by an analyst, seed by seed. Since this work is massive and exhausting, it is susceptible to mistakes. Proposals involving different supervised learning approaches, including active learning strategies, have already been used and have brought significant results. In that context, this paper analyzes the performance of unsupervised techniques for classifying soybeans. An extensive experimental evaluation was performed, considering nine clustering algorithms (partitional, hierarchical, and density-based) applied to 5 image datasets of soybean seeds submitted to the tetrazolium test, covering different damages and/or their levels. To describe those images, we considered 18 extractors of traditional features. For validation, we also considered four metrics (accuracy, Fowlkes-Mallows, Davies-Bouldin, and Calinski-Harabasz) and two dimensionality reduction techniques (principal component analysis and t-distributed stochastic neighbor embedding). The results make it possible to identify descriptors and clustering algorithms that can be used as preprocessing in other learning processes, accelerating and improving the classification process of key agricultural problems.
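As an illustration of two of the validation metrics listed above, the following minimal sketch computes the Fowlkes (Fowlkes-Mallows) index and a simple clustering accuracy in plain Python. It is not the authors' implementation: the function names, the majority-vote mapping used for accuracy, and the toy damage labels are assumptions made for this example only.

```python
from collections import Counter, defaultdict
from itertools import combinations
from math import sqrt


def fowlkes_mallows(labels_true, labels_pred):
    """Pair-counting Fowlkes-Mallows index: TP / sqrt((TP + FP) * (TP + FN))."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("labelings must have the same length")
    tp = fp = fn = 0
    for i, j in combinations(range(len(labels_true)), 2):
        same_true = labels_true[i] == labels_true[j]
        same_pred = labels_pred[i] == labels_pred[j]
        if same_true and same_pred:
            tp += 1  # pair grouped together in both labelings
        elif same_pred:
            fp += 1  # pair grouped together only by the clustering
        elif same_true:
            fn += 1  # pair grouped together only in the ground truth
    if tp == 0:
        return 0.0
    return tp / sqrt((tp + fp) * (tp + fn))


def majority_accuracy(labels_true, labels_pred):
    """Accuracy after mapping each cluster to its most frequent true class.

    Assumption: the paper may compute accuracy differently; majority
    mapping is only one common convention.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("labelings must have the same length")
    members = defaultdict(list)
    for truth, cluster in zip(labels_true, labels_pred):
        members[cluster].append(truth)
    correct = sum(Counter(m).most_common(1)[0][1] for m in members.values())
    return correct / len(labels_true)


# Hypothetical toy data: six seeds, three damage classes, three clusters.
true_damage = ["moisture", "moisture", "bug", "bug", "mechanical", "mechanical"]
clusters = [0, 0, 1, 1, 1, 2]

fm_score = fowlkes_mallows(true_damage, clusters)
accuracy = majority_accuracy(true_damage, clusters)
```

Both metrics need ground-truth labels, whereas the Davies-Bouldin and Calinski-Harabasz indices are computed from the feature vectors and cluster assignments alone, so they are not covered by this sketch.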

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Agriculture*
  • Algorithms
  • Cluster Analysis
  • Glycine max*
  • Seeds
  • Tetrazolium Salts

Substances

  • Tetrazolium Salts

Grants and funding

This work has been supported by National Council for Scientific and Technological Development - CNPq; Coordination for the Improvement of Higher Education Personnel - CAPES; Fundação Araucária; SETI; UTFPR; and UFSCar. There was no additional external funding received for this study.