Wood identification of Cyclobalanopsis (Endl.) Oerst based on microscopic features and CTGAN-enhanced explainable machine learning models

Front Plant Sci. 2023 Jul 7:14:1203836. doi: 10.3389/fpls.2023.1203836. eCollection 2023.

Abstract

Introduction: Accurate and fast identification of wood at the species level is critical for protecting and conserving tree species resources. The current identification methods are inefficient, costly, and complex.

Methods: We propose a wood species identification model based on wood anatomy, built on a dataset of wood cell geometric features from the genus Cyclobalanopsis. The dataset was augmented with the CTGAN deep learning algorithm, which generated a simulated cell geometric feature dataset. Two machine learning models, a back-propagation neural network (BPNN) and a support vector machine (SVM), were each trained on simulated vessel cells and on simulated wood fiber cells to recognize three Cyclobalanopsis species.
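The train-on-simulated, test-on-real workflow described in the Methods can be sketched as follows. This is a minimal illustration and not the authors' implementation: the two features, the class means, and the sample sizes are hypothetical, and a per-species Gaussian sampler stands in for CTGAN so that the example runs without training a generative network. Only the SVM branch is shown, using scikit-learn.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)


def make_real_cells(n_per_species):
    """Hypothetical 'real' measurements: 3 species, 2 geometric features."""
    # Assumed class means (e.g. a diameter and a wall thickness); not paper data.
    means = np.array([[100.0, 5.0], [130.0, 6.5], [160.0, 8.0]])
    X = np.vstack(
        [rng.normal(m, [8.0, 0.5], size=(n_per_species, 2)) for m in means]
    )
    y = np.repeat(np.arange(3), n_per_species)
    return X, y


def synthesize(X, y, n_per_species):
    """Stand-in for CTGAN: sample from a Gaussian fitted to each species."""
    Xs, ys = [], []
    for label in np.unique(y):
        cls = X[y == label]
        mean, cov = cls.mean(axis=0), np.cov(cls, rowvar=False)
        Xs.append(rng.multivariate_normal(mean, cov, size=n_per_species))
        ys.append(np.full(n_per_species, label))
    return np.vstack(Xs), np.concatenate(ys)


X_real, y_real = make_real_cells(60)
X_syn, y_syn = synthesize(X_real, y_real, 500)

# Train only on simulated cells, then evaluate only on real cells.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
model.fit(X_syn, y_syn)
accuracy = model.score(X_real, y_real)
print(f"accuracy on real cells: {accuracy:.3f}")
```

Keeping the real measurements out of the training set is what makes the reported accuracy a check on the simulated data: if the generator distorts the feature distributions, the score on real cells drops.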

Results: When trained on the CTGAN-generated vessel dataset, the SVM and BPNN models achieved recognition accuracies of 96.4% and 99.6%, respectively, on the real dataset. When trained on the CTGAN-generated wood fiber dataset, the BPNN and SVM models achieved recognition accuracies of 75.5% and 77.9%, respectively, on the real dataset.

Discussion: The machine learning models trained on cell geometric feature data augmented by CTGAN achieved good recognition of Cyclobalanopsis species, with the SVM model having a higher prediction accuracy than BPNN. The models were interpreted with LIME to explore how they identify tree species from wood cell geometric features. The proposed model can be used for efficient and cost-effective identification of wood species in industrial applications.
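The idea behind a LIME-style explanation can be sketched in a few lines: perturb one sample, weight the perturbed points by their closeness to it, and fit a weighted linear model whose coefficients act as local feature importances. The code below is a simplified illustration written with NumPy only; it does not use the `lime` package and is not the authors' analysis. The black-box `toy_model`, the feature values, and the scales are hypothetical.

```python
import numpy as np


def local_surrogate(predict, x, scale, n_samples=2000, kernel_width=0.75, seed=0):
    """Fit a proximity-weighted linear model around x.

    Returns one coefficient per feature, in standardized units.
    """
    rng = np.random.default_rng(seed)
    # Perturbations in standardized units, mapped back to the feature scale.
    Z = rng.normal(0.0, 1.0, size=(n_samples, x.size))
    target = predict(x + Z * scale)
    # Exponential kernel: nearby perturbations count more.
    dist = np.linalg.norm(Z, axis=1)
    weights = np.exp(-(dist ** 2) / (kernel_width ** 2 * x.size))
    # Weighted least squares with an intercept column.
    A = np.hstack([np.ones((n_samples, 1)), Z])
    sw = np.sqrt(weights)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], target * sw, rcond=None)
    return coef[1:]


def toy_model(X):
    # Hypothetical black box: the probability of one species depends on feature 0 only.
    return 1.0 / (1.0 + np.exp(-(X[:, 0] - 130.0) / 10.0))


x = np.array([130.0, 6.5])      # one hypothetical cell measurement
scale = np.array([8.0, 0.5])    # assumed per-feature standard deviations
importance = local_surrogate(toy_model, x, scale)
print(importance)
```

Because the toy model ignores the second feature, its surrogate coefficient stays close to zero while the first is clearly positive, which is the kind of per-feature attribution used to read a classifier's decision for a single cell.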

Keywords: CTGAN; Cyclobalanopsis (Endl.) Oerst; LIME; machine learning; wood identification.

Grants and funding

This study was supported by the Forestry Promotion Project of Fujian Province, China (Grant #: 2021TG13).