A real-time phenotyping framework using machine learning for plant stress severity rating in soybean

Plant Methods. 2017 Apr 8;13:23. doi: 10.1186/s13007-017-0173-7. eCollection 2017.

Abstract

Background: Phenotyping is a critical component of plant research. Accurate and precise trait collection, when integrated with genetic tools, can greatly accelerate the rate of genetic gain in crop improvement. However, efficient and automatic phenotyping of traits across large populations is a challenge, which is further exacerbated by the necessity of sampling multiple environments and growing replicated trials. A promising approach is to leverage current advances in imaging technology, data analytics and machine learning to enable fast, automated phenotyping and subsequent decision support. In this context, the workflow for phenotyping (image capture → data storage and curation → trait extraction → machine learning/classification → models/apps for decision support) has to be carefully designed and efficiently executed to minimize resource usage and maximize utility. We illustrate such an end-to-end phenotyping workflow for the case of plant stress severity phenotyping in soybean, with a specific focus on the rapid and automatic assessment of iron deficiency chlorosis (IDC) severity on thousands of field plots. We showcase this analytics framework by extracting IDC features from a set of ~4500 unique canopies, representing a diverse germplasm base with different levels of IDC, and subsequently training a variety of classification models to predict plant stress severity. The best classifier is then deployed as a smartphone app for rapid, real-time severity rating in the field.

Results: We investigated 10 different classification approaches; the best was a hierarchical classifier with a mean per-class accuracy of ~96%. We constructed a phenotypically meaningful 'population canopy graph', connecting the automatically extracted canopy trait features with plant stress severity rating. We incorporated this image capture → image processing → classification workflow into a smartphone app that enables automated real-time evaluation of IDC scores using digital images of the canopy.
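
The abstract does not define "mean per-class accuracy". A minimal sketch follows, assuming the common definition: accuracy is computed separately for each true severity class and the per-class values are then averaged, so that rare severity classes weigh as much as common ones. The function name and the example ratings are hypothetical and are not taken from the study's data.

```python
from collections import defaultdict


def mean_per_class_accuracy(y_true, y_pred):
    """Average the per-class accuracy (recall) over all classes in y_true.

    Assumed definition; the abstract does not spell out the formula.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1

    # Each class contributes equally, regardless of how many samples it has.
    return sum(correct[c] / total[c] for c in total) / len(total)


# Hypothetical ratings: class 1 is common, class 2 is rare.
y_true = [1, 1, 1, 1, 1, 1, 2, 2]
y_pred = [1, 1, 1, 1, 1, 1, 2, 1]

score = mean_per_class_accuracy(y_true, y_pred)
overall = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(score, overall)
```

In this toy case the overall accuracy is 0.875 while the mean per-class accuracy is 0.75, because the single error falls in the rare class. This is why a per-class average is a stricter summary when severity classes are imbalanced.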

Conclusion: We expect this high-throughput framework to help increase the rate of genetic gain by providing a robust, extendable framework for other abiotic and biotic stresses. We further envision this workflow embedded onto a high-throughput phenotyping ground vehicle and unmanned aerial system that will allow real-time, automated stress trait detection and quantification for plant research, breeding and stress scouting applications.

Keywords: High-throughput phenotyping; Image analysis; Machine learning; Plant stress; Smartphone.