Abstract
Prediction of breeding values and phenotypes is central to plant breeding and has been revolutionized by the adoption of genomic selection (GS). Applying machine and deep learning algorithms to complex traits in plants can improve prediction accuracy in the context of GS. Spectral reflectance indices additionally provide information about physiological parameters that were previously undetectable in plants. This research explores the potential of multi-trait (MT) machine and deep learning models for predicting grain yield and grain protein content in wheat using spectral information in GS models. The study compares the performance of four machine and deep learning-based uni-trait (UT) and MT models with traditional GBLUP and Bayesian models. The dataset consisted of 650 recombinant inbred lines from a spring wheat breeding program, grown for three years (2014-2016); spectral data were collected at the heading and grain filling stages. MT-GS models performed 0 to 28.5% and −0.04 to 15% better than the UT-GS models for predicting grain yield and grain protein content, respectively. Random forest and multilayer perceptron were the best-performing machine and deep learning models for predicting both traits, and they performed similarly under the UT-GS and MT-GS frameworks. The four Bayesian models explored gave similar accuracies, which were lower than those of the machine and deep learning-based models, and required more computational time. Green normalized difference vegetation index best predicted grain protein content in seven of the nine MT-GS models. Overall, this study concluded that machine and deep learning-based MT-GS models increased prediction accuracy and should be employed in large-scale breeding programs.
Core Ideas
Potential for combining high-throughput phenotyping with machine and deep learning in breeding.
Multi-trait models exploit information from secondary correlated traits efficiently.
Spectral information improves genomic selection models.
Deep learning can aid plant breeders owing to the increased data generated in breeding programs.
Competing Interest Statement
The authors have declared no competing interest.
Abbreviations
- ARI: anthocyanin reflectance index
- CNN: convolutional neural network
- GBLUP: genomic best linear unbiased predictor
- GEBVs: genomic estimated breeding values
- GNDVI: green normalized difference vegetation index
- GS: genomic selection
- MLP: multilayer perceptron
- MT: multi-trait
- NCPI: normalized chlorophyll pigment ratio index
- NDVI: normalized difference vegetation index
- NWI: normalized water index
- PRI: photochemical reflectance index
- RF: random forest
- SVM: support vector machine
- UT: uni-trait
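
Two of the indices abbreviated above, NDVI and GNDVI, are commonly defined as normalized differences between near-infrared (NIR) reflectance and red or green reflectance. The following is a minimal sketch of those common definitions, not the study's own processing pipeline: the function names, the reflectance values, and the choice of bands are all illustrative assumptions, and the exact wavelengths used in this work are not stated here.

```python
import numpy as np


def normalized_difference(band_a, band_b):
    """Return (band_a - band_b) / (band_a + band_b), element-wise."""
    a = np.asarray(band_a, dtype=float)
    b = np.asarray(band_b, dtype=float)
    return (a - b) / (a + b)


def ndvi(nir, red):
    """NDVI under the common definition: (NIR - Red) / (NIR + Red)."""
    return normalized_difference(nir, red)


def gndvi(nir, green):
    """GNDVI under the common definition: (NIR - Green) / (NIR + Green)."""
    return normalized_difference(nir, green)


# Hypothetical canopy reflectance values for three plots (not study data).
nir = np.array([0.50, 0.60, 0.45])
red = np.array([0.10, 0.20, 0.15])
green = np.array([0.10, 0.15, 0.05])

ndvi_values = ndvi(nir, red)
gndvi_values = gndvi(nir, green)
```

Both indices are bounded between −1 and 1 for non-negative reflectance, which keeps them comparable across plots and measurement dates when they are used as secondary traits.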