Abstract
Background Genomic prediction models were, in principle, developed to include all available marker information; with this approach, they have shown moderate to high predictive accuracies in various crops. Previous studies in cassava have demonstrated that prediction models are feasible for genomic selection even with relatively small training populations and low-density GBS markers. In the present study, we prioritized SNPs in close proximity to genome regions of biological importance for a given trait. We used several strategies to select variants that were then included in single and multiple kernel GBLUP models. Specifically, our sources of information were transcriptomics, GWAS, and immunity-related genes, with the ultimate goal of increasing predictive accuracies for Cassava Brown Streak Disease (CBSD) severity.
Results We used single and multi-kernel GBLUP models with markers imputed to whole genome sequence level to accommodate various sources of biological information; fitting more than one kinship matrix allowed for differential weighting of the individual marker relationships. We applied these GBLUP approaches to CBSD phenotypes (i.e., root infection and leaf severity three and six months after planting) in a Ugandan breeding population (n = 955). We exploited an established RNAseq experiment of CBSD-infected cassava plants in three ways. Compared to the biology-agnostic GBLUP model, the informed multi-kernel models increased prediction accuracy only marginally (1.78% to 2.52%).
Conclusions Our results show that markers imputed to whole genome sequence level do not provide enhanced prediction accuracies compared to using standard GBS marker data in cassava. The use of transcriptomics data and other sources of biological information resulted in prediction accuracies that were nominally superior to those obtained from traditional prediction models.
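To make the multi-kernel idea concrete, the following is a minimal sketch, not the analysis pipeline used in this study. It assumes a 0/1/2 dosage matrix, a VanRaden-style relationship matrix per marker set, and variance components that are already known (in practice they would be estimated, e.g. by REML). The names `vanraden_grm`, `multikernel_gblup`, and `prioritized`, the simulated data, and the variance values are all hypothetical and chosen only for illustration.

```python
import numpy as np


def vanraden_grm(M):
    """Genomic relationship matrix from a dosage matrix M (individuals x SNPs, coded 0/1/2)."""
    p = M.mean(axis=0) / 2.0              # allele frequency per SNP
    Z = M - 2.0 * p                       # centre each SNP column
    denom = 2.0 * np.sum(p * (1.0 - p))   # scaling so the average diagonal is near 1
    return Z @ Z.T / denom


def multikernel_gblup(y, kernels, var_components, var_e, train):
    """Predict genetic values for all individuals from the training subset.

    Assumes known variance components (one per kernel) and residual variance,
    and uses the training mean as a simple stand-in for the fixed effect.
    """
    # Total genetic covariance: each kernel weighted by its own variance component.
    G = sum(s * K for s, K in zip(var_components, kernels))
    V = G[np.ix_(train, train)] + var_e * np.eye(int(train.sum()))
    mu = y[train].mean()
    alpha = np.linalg.solve(V, y[train] - mu)
    return mu + G[:, train] @ alpha


# Simulated example: 60 individuals, 200 SNPs, 20 of them "prioritized".
rng = np.random.default_rng(0)
M = rng.integers(0, 3, size=(60, 200)).astype(float)
prioritized = np.zeros(200, dtype=bool)
prioritized[:20] = True                   # hypothetical biologically informed SNP set

K_bio = vanraden_grm(M[:, prioritized])   # kernel from prioritized SNPs
K_rest = vanraden_grm(M[:, ~prioritized]) # kernel from the remaining SNPs

y = rng.normal(size=60)                   # placeholder phenotypes
train = np.arange(60) < 45                # first 45 individuals form the training set

pred = multikernel_gblup(y, [K_bio, K_rest], [0.3, 0.2], 0.5, train)
```

Because each kernel carries its own variance component, markers in the prioritized set can contribute more (or less) per SNP to the predicted values than markers in the remainder, which is the differential weighting referred to above. A single-kernel model is recovered by passing one relationship matrix built from all SNPs.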
Abbreviations
- GS: Genomic Selection
- GWAS: Genome-Wide Association Studies
- CBSD: Cassava Brown Streak Disease
- CBSV: Cassava Brown Streak Virus
- UCBSV: Ugandan Cassava Brown Streak Virus
- GBS: Genotyping-By-Sequencing
- BLUP: Best Linear Unbiased Prediction
- GEBV: Genomic Estimated Breeding Values
- LD: Linkage Disequilibrium
- SNP: Single Nucleotide Polymorphism
- DE: Differentially Expressed
- EGV: Estimated Genetic Value