Abstract
Norway spruce (Picea abies (L.) Karst.) is a conifer species of great economic and ecological importance. As with most conifers, the P. abies genome is very large (~20 Gbp) and contains high levels of repetitive DNA. The current genome assembly (v1.0) covers approximately 60% of the total genome size but is highly fragmented, consisting of more than 10 million scaffolds. Even though 66,632 protein-coding gene models are annotated, the fragmented nature of the assembly means that little information is currently available on how these genes are physically distributed over the 12 P. abies chromosomes. By creating an ultra-dense genetic linkage map, we can anchor and order scaffolds at the pseudo-chromosomal level in P. abies, which complements the fine-scale information available in the assembly contigs. Our ultra-dense haploid consensus genetic map consists of 15,005 markers from 14,336 scaffolds and anchors 17,079 gene models (25.6% of protein-coding gene annotations) to the 12 linkage groups (pseudo-chromosomes). Three independent component maps, as well as comparisons to previously published Picea maps, are used to evaluate the accuracy and marker order of the linkage groups. We demonstrate that approximately 3.8% of the scaffolds and 1.6% of the gene models covered by the consensus map are likely misassembled, as they contain genetic markers that map to different regions or linkage groups of the P. abies linkage map. We also evaluate the utility of the genetic map for the conifer research community by using an independent data set of unrelated individuals to assess genome-wide variation in genetic diversity across the genomic regions anchored to chromosomes. The results show that our map is dense enough to allow detailed evolutionary analysis across the P. abies genome.