TY - JOUR
T1 - An improved genome assembly uncovers a prolific tandem repeat structure in Atlantic cod
JF - bioRxiv
DO - 10.1101/060921
SP - 060921
AU - Ole K. Tørresen
AU - Bastiaan Star
AU - Sissel Jentoft
AU - William Brynildsen Reinar
AU - Harald Grove
AU - Jason R. Miller
AU - Brian P. Walenz
AU - James Knight
AU - Jenny M. Ekholm
AU - Paul Peluso
AU - Rolf B. Edvardsen
AU - Ave Tooming-Klunderud
AU - Morten Skage
AU - Sigbjørn Lien
AU - Kjetill S. Jakobsen
AU - Alexander J. Nederbragt
Y1 - 2016/01/01
UR - http://biorxiv.org/content/early/2016/06/27/060921.abstract
N2 - Background: The first Atlantic cod (Gadus morhua) genome assembly published in 2011 was one of the early genome assemblies exclusively based on high-throughput 454 pyrosequencing. Since then, rapid advances in sequencing technologies have led to a multitude of assemblies generated from complex genomes, although many of these are of a fragmented nature with a significant fraction of bases in gaps. The development of long-read sequencing and improved software enables the generation of more contiguous genome assemblies. Results: By combining data from Illumina, 454 and the longer PacBio sequencing technologies, as well as integrating the results of multiple assembly programs, we have created a substantially improved version of the Atlantic cod genome assembly. The sequence contiguity of this assembly has increased fifty-fold and the proportion of gap-bases has been reduced 15-fold. Compared to other vertebrates, the assembly contains an unusually high density of tandem repeats (TRs). Indeed, retrospective analyses reveal that gaps in the first genome assembly were largely associated with these TRs.
We show that 21 % of the TRs across the assembly, 19 % in the promoter regions and 12 % in the coding sequences are heterozygous in the sequenced individual. Conclusions: The use of multiple assembly programs combined with inclusion of PacBio reads drastically improved the Atlantic cod genome assembly by successfully resolving long TRs. The high frequency of heterozygous TRs within or in the vicinity of genes in the genome indicates considerable standing genomic variation in Atlantic cod populations, which is likely of evolutionary importance.
ER -