De novo assembly and characterization of root transcriptome using Illumina paired-end sequencing and development of cSSR markers in sweet potato (Ipomoea batatas)

BMC Genomics. 2010 Dec 24;11:726. doi: 10.1186/1471-2164-11-726.

Abstract

Background: The tuberous root of sweet potato is an important agricultural and biological organ. Public databases do not hold enough transcriptomic and genomic data to understand the molecular mechanisms underlying tuberous root formation and development. High-throughput transcriptome sequencing is therefore needed to generate large numbers of transcript sequences from sweet potato root for gene discovery and molecular marker development.

Results: In this study, more than 59 million sequencing reads were generated using Illumina paired-end sequencing technology. De novo assembly yielded 56,516 unigenes with an average length of 581 bp. Based on sequence similarity searches against known proteins, 35,051 unigenes (62.02%) were annotated. Of these annotated unigenes, 5,046 and 11,983 were assigned to Gene Ontology (GO) terms and Clusters of Orthologous Groups (COG) categories, respectively. A search against the Kyoto Encyclopedia of Genes and Genomes (KEGG) Pathway database indicated that 17,598 unigenes (31.14%) mapped to 124 KEGG pathways, and 11,056 were assigned to metabolic pathways, which were well represented by carbohydrate metabolism and biosynthesis of secondary metabolites. In addition, 4,114 cDNA SSRs (cSSRs) were identified in the unigenes as potential molecular markers. One hundred pairs of PCR primers were designed and used to validate amplification and assess polymorphism in genomic DNA pools. In initial screening tests, 92 of these primer pairs amplified successfully.
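The cSSR identification step can be illustrated with a minimal Python sketch that scans a sequence for perfect tandem repeats. The abstract does not name the tool or the repeat thresholds used, so the function name `find_ssrs`, the `MIN_REPEATS` table and the toy sequence are all assumptions made for illustration, not the authors' method.

```python
import re

# Assumed minimum repeat counts per motif length (2-6 bp). The abstract does
# not state the thresholds actually used; these values are illustrative only.
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # ([ACGT]{k}) captures a motif; \1{n,} requires at least n further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ATAT").
            if any(
                motif == motif[:k] * (motif_len // k)
                for k in range(1, motif_len)
                if motif_len % k == 0
            ):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // motif_len))
    return sorted(hits)


# Hypothetical toy sequence, not taken from the sweet potato data set.
toy_unigene = "GGC" + "AG" * 7 + "TTCA" + "CTT" * 5 + "GA"
print(find_ssrs(toy_unigene))  # [(3, 'AG', 7), (21, 'CTT', 5)]
```

A regular expression with a backreference keeps the sketch short; a dedicated SSR finder would normally also handle compound and imperfect repeats, which this example ignores.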

Conclusion: This study generated a substantial fraction of sweet potato transcript sequences, which can be used to discover novel genes associated with tuberous root formation and development and will make it possible to construct high-density microarrays for further characterization of gene expression profiles during these processes. The thousands of cSSR markers identified in the present study enrich the pool of available molecular markers and will facilitate marker-assisted selection in sweet potato breeding. Overall, these sequences and markers provide valuable resources for the sweet potato community. These results also suggest that transcriptome analysis based on Illumina paired-end sequencing is a powerful tool for gene discovery and molecular marker development in non-model species, especially those with large and complex genomes.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Arabidopsis / genetics
  • Base Sequence
  • DNA, Complementary / genetics*
  • Databases, Genetic
  • Gene Expression Profiling / methods*
  • Gene Expression Regulation, Plant
  • Genes, Plant / genetics
  • Genetic Markers
  • Ipomoea batatas / genetics*
  • Minisatellite Repeats / genetics*
  • Molecular Sequence Annotation
  • Molecular Sequence Data
  • Plant Roots / genetics*
  • Regulatory Sequences, Nucleic Acid / genetics
  • Sequence Analysis, DNA / methods*

Substances

  • DNA, Complementary
  • Genetic Markers