Abstract
De novo transcriptome construction from short-read RNA-seq is a common method for reconstructing mRNA transcripts within a given sample. However, the precision of this process is unclear, as it is difficult to obtain a ground-truth measure of transcript expression. With advances in third-generation sequencing, full-length transcripts of whole transcriptomes can be accurately sequenced to generate a ground-truth transcriptome. We generated long-read PacBio and short-read Illumina RNA-seq data from a human induced pluripotent stem cell-derived retinal pigmented epithelium (iPSC-RPE) cell line. We use the long-read data to identify simple metrics for assessing de novo transcriptome construction and to optimize a short-read-based de novo transcriptome construction pipeline. We apply this pipeline to construct transcriptomes for 340 short-read RNA-seq samples originating from healthy adult and fetal human retina, cornea, and RPE. We identify hundreds of novel gene isoforms and examine their significance in the context of ocular development and disease.
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
The Figure 2 caption was expanded for clarity; the citation format was changed; the abstract and introduction were slightly modified to remove areas of ambiguity; the title was changed to reduce character count for journal submission.
https://github.com/vinay-swamy/ocular_transcriptomes_pipeline
https://github.com/vinay-swamy/ocular_transcriptomes_longread_analysis