%0 Journal Article
%A Ungaro Arnaud
%A Pech Nicolas
%A Martin Jean-François
%A McCairns R.J. Scott
%A Chappaz Rémi
%A Gilles André
%T Challenges and solutions for transcriptome assembly in non-model organisms with an application to hybrid specimens
%D 2016
%R 10.1101/084145
%J bioRxiv
%P 084145
%X Analyses of high-throughput transcriptome sequences of non-model organisms are based on three main approaches: de novo assembly, genome-guided assembly, or direct read to genome mapping (DGM). We describe a flexible DGM pipeline, and demonstrate its performance by using simulated reads of lengths corresponding to those generated by the most common sequencing platforms, and over a realistic range of genetic divergence. We also evaluate the performance of a combined pipeline (de novo + DGM) via simulation and empirically, using data from two hybridizing Cyprinid fish species. Finally, we explore the assignation of F1 hybrid reads to their parental species, and discuss the implications of erroneous assignations on gene expression studies. Our DGM pipeline recovers 94.8% of the genes irrespective of read length at 0% divergence; however, the assignation rate of reads is negatively impacted both by increasing divergence and by decreasing read length. Likewise, our combined de novo + DGM pipeline outperforms de novo analyses alone at all levels of divergence and read length.
%U https://www.biorxiv.org/content/biorxiv/early/2016/10/28/084145.full.pdf