TY  - JOUR
T1  - Genome Wide Variant Analysis of Simplex Autism Families with an Integrative Clinical-Bioinformatics Pipeline
JF  - bioRxiv
DO  - 10.1101/019208
SP  - 019208
AU  - Laura T. Jiménez-Barrón
AU  - Jason A. O’Rawe
AU  - Yiyang Wu
AU  - Margaret Yoon
AU  - Han Fang
AU  - Ivan Iossifov
AU  - Gholson J. Lyon
Y1  - 2015/01/01
UR  - http://biorxiv.org/content/early/2015/05/11/019208.abstract
N2  - Autism spectrum disorders (ASD) are a group of developmental disabilities that affect social interaction and communication and are characterized by repetitive behaviors. There is now a large body of evidence that suggests a complex role of genetics in ASD, in which many different loci are involved. Although many current population-scale genomic studies have been demonstrably fruitful, these studies generally focus on analyzing a limited part of the genome or use a limited set of bioinformatics tools. These limitations preclude the analysis of genome-wide perturbations that may contribute to the development and severity of ASD-related phenotypes. To overcome these limitations, we have developed and utilized an integrative clinical and bioinformatics pipeline for generating a more complete and reliable set of genomic variants for downstream analyses. Our study focuses on the analysis of three simplex autism families, each consisting of one affected child, unaffected parents, and one unaffected sibling. All members were clinically evaluated and extensively phenotyped. Genotyping arrays and whole genome sequencing were performed on each member, and the resulting sequencing data were analyzed using a variety of available bioinformatics tools. We searched for rare variants of putative functional impact that segregated according to de novo, autosomal recessive, X-linked, mitochondrial and compound heterozygote transmission models.
The resulting candidate variants included three small heterozygous CNVs, a rare heterozygous de novo nonsense mutation in MYBBP1A located within exon 1, and a novel de novo missense variant in LAMB3. Our work demonstrates how more comprehensive analyses that include rich clinical data and whole genome sequencing data can generate reliable results for use in downstream investigations. We are moving to implement our framework for the analysis and study of larger cohorts of families, where statistical rigor can accompany genetic findings. Abbreviations: SNP, single-nucleotide polymorphism; CNV, copy number variation; INDELs, insertions and deletions; SV, structural variant; WGS, whole genome sequencing; WES, whole exome sequencing; NGS, next-generation sequencing; bp, base pair; Kb, kilobase pairs; Mb, megabase pairs; PCR, polymerase chain reaction.
ER  -