Twelve years of SAMtools and BCFtools

P Danecek, JK Bonfield, J Liddle, J Marshall… - …, 2021 - academic.oup.com
Background SAMtools and BCFtools are widely used programs for processing and analysing
high-throughput sequencing data. They include tools for file format conversion and …

The variant call format and VCFtools

P Danecek, A Auton, G Abecasis, CA Albers… - …, 2011 - academic.oup.com
The variant call format (VCF) is a generic format for storing DNA polymorphism data such as
SNPs, insertions, deletions and structural variants, together with rich annotations. VCF is …

Mouse genomic variation and its effect on phenotypes and gene regulation

TM Keane, L Goodstadt, P Danecek, MA White… - Nature, 2011 - nature.com
We report genome sequences of 17 inbred strains of laboratory mice and identify almost ten
times more variants than previously known. We use these genomes to explore the …

Reference-based phasing using the Haplotype Reference Consortium panel

PR Loh, P Danecek, PF Palamara, C Fuchsberger… - Nature …, 2016 - nature.com
… All differences were significant (binomial P ≤ 0.003 for each comparison of Eagle2 with
SHAPEIT2, P ≤ 0.02 for each comparison with Eagle1, and P = 10⁻²¹ for each comparison …

Evidence for 28 genetic disorders discovered by combining healthcare and research data

…, JF McRae, PJ Short, RI Torene, E de Boer, P Danecek… - Nature, 2020 - nature.com
P values from analysis of the undiagnosed subset for discordant and novel genes; P values
for consensus genes come from the full cohort analysis. The number of genes in each P-…

Whole‐genome sequencing identifies EN1 as a determinant of bone density and fracture

…, K Trajanoska, Y Memari, J Min, J Huang, P Danecek… - Nature, 2015 - nature.com
The extent to which low‐frequency (minor allele frequency (MAF) between 1–5%) and rare (MAF
≤ 1%) variants contribute to complex traits and disease in the general population is …

Common genetic variation drives molecular heterogeneity in human iPSCs

…, D Bensaddek, FP Casale, OJ Culley, P Danecek… - Nature, 2017 - nature.com
… This was defined as the number of control variant sets (n) that showed a higher overlap with
the target annotation than the eQTL lead variants (P = n / 100). The empirical P values were …

Insights into human genetic variation and population history from 929 diverse genomes

…, Q Ayub, P Danecek, Y Chen, S Felkel, P Hallast… - Science, 2020 - science.org
INTRODUCTION Large-scale human genome-sequencing studies to date have been limited
to large, metropolitan populations or to small numbers of genomes from each group. Much …

EIDA: The European integrated data archive and service infrastructure within ORFEUS

…, D Cambaz, J Clinton, P Danecek… - Seismological …, 2021 - pubs.geoscienceworld.org
The European Integrated Data Archive (EIDA) is the infrastructure that provides access to
the seismic‐waveform archives collected by European agencies. This distributed system is …

A comparative phenotypic and genomic analysis of C57BL/6J and C57BL/6N mouse strains

…, H Cater, MF Champy, R Combe, P Danecek… - Genome biology, 2013 - Springer
Background The mouse inbred line C57BL/6J is widely used in mouse genetics and its
genome has been incorporated into many genetic reference populations. More recently large …