Abstract
Genetic variation in populations of Middle Eastern origin remains highly underrepresented in most comprehensive genomic databases. This underrepresentation hampers functional annotation of the human genome and complicates accurate clinical variant interpretation. To highlight the importance of capturing genetic variation in the Middle East, we aggregated whole-exome and whole-genome sequencing data from 2,116 individuals in the Middle East and established the Middle East Variation (MEV) database. Of the high-impact coding variants in this database, 34% were absent from the Genome Aggregation Database (gnomAD), the most comprehensive such resource, and thus represent unique Middle Eastern variation that might directly affect clinical variant interpretation. We highlight 167 variants with a minor allele frequency (MAF) >1% in the MEV database that were previously reported as rare disease variants in ClinVar and the Human Gene Mutation Database (HGMD). Furthermore, the MEV database contained 365 homozygous loss-of-function (LoF) variants, the majority of which (239/365, 65.5%) were absent from gnomAD, representing complete knockouts of 229 unique genes in reportedly healthy individuals. Intriguingly, 58 of those genes have several clinically significant variants reported in ClinVar and HGMD. Our study shows that capturing genetic variation in the Middle East improves functional annotation and clinical interpretation of the genome, and it emphasizes the need to expand sequencing studies in the Middle East and other underrepresented populations.
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
Conflicts of Interest: The authors declare no conflicts of interest.