Abstract
Despite the crucial role played by the immunoglobulin heavy variable (IGHV) and T cell receptor beta variable (TRBV) loci in adaptive immune function, their genetic variation in the human population remains poorly characterized. Generated through a process of gene duplication, deletion, and diversification, these loci can vary extensively between individuals in copy number and contain genes that are highly similar to one another. These characteristics make identifying and analyzing variation in these loci technically challenging. Here, we present a comprehensive study of the functional gene segments in the IGHV and TRBV loci, quantifying their copy number and single nucleotide variation in a globally diverse sample of 109 (IGHV) and 286 (TRBV) humans. We find that, despite the molecular and functional characteristics shared by the IGHV and TRBV gene families, they exhibit starkly different patterns of variation. In particular, we estimate that there are hundreds of copy number haplotypes in the IGHV locus (that is, haplotypes that differ in the number of functional gene segments), while the TRBV locus has only a few. We also find that the TRBV locus has a greater or at least equal propensity to mutate, as evidenced by greater single nucleotide variation, compared to the IGHV locus. This is consistent with the observation that, across multiple species, the TRBV gene family is more diverse than the IGHV gene family. Besides suggesting that the IGHV and TRBV loci have taken different paths over their evolutionary history, these results indicate that adaptive immune repertoire sequencing and analysis may be much more challenging for immunoglobulins than for T cell receptors.