Abstract
Herpesviruses (HVs) have large genomes that can encode thousands of proteins. Apart from amino acid mutations, protein domain acquisitions, duplications and losses are also common modes of evolution. HV domain repertoires differ across species, and only a core set is shared among all viruses, which raises a question: how have HV domain repertoires diverged while retaining some similarities? To answer this question, we used profile hidden Markov models (HMMs) to search for domains in all possible translated open reading frames (ORFs) of fully sequenced HV genomes. Having identified at least 274 domains, we built a matrix of domain counts per species and applied a parsimony method to reconstruct the ancestral states of these domains along the HV phylogeny. The reconstruction revealed events of domain gain, duplication and loss over more than 400 million years, during which Alpha-, Beta- and Gammaherpesviruses expanded and condensed their domain repertoires at distinct rates. Most of the acquired domains perform ‘Modulation and Control’, ‘Envelope’ or ‘Auxiliary’ functions, categories that showed high flexibility (number of domains) and redundancy (number of copies). Conversely, few gains and duplications were observed for domains involved in ‘Capsid assembly and structure’ and ‘DNA Replication, recombination and metabolism’. Among the 41 primordial domains encoded by herpesvirus ancestors, 28 are still found in all present-day HVs. Because of their distinct evolutionary strategies, herpesviruses have domain repertoires that are highly specific at the subfamily, genus and species levels. Differences in domain composition may not only explain HV host range and tissue tropism, but also provide hints about the origins of herpesviruses.
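The count-matrix and ancestral-reconstruction steps can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the species names, domain names, hit list and three-taxon tree are all hypothetical, and Fitch's parsimony on presence/absence is used here only as a stand-in, since the abstract does not specify which parsimony method was applied or how copy numbers were treated.

```python
from collections import Counter

# Hypothetical profile-HMM hits as (species, domain) pairs; illustrative only.
hits = [
    ("virus_A", "dom1"), ("virus_A", "dom1"), ("virus_A", "dom2"),
    ("virus_B", "dom1"), ("virus_B", "dom2"),
    ("virus_C", "dom1"),
]

def count_matrix(hits):
    """Build {species: Counter(domain -> number of copies)} from raw hits."""
    matrix = {}
    for species, domain in hits:
        matrix.setdefault(species, Counter())[domain] += 1
    return matrix

# Toy rooted binary tree as nested tuples: ((virus_A, virus_B), virus_C).
tree = (("virus_A", "virus_B"), "virus_C")

def fitch(node, presence):
    """Bottom-up Fitch pass.

    Returns (candidate state set at this node, minimum number of changes
    in the subtree), where states are 0 (absent) or 1 (present).
    """
    if isinstance(node, str):  # leaf: observed state
        return {presence[node]}, 0
    left, right = node
    left_states, left_changes = fitch(left, presence)
    right_states, right_changes = fitch(right, presence)
    shared = left_states & right_states
    if shared:  # children agree: no extra change needed
        return shared, left_changes + right_changes
    # children disagree: keep both options and count one change
    return left_states | right_states, left_changes + right_changes + 1

matrix = count_matrix(hits)
results = {}
for domain in ("dom1", "dom2"):
    presence = {sp: int(matrix[sp][domain] > 0) for sp in matrix}
    results[domain] = fitch(tree, presence)
    print(domain, results[domain])
```

In this toy case the domain present in every species is inferred as present at the root with no changes, while the domain missing from one species leaves the root state ambiguous at the cost of a single gain or loss. Reconstructing duplications would require a method that handles copy numbers rather than presence/absence.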