TY  - JOUR
T1  - Retroviruses integrate into a shared, non-palindromic motif
JF  - bioRxiv
DO  - 10.1101/034991
SP  - 034991
AU  - Paul D. W. Kirk
AU  - Maxime Huvet
AU  - Anat Melamed
AU  - Goedele N. Maertens
AU  - Charles R. M. Bangham
Y1  - 2015/01/01
UR  - http://biorxiv.org/content/early/2015/12/20/034991.abstract
N2  - Palindromic consensus nucleotide sequences are found at the genomic integration sites of retroviruses and other transposable elements. It has been suggested that the palindromic consensus arises as a consequence of structural symmetry in the integrase complex, but the precise mechanism has yet to be elucidated. Here we perform a statistical analysis of large datasets of HTLV-1 and HIV-1 integration sites. The results show that the palindromic consensus sequence is not present in individual integration sites, but appears to arise in the population average as a consequence of the existence of a non-palindromic nucleotide motif that occurs in approximately equal proportions on the plus-strand and the minus-strand of the host genome. We demonstrate that palindromic probability position matrices are characteristic of such situations. We develop a generally applicable algorithm to sort the individual integration site sequences into plus-strand and minus-strand subpopulations. We apply this algorithm to identify the respective integration site nucleotide motifs of five retroviruses of different genera: HTLV-1, HIV-1, MLV, ASLV, and PFV. The results reveal a non-palindromic motif that is shared between these retroviruses.
ER  - 