TY  - JOUR
T1  - Protein Collapse is Encoded in the Folded State Architecture
JF  - bioRxiv
DO  - 10.1101/070920
SP  - 070920
AU  - Himadri S. Samanta
AU  - Pavel I. Zhuravlev
AU  - Michael Hinczewski
AU  - Naoto Hori
AU  - Shaon Chakrabarti
AU  - D. Thirumalai
Y1  - 2016/01/01
UR  - http://biorxiv.org/content/early/2016/08/22/070920.abstract
N2  - The propensity of single domain globular proteins, the workhorses in cells, to be compact is the key reason that their folded states achieve high packing density. It is known that the radius of gyration, Rg, of both the folded and unfolded (created by adding denaturants) states increases as N^ν, where N is the number of amino acids in the protein. The values of the celebrated Flory exponent ν are, respectively, ≈ ⅓ and ≈ 0.6 in the folded and unfolded states, which coincide with those found in homopolymers in poor and good solvents. However, the extent of compaction of the unfolded state of a protein under low denaturant concentration, conditions favoring the formation of the folded state, is unknown. This problem, which goes to the heart of how proteins fold and has implications for the evolution of foldable sequences, is unsolved. We develop a theory based on polymer physics concepts that uses the contact map of proteins as input to quantitatively assess collapsibility of proteins. The model, which includes only two-body excluded volume interaction and interactions reflecting the strength of the contact map, has only expanded and compact states. Surprisingly, we find that although protein collapsibility is universal, the propensity to be compact depends on the protein architecture. Application of the theory to over two thousand proteins shows that the extent of collapsibility depends not only on N but also on the contact map reflecting the native fold structure. A major prediction of the theory is that β-sheet proteins are far more collapsible than structures dominated by α-helices.
The theory fully resolves the apparent controversy between conclusions reached using different experimental probes assessing the extent of compaction of a couple of proteins. In addition, it reveals that there are considerable similarities between the physical mechanisms of homopolymer and protein collapse. The theory provides quantitative insights into the reasons why single domain proteins are small and the physical reasons for the origin of multi-domain proteins. We also show that non-coding RNA molecules, whose collapsibility is similar to proteins with β-sheet structures, must undergo collapse prior to folding, adding support to the “Compaction Selection Hypothesis” proposed in the context of RNA compaction.
ER  -