PT - JOURNAL ARTICLE
AU - Sankar Basu
AU - Fredrik Söderquist
AU - Björn Wallner
TI - Proteus: A Random Forest Classifier that Predicts Disorder-to-Order Transitioning Binding Regions in Intrinsically Disordered Proteins
AID - 10.1101/080788
DP - 2016 Jan 01
TA - bioRxiv
PG - 080788
4099 - http://biorxiv.org/content/early/2016/10/14/080788.short
4100 - http://biorxiv.org/content/early/2016/10/14/080788.full
AB - The focus of the computational structural biology community has shifted dramatically over the past one and a half decades from the classical protein structure prediction problem to the understanding of intrinsically disordered proteins (IDPs) or proteins containing regions of disorder (IDPRs). The current interest lies in unraveling a disorder-to-order transitioning code embedded in the amino acid sequences of IDPs, overtaking the well-established sequence-to-structure paradigm. Disordered proteins are characterized by an enormous amount of structural plasticity, which makes them promiscuous in binding to different partners, multi-functional in cellular activity, and atypical in folding energy landscapes, resembling partially folded molten globules. Also, their involvement in several human diseases, including cancer, cardiovascular, and neurodegenerative diseases, makes them both attractive as drug targets and important for a biochemical understanding of the diseases. The study of the structural ensemble of IDPs is rather difficult, in particular for transient interactions. When bound to a partner, the IDPRs adopt an ordered structure in the complex. The residues that undergo this disorder-to-order transition are called protean residues, and the first step in understanding the interaction with a disordered partner would be to predict the residues that are responsible for the interaction and will undergo the disorder-to-order transition, i.e. the protean residues.
A few available methods predict these protean segments from their amino acid sequences; however, their performance as reported in the literature leaves clear room for improvement. Against this background, the current study presents 'Proteus', a random-forest-based protean predictor that predicts the likelihood of a residue undergoing a disorder-to-order transition upon binding to a partner protein. The prediction is based on features that can be calculated from the amino acid sequence alone. Proteus compares favorably with existing methods, predicting twice as many true positives as the second-best method (55% vs. 27%) at a much higher precision on an independent data set. The current study also sheds some light on a possible 'disorder-to-order' transitioning consensus, untangled, yet embedded in the amino acid sequence of IDPs. Some guidelines are also suggested for real-life structural modeling of an IDPR using Proteus.