Abstract
The identification of stabilizing amino acid substitutions in proteins is a key challenge in protein engineering. Advances in biotechnology have enabled thousands of protein variants to be assayed in a single high-throughput experiment, and more recent studies have used such data in protein engineering. We present Global Multi-Mutant Analysis (GMMA), a method that exploits the presence of multiply-substituted variants to identify individual substitutions that stabilize the functionally relevant state of a protein. GMMA identifies substitutions that are stabilizing in different sequence contexts and that may therefore be combined to achieve improved stability. We have applied GMMA to >54,000 variants of green fluorescent protein (GFP), each carrying 1-15 amino acid substitutions. The method is transparent, with a physical interpretation of the estimated parameters and their associated uncertainties. We show that, using only this single experiment as input, GMMA is able to identify nearly all of the substitutions previously reported to be beneficial for GFP folding and function.
Competing Interest Statement
The authors have declared no competing interest.