Abstract
Energy functions are fundamental to biomolecular modeling. Their success depends on robust physical formalisms, efficient optimization, and high-resolution data for training and validation. Over the past 20 years, progress in each area has advanced energy functions for soluble proteins. Energy functions for membrane proteins, however, lag behind because the available data are sparse and low in quality, which leads to overfit tools. To overcome this challenge, we assembled a suite of 12 tests on independent datasets that vary in size, diversity, and resolution. The tests probe an energy function’s ability to capture membrane protein orientation, stability, sequence, and structure. Here, we present the tests and demonstrate them with the franklin2019 energy function. We then present a vision for transforming these “small” datasets into “big data” that can support more sophisticated energy function optimization. The tests are available through the Rosetta Benchmark Server (https://benchmark.graylab.jhu.edu/) and GitHub (https://github.com/rfalford12/Implicit-Membrane-Energy-Function-Benchmark).
Competing Interest Statement
Dr. Gray is an unpaid board member of the Rosetta Commons. Under institutional participation agreements between the University of Washington, acting on behalf of the Rosetta Commons, Johns Hopkins University may be entitled to a portion of revenue received from licensing Rosetta software, including programs described here. As a member of the Scientific Advisory Board, JJG has a financial interest in Cyrus Biotechnology. Cyrus Biotechnology distributes the Rosetta software, which may include methods described in this paper.