Abstract
Proteins are the chief effectors of cell biology and their functions are typically carried out in the context of multi-protein assemblies; large collections of such interacting protein assemblies are often referred to as interactomes. Knowing the constituents of protein complexes is therefore important for investigating their molecular biology. Many experimental methods are capable of producing data of use for detecting and inferring the existence of physiological protein complexes. Each method has associated pros and cons, affecting the potential quality and utility of the data. Numerous informatic resources exist for the curation, integration, retrieval, and processing of protein interactions data. While each resource may possess different merits, none are definitive and few are wieldy, potentially limiting their effective use by non-experts. In addition, contemporary analyses suggest that we may still be decades away from a comprehensive map of a human protein interactome. Taken together, we are currently unable to maximally impact and improve biomedicine from a protein interactome perspective – motivating the development of experimental and computational techniques that help investigators to address these limitations. Here, we present a resource intended to assist investigators in (i) navigating the cumulative knowledge concerning protein complexes and (ii) forming hypotheses concerning protein interactions that may yet lack conclusive evidence, thus (iii) directing future experiments to address knowledge gaps. To achieve this, we integrated multiple data-types/different properties of protein interactions from multiple sources and after applying various methods of regularization, compared the protein interaction networks computed to those available in the EMBL-EBI Complex Portal, a manually curated, gold-standard catalog of macromolecular complexes. As a result, our resource provides investigators with reliable curation of bona fide and candidate physical interactors of their protein or complex of interest, prompting due scrutiny and further validation when needed. We believe this information will empower a wider range of experimentalists to conduct focused protein interaction studies and to better select research strategies that explicitly target missing information.