Protein NMR structure determination with automated NOE assignment using the new software CANDID and the torsion angle dynamics algorithm DYANA

J Mol Biol. 2002 May 24;319(1):209-27. doi: 10.1016/s0022-2836(02)00241-3.

Abstract

The combined automated NOE assignment and structure determination module (CANDID) is a new software package for efficient NMR structure determination of proteins by automated assignment of NOESY spectra. CANDID uses an iterative approach with multiple cycles of NOE cross-peak assignment and protein structure calculation using the fast DYANA torsion angle dynamics algorithm, so that the result from each CANDID cycle consists of exhaustive, possibly ambiguous NOE cross-peak assignments in all available spectra and a three-dimensional protein structure represented by a bundle of conformers. The input for the first CANDID cycle consists of the amino acid sequence, the chemical shift list from the sequence-specific resonance assignment, and listings of the cross-peak positions and volumes in one or several two-, three- or four-dimensional NOESY spectra. The input for the second and subsequent CANDID cycles contains the three-dimensional protein structure from the previous cycle, in addition to the complete input used for the first cycle. CANDID includes two new elements, network-anchoring and constraint-combination, that make it robust with respect to artifacts in the input data and that have a key role in de novo protein structure determinations for the successful generation of the correct polypeptide fold by the first CANDID cycle. Network-anchoring makes use of the fact that any network of correct NOE cross-peak assignments forms a self-consistent set; the initial, chemical shift-based assignments for each individual NOE cross-peak are therefore weighted by the extent to which they can be embedded into the network formed by all other NOE cross-peak assignments.
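The idea behind network-anchoring can be illustrated with a deliberately simplified sketch. It does not reproduce the scoring function used in CANDID: the function name `network_support`, the min-based score, and the toy atom labels are all assumptions made for illustration. A candidate assignment between atoms a and b is counted as supported whenever a third atom g has candidate assignments to both a and b, so that an isolated assignment receives no support.

```python
from collections import defaultdict
from typing import Dict, Tuple

AtomPair = Tuple[str, str]


def network_support(candidates: Dict[AtomPair, float]) -> Dict[AtomPair, float]:
    """Toy network-anchoring score (hypothetical, not CANDID's formula).

    candidates maps an atom pair (a, b), with a != b, to an initial
    chemical-shift-based weight in [0, 1]. The returned score for (a, b)
    sums, over every third atom g linked to both a and b, the weaker of
    the two linking weights.
    """
    # Build a symmetric adjacency map: atom -> {partner atom: weight}.
    neighbors: Dict[str, Dict[str, float]] = defaultdict(dict)
    for (a, b), weight in candidates.items():
        neighbors[a][b] = weight
        neighbors[b][a] = weight

    support: Dict[AtomPair, float] = {}
    for (a, b) in candidates:
        # Third atoms that have candidate assignments to both a and b.
        shared = set(neighbors[a]) & set(neighbors[b])
        support[(a, b)] = sum(
            min(neighbors[a][g], neighbors[b][g]) for g in shared
        )
    return support


# Hypothetical candidates: A, B and C form a triangle; A-X is isolated.
example = {
    ("A", "B"): 1.0,
    ("A", "C"): 1.0,
    ("B", "C"): 1.0,
    ("A", "X"): 1.0,
}
scores = network_support(example)
```

In this toy input the assignment A-B is anchored through C, whereas A-X has no third atom in common and so receives a score of zero, which is the behavior one would expect for an artifact peak with an accidental chemical shift match.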
Constraint-combination reduces the deleterious impact of artifact NOE upper distance constraints in the input for a protein structure calculation by combining the assignments for two or several peaks into a single upper limit distance constraint, which lowers the probability that the presence of an artifact peak will influence the outcome of the structure calculation. CANDID test calculations were performed with NMR data sets of four proteins for which high-quality structures had previously been solved by interactive protocols, and they yielded results comparable to these reference structure determinations with regard to both the residual constraint violations and the precision and accuracy of the atomic coordinates. The CANDID approach has further been validated by de novo NMR structure determinations of four additional proteins. The experience gained in these calculations shows that once nearly complete sequence-specific resonance assignments are available, the automated CANDID approach results in greatly enhanced efficiency of the NOESY spectral analysis. The fact that the correct fold is obtained in cycle 1 of a de novo structure calculation is the single most important advance achieved with CANDID, when compared with previously proposed automated NOESY assignment methods that do not use network-anchoring and constraint-combination.
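Why combining peaks dilutes the effect of an artifact can be seen in a small sketch. It relies on several assumptions that are not stated in the abstract: the combined constraint is evaluated with the r^-6-summed effective distance commonly used for ambiguous distance constraints, the combined upper limit is taken as the larger of the two original limits, and artifacts are treated as independent events with probability p. The names `UpperLimit`, `combine`, `is_satisfied` and `artifact_probability`, as well as the atom labels and distances, are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, Tuple

AtomPair = Tuple[str, str]


@dataclass(frozen=True)
class UpperLimit:
    """An upper distance limit (in Angstrom) with one or more assignments."""
    pairs: FrozenSet[AtomPair]
    upper: float


def effective_distance(distances: Iterable[float]) -> float:
    """r^-6-summed distance; never larger than the shortest contribution."""
    return sum(d ** -6 for d in distances) ** (-1.0 / 6.0)


def is_satisfied(c: UpperLimit, dist: Dict[AtomPair, float]) -> bool:
    """Check a constraint against the atom-pair distances of a structure."""
    return effective_distance(dist[p] for p in c.pairs) <= c.upper


def combine(a: UpperLimit, b: UpperLimit) -> UpperLimit:
    """Merge two constraints into one (assumed rule: keep the looser limit)."""
    return UpperLimit(a.pairs | b.pairs, max(a.upper, b.upper))


def artifact_probability(p: float, n: int) -> float:
    """Chance that all n combined peaks are artifacts, assuming independence."""
    return p ** n


# Hypothetical distances taken from the correct structure.
distances = {("HA 12", "HN 13"): 3.0, ("HB 5", "HG 40"): 20.0}
real = UpperLimit(frozenset({("HA 12", "HN 13")}), 4.0)
artifact = UpperLimit(frozenset({("HB 5", "HG 40")}), 5.0)
combined = combine(real, artifact)
```

On its own, the artifact constraint is violated by the correct structure and would distort the calculation. After combination, the correct assignment alone satisfies the merged constraint, so the artifact no longer exerts a pull. Under the independence assumption, a combined constraint is entirely wrong only if every contributing peak is an artifact, so an artifact rate of 10% per peak would fall to 1% per two-peak constraint, at the cost of less restrictive distance information.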

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Animals
  • Bacterial Proteins / chemistry
  • Cattle
  • Killer Factors, Yeast
  • Models, Molecular
  • Molecular Chaperones*
  • Mycotoxins / chemistry
  • Nuclear Magnetic Resonance, Biomolecular / methods*
  • Peptide Fragments / chemistry
  • Plant Proteins / chemistry
  • Prions / chemistry
  • Protein Conformation
  • Proteins / chemistry*
  • Software*
  • Trans-Activators / chemistry

Substances

  • Bacterial Proteins
  • CopZ protein, Enterococcus hirae
  • Killer Factors, Yeast
  • Molecular Chaperones
  • Mycotoxins
  • PR1B1 protein, Lycopersicon esculentum
  • Peptide Fragments
  • Plant Proteins
  • Prions
  • Proteins
  • Trans-Activators
  • prion protein (121-231)