  • Steffen Lindert
  • Tommy Hofmann
  • Nils Woetzel
  • Jens Meiler

We developed an algorithm, EM-Fold, that predicts secondary structure elements (SSEs) from the primary sequence of the protein and folds them into regions of the density map identified as secondary structure. For the purpose of this challenge we assumed that the provided density maps had been segmented into smaller maps that each contain only a single protein chain. We first identified the positions of secondary structure elements in the map. EM-Fold (S. Lindert et al., Structure 17, 990 (Jul 15, 2009)) was then used to assemble the predicted SSEs into these density rods and to refine their positions within the rods. Finally, Rosetta (F. DiMaio et al., J Mol Biol 392, 181 (Sep 11, 2009)) was used to build loops and side chains and to further refine the best-scoring EM-Fold models. For this, a three-step iterative protocol was used that identifies the regions that agree least with the density and then rebuilds the coordinates for these regions. Due to time constraints for the submission of initial results, we submitted the Rosetta models obtained after the first of the three iterative refinement steps. We report rmsds of our models to the native chain, both over the full-length protein and over the helical residues only.
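The two rmsd values reported here differ only in which residues are included. The following is a minimal sketch of that calculation, not the code used for the submission. It assumes one representative atom per residue (for example C-alpha), matched residue order in model and native, and no superposition, since both structures are taken to sit in the frame of the density map. The function name `rmsd`, the `residues` argument, and the toy coordinates are all hypothetical.

```python
import math


def rmsd(model, native, residues=None):
    """Root-mean-square deviation between two matched coordinate lists.

    model, native: lists of (x, y, z) tuples, one per residue, same order.
    residues: optional iterable of indices restricting the calculation,
              e.g. the residues that belong to helices or strands.
    Assumption: no superposition is applied; both structures are taken
    to be in the coordinate frame of the density map already.
    """
    if len(model) != len(native):
        raise ValueError("model and native must have the same length")
    indices = list(range(len(model))) if residues is None else list(residues)
    if not indices:
        raise ValueError("no residues selected")
    total = 0.0
    for i in indices:
        # Squared distance between the matched atoms of residue i.
        total += sum((a - b) ** 2 for a, b in zip(model[i], native[i]))
    return math.sqrt(total / len(indices))


# Toy data (hypothetical, not from the submission): four residues,
# with the last one displaced by 2 Å along x in the model.
native = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (8.0, 0.0, 0.0), (12.0, 0.0, 0.0)]
model = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (8.0, 0.0, 0.0), (14.0, 0.0, 0.0)]
sse_residues = [0, 1, 2]  # pretend the first three residues form a helix

full_rmsd = rmsd(model, native)               # over all residues
sse_rmsd = rmsd(model, native, sse_residues)  # over SSE residues only
print(f"full length: {full_rmsd:.2f} A, SSE only: {sse_rmsd:.2f} A")
```

In this toy case the only deviation lies outside the selected helix, so the rmsd over the SSE subset is lower than the full-length value. Poorly built loop regions raise the full-length rmsd in the same way while leaving the SSE-only value unchanged.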

We worked on the map of the Thermus thermophilus 70S ribosome (6.4 Å resolution, emd_5030.map). We built a model for chain R of the corresponding structure 3FIN.pdb. This chain contains 117 residues, 4 distinct helices and two pronounced beta strands. Attached is our best model after the first round of Rosetta refinement. It exhibits an rmsd to native of 4.6 Å over the full-length protein and an rmsd of 3.7 Å over SSE residues only. Both values are good. The lower rmsd over SSE residues illustrates that EM-Fold correctly assembles regions of high SSE content, whereas Rosetta understandably has a very hard time building large missing regions of the protein, some of which contain smaller SSEs that we did not pick up.


