Fitting Scores
When fitting a molecular structure (model) into a high-resolution map (e.g. better than 10 Å), as shown in the Fitting To Segments tutorial, visual inspection often makes it clear whether a fit is good: elements such as alpha helices and beta sheets in the model should match the higher-density regions in the map.
At lower resolutions, validation by visual inspection alone is harder. To evaluate whether a fit is good in such cases, we can look at several scores:
- Density cross-correlation: This score is computed by first simulating a density map from the model at approximately the same resolution as the target density map. (The Fitting To Segments tutorial shows how this simulated map is generated and used during the fitting process.) The score is the sum, over all positions, of the density value in the simulated map multiplied by the density value at the same position in the target map.
- Atom inclusion score: This score reflects how many atoms in the model are located at a position having density higher than a given density level.
- Clashes within symmetric arrangements: This score reflects how many atoms in the model clash with atoms from symmetric copies of the same model, in maps that have some type of symmetry. This score could also reflect clashes with other models fitted simultaneously within the map (e.g. as done by Multifit, or Chimera). Here the simpler case of fitting one structure at a time is considered.
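The cross-correlation described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the code used by the Fit to Segments dialog: both maps are assumed to be already sampled on the same grid and flattened into plain lists, and the function names are hypothetical. A normalized variant is included because dividing by the norms of the two maps is a common way to make scores comparable between maps; whether the dialog reports a normalized value is not stated here.

```python
import math


def cross_correlation(simulated, target):
    """Sum of products of simulated and target densities at matching grid points."""
    if len(simulated) != len(target):
        raise ValueError("maps must be sampled on the same grid")
    return sum(s * t for s, t in zip(simulated, target))


def normalized_cross_correlation(simulated, target):
    """Same sum divided by the norms of both maps (assumed variant, range [-1, 1])."""
    norm_s = math.sqrt(sum(s * s for s in simulated))
    norm_t = math.sqrt(sum(t * t for t in target))
    if norm_s == 0 or norm_t == 0:
        return 0.0  # an empty map correlates with nothing
    return cross_correlation(simulated, target) / (norm_s * norm_t)


# Hypothetical density values at four grid points.
simulated_map = [0.0, 0.5, 1.0, 0.5]
target_map = [0.1, 0.4, 0.9, 0.6]
print(cross_correlation(simulated_map, target_map))
print(normalized_cross_correlation(simulated_map, target_map))
```

The unnormalized sum grows with the overall density in the region, which is one reason a structure that drifts into a denser part of the map can score well without being correctly placed.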
Example: GroEL+GroES
As an example, we will use the density map of GroEL at 23.5 Å resolution. This map can be downloaded from this link.
- Segmentation
The map was segmented using the Segment Map dialog at a threshold of 0.03, with 3 steps of size 1. The result is 35 regions. The regions were further grouped interactively to produce 21 regions. The density map is shown below on the left, the initial segmentation is shown in the middle, and the regions after grouping are shown on the right.
- Fitting
- Chain A from 1gru.pdb was separated out from the rest of the structure.
- With the structure of just chain A selected in the "Structure to fit" field in the Fit to Segments dialog, Density Resolution set to 23 Å, and a single region selected as shown below in the image on the left, pressing the Fit button produced the results shown in the image in the middle. Because the optimization procedure is enabled by default, the structure has drifted towards the higher densities in the middle of the map. Given the structure of the entire complex, this is clearly not the right fit, yet the cross-correlation score alone might suggest otherwise.
- With the option "Mask map with region to prevent large drifts" selected, the results after fitting are shown below in the image on the right. In this mode, the fit with the best cross-correlation score does reproduce the right fit.
- This fit shows up as an entry in the fits list just below the "Structure to fit" field (in the Fit to Segments dialog). After doing multiple fits, clicking on this entry will put the structure back in the corresponding position in the map, as long as the structure itself and the map are still open in your Chimera session.
- Next to the option "Clashes with copies from symmetry:", pressing the "Show" button will do two things:
- Detect the symmetry of the map using the Chimera measure symmetry command.
- Place copies of the structure being fit based on this symmetry. In this case, the detected symmetry string "C7" is displayed, and the copies will be placed as shown below.
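To illustrate what placing copies under C7 symmetry means geometrically, here is a minimal sketch. It assumes the symmetry axis is the z axis through the origin, which is a simplification: in practice the axis and center come from the symmetry detection step. All function names are hypothetical.

```python
import math


def cyclic_rotations(n):
    """Return n 3x3 rotation matrices about the z axis, spaced 360/n degrees apart."""
    matrices = []
    for k in range(n):
        angle = 2.0 * math.pi * k / n
        c, s = math.cos(angle), math.sin(angle)
        matrices.append([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return matrices


def apply_rotation(matrix, point):
    """Multiply a 3x3 matrix by an (x, y, z) point."""
    return tuple(sum(matrix[i][j] * point[j] for j in range(3)) for i in range(3))


def symmetric_copies(atoms, n):
    """Return n lists of atom positions; the first list is the unrotated original."""
    return [[apply_rotation(m, p) for p in atoms] for m in cyclic_rotations(n)]


# Two hypothetical atom positions, copied around a 7-fold axis.
copies = symmetric_copies([(40.0, 0.0, 5.0), (42.0, 3.0, 6.0)], 7)
print(len(copies))
```

For C7, the six non-identity copies are the ones against which clashes are later counted.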
- So far, only one fit score is displayed in the fits list, the cross-correlation. To compute the other scores, and to be able to compare them, make sure the following options are checked:
- Alignment method: Rotational search
- Add all fits to list
- Compute atom inclusion score
- Clashes with symmetry
- The rotational search method is used instead of the faster alternative because it generates more fits, and hence more fit scores, for the structure inside the map. This makes it possible to compare the best fit scores to the other fit scores. Since the structure is still fit to the map by aligning it to the same region in different orientations, the fit scores all come from the same area of the map rather than from all over it. This avoids problems due to density heterogeneity, and it avoids comparisons with similar scores from other regions of the map that arise from symmetry.
- With all these options checked, pressing the Fit button will, after some computation time, produce a fits list containing 7 entries, as shown below.
- Note that the number of fits generated was actually much higher, but after optimization, only 7 unique fits were identified. This optimization actually helps to make the scores more distinct from each other. Without optimization enabled, there would be more fits, and more scores, with the scores being much more similar. Without optimization, it's also typically harder to identify the best fit from scores alone.
- Each fit entry now also contains, aside from the cross-correlation score:
- All-atom inclusion score (At Incl column heading)
- Backbone-inclusion score (BB Incl)
- Clashes with symmetric copies score (Clashes)
- The scores above are all given as fractions. For atom inclusion, the score is the number of atoms inside the density map (above the current density threshold, set in the Volume Viewer dialog) divided by the total number of atoms. For clashes with symmetric copies, the score is the number of atoms that are not within 3 Å of an atom in a symmetric copy divided by the total number of atoms, so a higher value means fewer clashes.
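The two fractions defined above can be sketched as follows. This is an illustration only, with hypothetical function names: the density at each atom position is assumed to have been looked up in the map beforehand, and the clash check is a brute-force comparison of every atom against every atom of the symmetric copies, which is simple but slow for large structures.

```python
import math


def atom_inclusion(densities_at_atoms, threshold):
    """Fraction of atoms whose local map density is above the threshold."""
    inside = sum(1 for d in densities_at_atoms if d > threshold)
    return inside / len(densities_at_atoms)


def clash_free_fraction(atoms, copy_atoms, cutoff=3.0):
    """Fraction of atoms with no symmetric-copy atom within `cutoff` angstroms."""
    free = 0
    for a in atoms:
        # An atom counts as clash-free only if every copy atom is farther than the cutoff.
        if all(math.dist(a, b) > cutoff for b in copy_atoms):
            free += 1
    return free / len(atoms)


# Hypothetical values: four atoms, threshold 0.03 as used for the segmentation.
print(atom_inclusion([0.01, 0.05, 0.04, 0.02], 0.03))
print(clash_free_fraction([(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)], [(1.0, 0.0, 0.0)]))
```

Computing the backbone inclusion score would follow the same pattern, restricted to the densities at backbone atoms.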
- The entries in the fit list are still sorted by cross-correlation alone rather than by any of the other scores. To better compare the scores, you can export them to a text file by choosing "Export fit scores" from the Fit menu at the top of the Fit to Segments dialog.
- The text file will have 4 columns, with each score in a different column, and each row representing a different fit.
- Z-scores are included on the last row. The Z-score for a column compares the top score in that column to the scores of the other fits: it is the number of standard deviations by which the top score exceeds the average of the other scores. Higher Z-scores may signify higher confidence in the best fit. Low Z-scores signify that the fits have similar scores regardless of position and orientation inside the map, and hence that the best fit is not significantly better.
- Each score is plotted with bar graphs in the images below.
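The Z-score described above can be sketched as follows for one column of scores. The function name is hypothetical, and the use of the population standard deviation of the other scores is an assumption; the exported file may use a different convention (for example the sample standard deviation), which would change the values slightly.

```python
import statistics


def top_fit_z_score(scores):
    """Standard deviations by which the top score exceeds the mean of the others."""
    top = max(scores)
    others = list(scores)
    others.remove(top)  # drop a single occurrence of the top score
    if len(others) < 2:
        raise ValueError("need at least two other fits to estimate the spread")
    spread = statistics.pstdev(others)  # assumption: population standard deviation
    if spread == 0:
        raise ValueError("other scores are identical; Z-score is undefined")
    return (top - statistics.mean(others)) / spread


# Hypothetical scores for one column (not values from the tutorial).
print(top_fit_z_score([10, 2, 4]))
```

A column in which one fit stands far above a tight cluster of other fits yields a large Z-score, while a column of similar values yields a small one.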
- The Z-scores are:
- Cross-correlation: 4.714
- All-atom inclusion score: 2.045
- Backbone-inclusion score: 2.002
- Clashes with symmetric copies score: 0.910
- Note that the fit with highest cross-correlation score also has the highest atom inclusion scores. However, it doesn't have the highest Clash score. Find this fit in the list and show the symmetric copies using the "Show" button to see why. Hint:
- Ultimately, all scores can be considered, separately or together, when deciding which fit is the best.
Last updated July 14, 2011
