Fitting to Segments
The fitting methods used by Segger align a structure to single segmented regions or to small groups of segmented regions, so the fitting process starts with segmenting the map. If you haven't already, start by looking at the Segmentation page.
Segmenting the map
As an example, we will use the density map of GroEL at 4.2Å resolution. This map can be downloaded from this link.
Here it is segmented using the Segment Map dialog at a threshold of 0.9, with 2 smoothing steps of size 7. The result is 14 regions, each corresponding to a single protein. The density map and segmented regions are shown below.
Side note: the segmentation threshold does not directly influence the number of regions obtained. At other thresholds, which show only the GroEL complex and not too much of the surrounding area, the same number of regions can be obtained by the smoothing and grouping method.
Side note: for this example we used large smoothing steps intentionally. With smaller steps, the segmented regions were less accurate, most likely because of noise in the density map. Applying more smoothing in the first step helps suppress this noise.
The Fit to Segments Dialog
The interface for fitting structures to segments can be opened from the Volume menu in Chimera, under Volume Data / Fit To Segments. This interface is shown below.
Next we'll obtain a structure of a single protein to fit to each segmented region shown above.
Obtaining the structure of a single protein
The structure of GroEL can be downloaded from PDB:1xck. After downloading this structure, for easy access, it should be placed in the same folder as the GroEL density map which we segmented above (emd_5001.mrc). It can then be opened from the Fit to Segments dialog, by clicking on the drop-down menu to the right of Structure to fit. This drop-down menu shows all PDB files in the same folder as the density map selected in the Segment Map dialog (which should be emd_5001.mrc). Selecting 1XCK.pdb in the Structure to fit field will open the structure in Chimera. Once done, you will see the structure appear in the main Chimera window, as well as the Model Panel dialog if you have it open.
Next we want to isolate the structure of a single protein from this structure of the entire complex:
- One way of doing this is to select a single chain of this structure, e.g. chain A, by Ctrl+clicking any one atom or part of a ribbon, and then pressing the Up key on the keyboard. Once a single protein is selected, it can be saved to an individual file using File / Save PDB... (make sure Save selected atoms only is checked in the dialog that appears, and enter a name, e.g. 1xck_A.pdb). Then use the drop-down menu to the right of Structure to fit once again to select and open this structure by itself.
- Alternatively, you can select all chains in the 1xck.pdb structure other than A, and execute the command del sel in the command line under the Chimera main window (you may have to enable this from the Settings panel if it's not already enabled).
Once the structure of a single protein is selected in the drop-down menu to the right of Structure to fit, its principal axes can be shown from the Fit menu at the top of the Fit to Segments dialog, by selecting Show molecule axes. The structure and its principal axes are shown in the image below.
Fitting the structure by aligning it to a region
We can now align this structure to a segmented region in the density map. One way to do this is using the principal-axes transform. To illustrate this process:
- make sure the segmentation of GroEL is visible in the main Chimera window,
- select a single region corresponding to any one of the proteins
- choose Show only selected from the Regions menu in the Segment Map dialog (this will hide all other regions but the one selected)
- choose Make transparent from the Regions menu in the Segment Map dialog (this will let us see through the region's surface)
- choose Show axes for selected from the Regions menu in the Segment Map dialog
Here are the images of the structure and the region side by side, with their principal axes shown:
These two images show that the principal axes of the structure and of the region are roughly the same. We can use this to quickly align the structure to the region by matching their centers and principal axes. This is what we mean by the principal-axes transform. Once the alignment is done, the Fit in Map method implemented in Chimera is run to refine the alignment and to produce the correct fit of the structure in the map.
To perform the alignment:
- make sure the structure is selected in the field to the right of Structure to fit in the Fit to Segments dialog
- press the Options button in the same dialog
- enter 5 to the right of Density map resolution:
- press the Fit button at the bottom of the same dialog.
Some notes:
- Showing the principal axes of the structure or the region is not required to complete the alignment process. They are computed automatically as needed. They are shown here for illustrative purposes.
- The principal axes in the image above are not pointing the same way. This is because the principal axes are the eigenvectors of a covariance matrix. An eigenvector gives the direction of each axis, but the sign of that direction is ambiguous. When performing the alignment, the signs are flipped to generate 4 possible transforms, and the alignment that gives the highest cross-correlation is kept. Only non-reflecting transforms are considered, i.e. transforms in which either none or two of the three axes are flipped. Any transform in which an odd number of axes is flipped results in a reflection.
- The alignment process here only achieves a rigid fit. A good fit therefore requires that the structure of the molecule being fit is the same in the cryo-EM and crystallographic states. This may not always be true; for example, some proteins adopt different conformations under different conditions. In such cases, a flexible-fitting method, such as Direx or MDFF, should be used.
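The sign ambiguity described in the notes above can be illustrated with a short Python sketch. This is not Segger's code: the function names and the example scores are hypothetical. It enumerates all eight sign combinations for three axes and keeps those whose product is +1, which are exactly the four that flip none or two of the axes.

```python
from itertools import product


def proper_axis_flips():
    """Return the sign triples (sx, sy, sz) that flip an even number of axes."""
    flips = []
    for signs in product((1, -1), repeat=3):
        # The determinant of diag(sx, sy, sz) is the product of the signs:
        # +1 is a pure rotation, -1 is a reflection.
        if signs[0] * signs[1] * signs[2] == 1:
            flips.append(signs)
    return flips


def best_flip(score):
    """Return the flip with the highest score; `score` maps a flip to a number."""
    return max(proper_axis_flips(), key=score)


# Made-up cross-correlation scores, one per candidate flip (illustration only).
example_scores = {
    (1, 1, 1): 0.41,
    (1, -1, -1): 0.87,
    (-1, 1, -1): 0.39,
    (-1, -1, 1): 0.52,
}
print(proper_axis_flips())
print(best_flip(example_scores.get))
```

In a real fit, the score for each flip would come from the cross-correlation between the map and the density simulated from the structure in that orientation.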
Fitting options
The Fit to Segments dialog after the Options button is pressed is shown below:
- Density map resolution and grid spacing
During the fitting process, a density map is generated for the structure. This density map is used to compute a cross-correlation score for the fit. After the structure is aligned to one or more regions, the structure and its density map are moved further so as to increase the cross-correlation score (this is what the Volume Data / Fit in Map interface does). The density map of the structure should be generated at approximately the resolution reported for the density map into which the structure is being fit, and with roughly the same grid spacing. Before pressing the Fit button, enter this resolution (and optionally a grid spacing) in these fields. Note that by default, the resolution and grid spacing are set to 3*g and g respectively, where g is the grid spacing of the map selected in the Segment Map dialog.
- Which regions to align the structure to:
- Combined selected regions: This is the default. In this mode, the structure is aligned to the selected region. If more than one region is selected, the structure is aligned to all the selected regions combined.
- Each selected region: The structure is aligned to each selected region. This means that more than one alignment will be made, and multiple fits will result. To save each fit, make sure to check the button to the left of Save each fit when doing multiple alignments. Note that if no regions are selected, and this mode is chosen, then the structure is aligned to each region in the current segmentation. This is useful when, much like in this example using GroEL, all proteins in the complex have approximately the same structure.
- Groups of regions including selected region: If you're not sure which group of regions will create the best alignment and thus the best fit, but you have an idea of which region might be involved in this group, select that region, and select this mode. When you press the Fit button, groups of regions will be automatically generated, including the selected region. The structure will be aligned to each group, and the best resulting fit will be displayed at the end.
- Groups of regions including all regions: If you're not sure which group of regions will create the best alignment and thus the best fit, and you don't know of any particular region that is part of this group, select this mode. When you press the Fit button, groups of regions will be automatically generated from all segmented regions. The structure will be aligned to each group, and the best resulting fit will be displayed at the end. Note that if you start with many regions, many groups will be generated, and this process may take a while. After it is done, all the fits are recorded and saved. To see the best fits, enter the number of fits you would like to see to the right of Top number of fits to place, such as 1, 7, etc., and then press the Place button. The fits with the highest cross-correlation scores will be shown, and the structure will be saved in each fit position.
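The two group modes and the Top number of fits to place field can be pictured with a small Python sketch. How Segger actually chooses its groups is not described here, so the sketch makes a simplifying assumption: it enumerates every subset, up to a maximum size, that contains the selected region. The function names, region ids, and scores are all hypothetical.

```python
from itertools import combinations


def groups_including(selected, all_regions, max_size):
    """Enumerate groups of at most `max_size` regions that contain `selected`.

    Assumption for illustration: every subset is a candidate group.
    """
    others = [r for r in all_regions if r != selected]
    groups = []
    for extra in range(max_size):
        for combo in combinations(others, extra):
            groups.append((selected,) + combo)
    return groups


def top_fits(fits, n):
    """Return the n fits with the highest cross-correlation score, best first."""
    ranked = sorted(fits, key=lambda fit: fit["cc"], reverse=True)
    return ranked[:n]


# Hypothetical fit records: the region group each fit came from and its score.
fits = [
    {"regions": (1,), "cc": 0.62},
    {"regions": (1, 2), "cc": 0.81},
    {"regions": (1, 3), "cc": 0.74},
]
print(groups_including(1, [1, 2, 3], 2))
print(top_fits(fits, 2))
```

The number of candidate groups grows quickly with the number of regions, which is why fitting to groups built from all regions can take a while.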
- Alignment method
- Align by principal axes: This alignment method is described above and is quite fast, especially compared to the alternative of an exhaustive search. However, it may not always find the right fit. It is likely to fail for structures whose shapes make the principal axes ambiguous (e.g. sphere, rod, or cube-like shapes).
- Rotational search: Use this method when the principal axes method fails. It aligns the center of the structure with the center of the region or group of regions (as indicated by the choice under Which regions to align the structure to), and then searches only through different orientations. This is much like an exhaustive search, except that only orientations (3 degrees of freedom) are searched, rather than both position and orientation (6 degrees of freedom). It is therefore somewhat faster than an exhaustive search.
- Treat all sub-models as one structure
The PDB file format can support multiple models. For example, the entire ribosome can be contained in a PDB file only if stored as separate models. When such files are opened in Chimera, they appear to have the same model id, and each model within the file gets a sub-id. If this option is NOT checked, each model will be fitted separately, and will appear as different entries in the "Structure to fit" field. Otherwise, all models will be represented with a single entry, and will be fit as one structure.
- Optimize fits
When this option is checked, after generating each alignment to a region, the fit will be further optimized (i.e. the structure will be moved in the average gradient direction, until convergence is reached). When used, this process will typically produce better fits, but in lower resolution maps, it can result in large drifts. The entire fitting process will also be slower when this option is checked.
- Mask map with region to prevent large drifts
When this option is checked, the entire map is masked with the selected region(s) before fitting, so that all voxels outside the selected region(s) are set to 0. This is useful if 'Optimize fits' is enabled but large drifts are noticed. Checking this option will prevent such drifts, and the structure will remain in the same area as the region.
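As a rough illustration of the masking step, the Python sketch below zeroes every voxel that does not belong to the selected region. It is a toy model rather than Segger's implementation: the map is represented as a hypothetical dictionary from (i, j, k) voxel indices to density values, and the function name is made up.

```python
def mask_map(density, region_voxels):
    """Return a copy of `density` with every voxel outside the region set to 0."""
    return {
        voxel: (value if voxel in region_voxels else 0.0)
        for voxel, value in density.items()
    }


# Toy 3-voxel map; only the first two voxels belong to the selected region.
density = {(0, 0, 0): 1.4, (0, 0, 1): 0.9, (0, 0, 2): 1.1}
region = {(0, 0, 0), (0, 0, 1)}
print(mask_map(density, region))
```

With the surrounding density removed, there is nothing outside the region for the optimization to climb toward, which is why the structure stays in place.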
- Add all fits to list above
When this option is not checked, only the fit with the highest cross-correlation score will be added to the fit list. If it is checked, all the resulting fits are first clustered (to identify identical fits) and then added to the list.
- Compute atom inclusion score
When this option is checked, an atom-inclusion score will be computed for each fit added to the list. This score is the fraction of atoms in the structure that have positions at which the (interpolated) density value in the reference map is higher than the current iso-surface threshold set (in the Volume Viewer dialog) for the density map.
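The atom-inclusion score defined above reduces to a simple fraction. The Python sketch below assumes the interpolated map value at each atom position has already been looked up; the function name and the sample values are hypothetical.

```python
def atom_inclusion(densities_at_atoms, threshold):
    """Fraction of atoms whose interpolated density is above `threshold`."""
    if not densities_at_atoms:
        return 0.0
    inside = sum(1 for d in densities_at_atoms if d > threshold)
    return inside / len(densities_at_atoms)


# Made-up interpolated density values at four atom positions.
values = [1.2, 0.5, 0.95, 0.3]
print(atom_inclusion(values, 0.9))
```

Because the threshold is taken from the Volume Viewer dialog, the same fit can receive a different score if the iso-surface level is changed.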
- Clashes with copies from symmetry
When this option is checked, symmetric copies of the structure will be placed (if symmetry is detected in the map), and a clash score will be computed. The clash score is the fraction of atoms in the fitted structure that are within 3 Angstroms of any atom in a symmetric copy.
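The clash score can be sketched in Python as follows. This brute-force version is for illustration only and is not how Segger computes it; the function name and coordinates are hypothetical, and atom positions are plain (x, y, z) tuples in Angstroms.

```python
import math


def clash_score(fitted_atoms, copy_atoms, cutoff=3.0):
    """Fraction of fitted atoms within `cutoff` of any atom in the copies."""
    if not fitted_atoms:
        return 0.0
    clashing = 0
    for atom in fitted_atoms:
        # An atom clashes if at least one copy atom lies within the cutoff.
        if any(math.dist(atom, other) <= cutoff for other in copy_atoms):
            clashing += 1
    return clashing / len(fitted_atoms)


# Made-up coordinates: the first fitted atom is 2 Angstroms from a copy atom.
fitted = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
copies = [(2.0, 0.0, 0.0)]
print(clash_score(fitted, copies))
```

The loop compares every fitted atom with every copy atom, which is fine for a small example; a real structure with thousands of atoms would normally use a spatial index to avoid checking all pairs.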
Aligning the structure to multiple regions
Back to the GroEL example: make sure no regions are selected, and then check the Save each fit when doing multiple alignments button. Also select the Each selected region option under Which regions to align the structure to, and press the Fit button. In this mode, the structure is aligned to each selected region; if no regions are selected, it is aligned to every region in the segmentation. The results are shown in the image below:
Last updated July 14, 2011
