Phaser Molecular Replacement Tutorial

Tutorial: Molecular Replacement with Phaser

All files for this tutorial are distributed from the Phaser web page http://www-structmed.cimr.cam.ac.uk/phaser/tutorial/phaser-mr-tutorial.tar.gz. The data for TOXD are also distributed with the CCP4 suite and are used for various CCP4 tutorials.

Tutorial 1: TOXD

This tutorial demonstrates the ensembling procedure in Phaser.

alpha-Dendrotoxin (TOXD, 7139 Da) is a small neurotoxin from green mamba venom. You have two models for the structure. One is in the file 1BIK.pdb, which contains the protein chain from PDB entry 1BIK, and the other is in the file 1D0D_B.pdb, which contains chain B from PDB entry 1D0D. 1BIK is the structure of Bikunin, a serine protease inhibitor from the human inter-alpha-inhibitor complex, with sequence identity 37.7% to TOXD. 1D0D is the complex between tick anticoagulant protein (chain A) and bovine pancreatic trypsin inhibitor (BPTI, chain B). BPTI has a sequence identity of 36.4% to TOXD. Note that models making up an ensemble must be superimposed on each other, which has not yet been done with these two structures.

Use the SSM superpose option in coot to superimpose 1BIK on 1D0D_B, saving the resulting coordinates in 1BIK_on_1D0D.pdb.
Start the ccp4 GUI by typing ccp4i at the command line.
Make a new project called "phaser_tute" using the Directories&ProjectDir button on the RHS of the GUI. Set the "Project" to phaser_tute and "uses directory" to the directory where the files for this tutorial are located, and make this the "Project for this session of the CCP4Interface". You will then be able to go directly to this directory in the GUI using the pull-down menu that appears before every file selection.
Go to the Molecular Replacement module, in the yellow pull-down on the LHS of the GUI.
Bring up the GUI for Phaser.
All the yellow boxes need to be filled in.

It is a good idea to change the Ensemble id from the default.
It is also a good idea to fill in the TITLE.

When you have entered all the information, run Phaser.
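The GUI ultimately hands Phaser a keyword script. As a rough sketch of what the filled-in boxes correspond to, the Python snippet below assembles such a script as text. The keyword names (TITLE, MODE MR_AUTO, HKLIN, LABIN, ENSEMBLE, COMPOSITION, SEARCH, ROOT) are quoted from memory of the Phaser keyword documentation and should be checked against the version you run. The MTZ file name, the column labels F and SIGF, the ensemble id and the output root are placeholders, not the real names in the tutorial data.

```python
def phaser_script(title, mtz, f_label, sigf_label, ensemble_id, models, mw, root):
    """Assemble a Phaser keyword script as a string.

    models is a list of (pdb_file, percent_identity) pairs that together
    form one ensemble; the files must already be superimposed.
    """
    ensemble = f"ENSEMBLE {ensemble_id}"
    for pdb_file, identity in models:
        ensemble += f" PDBFILE {pdb_file} IDENTITY {identity}"
    lines = [
        f"TITLE {title}",
        "MODE MR_AUTO",
        f"HKLIN {mtz}",
        f"LABIN F={f_label} SIGF={sigf_label}",
        ensemble,
        f"COMPOSITION PROTEIN MW {mw} NUM 1",
        f"SEARCH ENSEMBLE {ensemble_id} NUM 1",
        f"ROOT {root}",
    ]
    return "\n".join(lines) + "\n"


script = phaser_script(
    title="TOXD ensemble of 1BIK and 1D0D chain B",
    mtz="toxd.mtz",          # placeholder file name
    f_label="F",             # placeholder column label
    sigf_label="SIGF",       # placeholder column label
    ensemble_id="toxd",      # placeholder ensemble id
    models=[("1BIK_on_1D0D.pdb", 37.7), ("1D0D_B.pdb", 36.4)],
    mw=7139,
    root="toxd_ensemble",    # placeholder output root
)
print(script)
```

Both superimposed models go on a single ENSEMBLE line, each with its own sequence identity to TOXD.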
Has Phaser solved the structure? What was the LLG of the best solution? What was the Z‑score of the best translation function solution?

The meaning of the Z‑score is given in the documentation http://www-structmed.cimr.cam.ac.uk/phaser/documentation/phaser-2.1_key.html
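As a reminder of what a Z-score measures in general, it is the number of standard deviations by which a value lies above the mean of its distribution. The sketch below applies that definition to a list of made-up scores. The numbers are illustrative only and are not Phaser output; for the precise way Phaser estimates the mean and standard deviation in its searches, see the documentation linked above.

```python
from statistics import mean, pstdev


def z_score(value, sample):
    """Number of standard deviations by which value exceeds the sample mean."""
    return (value - mean(sample)) / pstdev(sample)


# Made-up search scores with one peak standing above the rest.
scores = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
top = max(scores)
print(round(z_score(top, scores), 2))  # prints 2.0
```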

Look through the log file and identify the anisotropy correction, rotation function, translation function, packing, and refinement modes. Draw a flow diagram of the search strategy.
How many potential solutions did Phaser find or reject at each stage? What were the selection criteria for carrying potential solutions forward to the next step in the rotation and translation functions? How many other selection criteria could have been used, and what are they?

Use the documentation

Run Phaser again without ensembling, i.e. run two jobs, one using only 1BIK and the other using only 1D0D as the model. What are the LLGs of the final solutions? What are the Z‑scores of the translation functions? Was ensembling a good idea?

Tutorial 2: BETA/BLIP

This tutorial demonstrates a difficult molecular replacement problem.

beta-lactamase (BETA, 29 kDa) is an enzyme produced by various bacteria, and is of interest because it is responsible for penicillin resistance, cleaving penicillin at the beta-lactam ring. There are many small molecule inhibitors of BETA in clinical use, but bacteria can become resistant to these as well. Streptomyces clavuligerus produces beta-lactamase inhibitory protein (BLIP, 17.5 kDa), which has been investigated as an alternative to small molecule inhibitors, as it appears more difficult for bacteria to become resistant to this form of BETA inhibition. The structures of BETA and BLIP were originally solved separately by experimental phasing methods. The crystal structure of the complex between BETA and BLIP has been a test case for molecular replacement because of the difficulty encountered in the original structure solution. BETA, which accounts for 62% of the contents of the unit cell, is trivial to locate, but BLIP is more difficult to find. The BLIP component was originally found by testing a large number of potential orientations with a translation function search, until one solution stood out from the noise.
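The 62% figure follows from the molecular weights quoted above, assuming that scattering is roughly proportional to mass and that the crystal contains equal numbers of BETA and BLIP molecules. A quick check in Python:

```python
beta_kda = 29.0   # beta-lactamase
blip_kda = 17.5   # BLIP

# Fraction of the mass of the complex contributed by each component.
total = beta_kda + blip_kda
beta_fraction = beta_kda / total
blip_fraction = blip_kda / total

print(f"BETA {beta_fraction:.1%}, BLIP {blip_fraction:.1%}")
# prints: BETA 62.4%, BLIP 37.6%
```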

What is the best order in which to search for BETA and BLIP? Under what circumstances could the lower molecular weight search model be the easiest to find by molecular replacement?
What is the space-group recorded on the mtz file? If you had not solved this structure, would you know that this was the space-group? If not, what other space-group(s) must you consider?

Think about handedness (enantiomorphs)

Run Phaser for solving BETA/BLIP

Bring up the GUI for Phaser
All the yellow boxes need to be filled in.
Search for BETA and BLIP in a single job, using the search order you think will be best (see Question 1).
Select the space-group(s) you think must be considered (see Question 2).

Has Phaser solved the structure?

Look at the Z-scores for the rotation and translation functions

Which space group was the solution in?
Look through the job.sum file and identify the anisotropy correction, rotation function, translation function, packing, and refinement modes, for the two search molecules, and all the space groups. Draw a flow diagram of the search strategy.
Why doesn't Phaser perform the rotation function in the two enantiomorphic space groups?
Which reflections in the data are particularly important for deciding the translational symmetry of the space-groups to search? Under what data collection conditions might you not have recorded these important reflections? Are there any other space-groups that you might want to consider when solving BETA/BLIP?
How big is the anisotropic correction for the data? How does this compare to TOXD?
Run Phaser again with the anisotropy correction turned off. What effect does this have on the structure solution?

Tutorial 3: Finding search models for TOXD

Tutorial Co-authored by Nick Keep

Models for molecular replacement are found by database searching. When solving a new structure, you would need to search with your sequence(s) against the Protein Data Bank to find a model structure. This can be done in a number of places.

The TOXD sequence is:

QPRRKLCILHRNPGRCYDKIPAFYYNQKKKQCERFDWSGCGGNSNRFKTIEECRRTCIG
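Before pasting, it can help to sanity-check the sequence. The sketch below counts the residues and estimates the molecular weight from standard average residue masses, rounded to two decimals. It ignores disulfide bonds and any modifications, so the result is only approximate and need not match the 7139 Da quoted in Tutorial 1 exactly.

```python
# Average residue masses in Da (amino acid minus one water), rounded.
RESIDUE_MASS = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.10, "C": 103.14, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.12, "M": 131.19,
    "H": 137.14, "F": 147.18, "R": 156.19, "Y": 163.18, "W": 186.21,
}
WATER = 18.02  # one water is added back for the free termini

TOXD = "QPRRKLCILHRNPGRCYDKIPAFYYNQKKKQCERFDWSGCGGNSNRFKTIEECRRTCIG"


def molecular_weight(sequence):
    """Approximate molecular weight in Da of an unmodified peptide chain."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


print(len(TOXD), "residues")
print(round(molecular_weight(TOXD)), "Da (approximate)")
```

An unknown one-letter code raises a KeyError, which is a cheap way to catch a stray character introduced by copy and paste.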

Go to the search page on the RCSB website http://www.rcsb.org/pdb/advSearch.do
Find the option to search by sequence in the pull-down menu, paste the sequence into the query window, and run the search.
The top hit will be the structure of TOXD itself. In real life you would be disappointed to find your target structure at the top of the list! Download the following lower homology models onto your computer.
- 1DEN
- 1BF0
- 1TFX
There are a number of different ways to do this:
- Click on the text icon and then when the text window opens, use the right mouse to save page to disk, or
- Click on the icon with the pale blue download arrow, which will give you a gzipped pdb file. You will need to unzip this file before using it.
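If you prefer to script the decompression, Python's standard gzip module can do it. The function below is a minimal sketch; the file names in the usage comment are hypothetical and depend on how the download was saved.

```python
import gzip
import shutil


def gunzip(src, dst):
    """Decompress the gzipped file src into dst, leaving src in place."""
    with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)


# Example with hypothetical file names:
# gunzip("1DEN.pdb.gz", "1DEN.pdb")
```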
Take note of the homology and the chain(s) to which the homology applies
View the structures in a graphics program (e.g. rasmol, xtalview, coot)
For one of the PDB files you will need to select the chains you want for your model. You can just edit the PDB file in a text editor or you can use the Edit PDB File utility in ccp4i under Coordinate Utilities. Select pdbset as the program to perform a selection for output as a pdb file. You need to check select by chain name and enter the chain of choice.
It is a good idea to remove waters from a PDB file before using it as a model, as these are unlikely to be structurally conserved between model and target. Phaser removes these automatically, but other molecular replacement programs do not. Run pdbset on all three files as above but this time to Exclude water molecules.
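The same two edits, chain selection and water removal, can also be made with a few lines of Python instead of pdbset. This is a minimal sketch that relies only on the fixed-column PDB format (residue name in columns 18-20, chain identifier in column 22). It is not how pdbset or Phaser do it, and the chain letter and file names in the usage comment are placeholders.

```python
COORD_RECORDS = ("ATOM", "HETATM", "ANISOU")


def select_chain(lines, chain_id):
    """Keep coordinate records of one chain.

    Other record types (headers, TER, END and so on) pass through unchanged.
    """
    kept = []
    for line in lines:
        if line.startswith(COORD_RECORDS) and line[21:22] != chain_id:
            continue
        kept.append(line)
    return kept


def strip_waters(lines):
    """Drop coordinate records whose residue name is HOH (water)."""
    return [
        line for line in lines
        if not (line.startswith(COORD_RECORDS) and line[17:20] == "HOH")
    ]


def edit_pdb(src, dst, chain_id):
    """Write to dst the chosen chain of src with water molecules removed."""
    with open(src) as fin:
        lines = fin.readlines()
    with open(dst, "w") as fout:
        fout.writelines(strip_waters(select_chain(lines, chain_id)))


# Example with placeholder names:
# edit_pdb("model.pdb", "model_A_nowat.pdb", "A")
```

Keeping the two filters separate mirrors the two pdbset runs described above, so either can be applied on its own.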
View the edited structures in a graphics program (e.g. rasmol, xtalview, coot) to double check that you have the model you intended to get
Retry the molecular replacement with the above search models

How does Phaser handle the multiple models given for an NMR structure?

Is the success of molecular replacement related to sequence identity?

Consider the structure solution method. Why do NMR models make poor molecular replacement models?