Your project should be on temperature adaptation or thermostability of proteins.
For a temperature adaptation project select an ectothermic organism with sub-species living in different climates. You will then focus on one protein, preferably an enzyme, for which you need to find amino acid sequences (one for each sub-species). You also need at least one crystal structure of this protein with enough sequence similarity to your selected amino acid sequences to make it feasible to create homology models. Finally, you will make comparisons between the structures or homology models while focusing on structural features that promote protein stability.
OR
For a thermostability project, select a thermophilic organism (for example, a bacterium thriving in a hot spring) and one protein, preferably an enzyme, for which a crystal structure is already available. Next, search for crystal structures and/or amino acid sequences of your selected protein in non-thermophilic organisms so that you can make comparisons between the crystal structures (or homology models). Aim to identify structural features that promote the thermostability of the protein from the thermophilic organism.
State the parameters of your project:
Species:
Protein(s):
PDB ID:
Confirm that you will have access to protein sequences and at least one protein structure (as a template for homology modeling) so that you can carry out your project. Post the access codes for the sequences and the PDB-ID(s) here:
Make a list of structural features that promote protein stability:
Note: Choose any protein except isocitrate dehydrogenase.
For an example, see the article excerpt below:
The ultimate goal of protein modeling is to predict a structure from its sequence with an accuracy that is comparable to the best results achieved experimentally. This would allow users to safely use rapidly generated in silico protein models in all the contexts where today only experimental structures provide a solid basis: structure-based drug design, analysis of protein function, interactions, antigenic behavior, and rational design of proteins with increased stability or novel functions. In addition, protein modeling is the only way to obtain structural information if experimental techniques fail. Many proteins are simply too large for NMR analysis and cannot be crystallized for X-ray diffraction.
Among the three major approaches to three-dimensional (3D) structure prediction described in this and the following two chapters, homology modeling is the easiest one. It is based on two major observations:
2 H O M O L O G Y M O D E L I N G
[Figure 25.1: plot of percentage of identical residues (y-axis, 0 to 100) against number of aligned residues (x-axis, 0 to 250). The upper region is labeled "Safe homology modeling zone" and the lower region "Twilight zone".]
Figure 25.1. The two zones of sequence alignments. Two sequences are practically guaranteed to fold into the same structure if their length and percentage sequence identity fall into the region marked as "safe." An example of two sequences with 150 amino acids, 50% of which are identical, is shown (gray cross).
Imagine that we want to know the structure of sequence A (150 amino acids long, Figure 25.2, steps 1 and 2). We compare sequence A to all the sequences of known structures stored in the PDB (using, for example, BLAST), and luckily find a sequence B (300 amino acids long) containing a region of 150 amino acids that match sequence A with 50% identical residues. As this match (alignment) clearly falls in the safe zone (Fig. 25.1), we can simply take the known structure of sequence B (the template), cut out the fragment corresponding to the aligned region, mutate those amino acids that differ between sequences A and B, and finally arrive at our model for structure A. Structure A is called the target and is of course not known at the time of modeling. In practice, homology modeling is a multistep process that can be summarized in seven steps:
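The two quantities used in the paragraph above, the percentage of identical residues over the aligned region and the list of template residues that must be mutated to obtain the target, can be sketched in a few lines of Python. This is only an illustrative sketch, not the method used by any particular modeling program: the function names and the short toy sequences are hypothetical, and the alignment is assumed to be given already as two equal-length strings with "-" marking gaps.

```python
from typing import List, Tuple


def percent_identity(target_aln: str, template_aln: str) -> float:
    """Percentage of identical residues over columns where both sequences have a residue."""
    if len(target_aln) != len(template_aln):
        raise ValueError("aligned sequences must have the same length")
    aligned = 0
    identical = 0
    for t, s in zip(target_aln, template_aln):
        if t == "-" or s == "-":
            continue  # gap columns are not counted as aligned residues
        aligned += 1
        if t == s:
            identical += 1
    return 100.0 * identical / aligned if aligned else 0.0


def residues_to_mutate(target_aln: str, template_aln: str) -> List[Tuple[str, int, str]]:
    """Return (template residue, 1-based template position, target residue) for each mismatch."""
    mutations = []
    template_pos = 0
    for t, s in zip(target_aln, template_aln):
        if s != "-":
            template_pos += 1  # count residues of the template only
        if t == "-" or s == "-":
            continue  # insertions and deletions are a loop-modeling problem, not a mutation
        if t != s:
            mutations.append((s, template_pos, t))
    return mutations


if __name__ == "__main__":
    # Hypothetical toy alignment (not real protein sequences).
    target = "MKTAYIAKQR"
    template = "MKSAYLAKQR"
    print(f"{percent_identity(target, template):.1f}% identical")
    for old, pos, new in residues_to_mutate(target, template):
        print(f"mutate {old}{pos} -> {new}")
```

For a real project the alignment would come from a tool such as BLAST, and the resulting identity and aligned length would then be checked against Figure 25.1 before choosing the template.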
At almost all the steps choices have to be made. The modeler can never be sure of making the best ones, and thus a large part of the modeling process consists of serious thought about how to gamble between multiple seemingly similar choices. A lot of research has gone into teaching the computer how to make these decisions, so that homology models can be built fully automatically. Currently, this allows modelers to construct models for about 25% of the amino acids in a genome, thereby supplementing the efforts of structural genomics projects (Sanchez and Sali, 1999; Peitsch, Schwede, and Guex, 2000). This average value of 25% differs significantly
[Figure 25.2 panels: Steps 1 and 2, template identification and alignment (template sequence B, arabinose-binding protein, 300 residues, with its aligned region; target sequence A, 150 residues); Step 3, backbone generation; Steps 4 and 5, loop and side chain modeling; Step 6, model optimization; Step 7, model validation.]
Figure 25.2. The steps to homology modeling. The fragment of the template (arabinose-binding protein) corresponding to the region aligned with the target sequence forms the basis of the model (including conserved side chains). Loops and missing side chains are predicted, then the model is optimized (in this case together with surrounding water molecules). Images created with Yasara (www.yasara.com).