Question

I am asked to use bioinformatics to predict the structure of a new protein. What type of tertiary structure prediction method should I try first, and why? Under which circumstances should I use an ab initio approach?

Expert Solution

1. Conformation initialization

The starting point (input) of protein structure prediction is the one-dimensional amino acid sequence of the target protein, and the end point (output) is a model of its three-dimensional structure. The number of theoretically possible conformations for a protein sequence is astronomically large, but for most proteins the native conformation is unique. Folding a protein from its amino acid sequence alone is very difficult for two reasons. First, we still cannot construct a force field accurate enough to guide the target sequence toward the correct fold; second, the computation required to search such a vast conformational space can easily exceed available computing capacity.
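To give a feel for the size of that search space, here is a rough back-of-the-envelope sketch. It assumes, purely for illustration, that every residue independently adopts one of three discrete backbone states and that a computer can evaluate 10^12 conformations per second; neither number comes from the text above, and real proteins are not this simple.

```python
def conformation_count(n_residues: int, states_per_residue: int = 3) -> int:
    """Count conformations if each residue independently takes one of a
    few discrete states (an illustrative assumption, not a measured value)."""
    return states_per_residue ** n_residues


def exhaustive_search_years(n_residues: int,
                            states_per_residue: int = 3,
                            samples_per_second: float = 1e12) -> float:
    """Years needed to enumerate every conformation at a hypothetical
    sampling rate (samples_per_second is an assumed figure)."""
    total = conformation_count(n_residues, states_per_residue)
    seconds = total / samples_per_second
    return seconds / (365.25 * 24 * 3600)


if __name__ == "__main__":
    for n in (10, 50, 100):
        print(n, f"{conformation_count(n):.3e} conformations,",
              f"{exhaustive_search_years(n):.3e} years")
```

Even under these generous assumptions the count grows exponentially with chain length, which is why blind enumeration is not a practical strategy.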

Starting from a structural template avoids much of this search. However, there is no guarantee that a satisfactory structural template can be found for every target protein. Template-free methods are the best choice for hard targets for which no satisfactory template can be identified. The most straightforward way to generate the initial conformation is at random, but this makes the conformational search very expensive. Combined with the inadequacy of current force fields, it is extremely difficult to complete a simulation that involves such large conformational changes. In fact, the complex, multilevel nature of protein structure gives us more options than a purely random start.

2. Conformational search

After the initial conformation is constructed, a simulation guided by a force field searches step by step for near-native conformations. As a typical biological macromolecule, a protein consists of thousands of atoms and has a huge number of conformational degrees of freedom. A simplified representation of protein conformations is therefore crucial for speeding up the simulation of folding. In fact, a structural template identified by sequence alignment is already a reduced conformation containing only backbone or Cα atoms: the alignment is made at the residue level, and wherever different residues are matched, the template's side-chain conformations cannot be used for the target protein. Almost all current structure assembly methods search conformations using some kind of simplified representation. For example, each residue can be represented by only its Cα atom and the virtual center of its side chain, or the entire backbone conformation can be represented by a series of dihedral angles.
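As an illustration of the dihedral-angle representation, the sketch below computes the signed dihedral angle defined by four points using only the standard library. The function name and the plain-tuple coordinate format are choices made for this example, not part of any particular prediction package. For a backbone, phi of residue i is the dihedral over C(i-1), N(i), Cα(i), C(i), and psi is the dihedral over N(i), Cα(i), C(i), N(i+1).

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3) -> float:
    """Signed dihedral angle in degrees around the p1-p2 axis."""
    b0 = _sub(p0, p1)          # bond pointing from p1 back to p0
    b1 = _sub(p2, p1)          # central bond (rotation axis)
    b2 = _sub(p3, p2)          # bond pointing from p2 on to p3
    norm = math.sqrt(_dot(b1, b1))
    if norm == 0.0:
        raise ValueError("p1 and p2 must be distinct points")
    axis = tuple(c / norm for c in b1)
    # Project the two outer bonds onto the plane perpendicular to the axis.
    v = _sub(b0, tuple(_dot(b0, axis) * c for c in axis))
    w = _sub(b2, tuple(_dot(b2, axis) * c for c in axis))
    x = _dot(v, w)
    y = _dot(_cross(axis, v), w)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # Four points with the outer bonds at a right angle to each other.
    print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

Storing one or two such angles per residue instead of full Cartesian coordinates is what makes the search space smaller, at the cost of having to rebuild the atoms afterwards.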

3. Structure selection

The conformational search generates a large number of candidate structures for the target protein. An unsolved problem in both molecular dynamics and Monte Carlo simulation is that conformations are often trapped in local energy minima. Even when the global minimum is identified, it does not necessarily correspond to the conformation closest to the native state, because the force field is imperfect. The common procedure is therefore to output low-energy intermediate structures at regular intervals during the simulation for subsequent screening. The key to structure selection is an assessment method that distinguishes native-like structures from non-native ones. CASP has a dedicated prediction category for evaluating methods of structural quality assessment. Note that selection methods may be designed specifically to assess the reduced structural models that correspond to the simplified representation used during the conformational search. Developing quality assessment methods based on a variety of ideas and techniques is an important research direction in protein structure prediction.
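The answer does not prescribe a particular selection method. One simple idea, which the refinement section refers to as selection by clustering, is to pick the candidate that has the most structural neighbours. The sketch below assumes a precomputed symmetric matrix of pairwise distances between candidates (for example Cα RMSD values); the function name, the cutoff and the toy numbers are hypothetical and only illustrate the idea.

```python
def pick_cluster_centroid(dist, cutoff):
    """Return (centroid_index, member_indices) for the largest cluster.

    dist   -- square, symmetric list of lists of pairwise distances
    cutoff -- two models count as neighbours if their distance <= cutoff
    The centroid is the model with the most neighbours; ties go to the
    lowest index.
    """
    n = len(dist)
    if n == 0:
        raise ValueError("need at least one model")
    best_index, best_count = 0, -1
    for i in range(n):
        count = sum(1 for j in range(n) if j != i and dist[i][j] <= cutoff)
        if count > best_count:
            best_index, best_count = i, count
    members = [j for j in range(n)
               if j == best_index or dist[best_index][j] <= cutoff]
    return best_index, members


if __name__ == "__main__":
    # Toy distances: models 0-2 resemble each other, model 3 is an outlier.
    toy = [
        [0.0, 1.0, 1.5, 8.0],
        [1.0, 0.0, 1.0, 8.0],
        [1.5, 1.0, 0.0, 8.0],
        [8.0, 8.0, 8.0, 0.0],
    ]
    print(pick_cluster_centroid(toy, cutoff=1.2))
```

The design choice here is to trust how often a conformation recurs rather than its energy alone, which sidesteps some of the force-field inaccuracy described above.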

4. All-atom structure reconstruction

Because most prediction methods use a simplified protein representation for the conformational search, what we have at this point is only one or a few reduced structural models, from which the all-atom structure must be reconstructed. The reconstruction process varies considerably with the representation used. Some prediction methods represent each residue as a Cα atom plus a virtual side-chain center, where the virtual center only helps determine the position of the Cα atom during the search and the output structure contains only Cα atoms. In that case, reconstruction is usually divided into two steps. The first step rebuilds the remaining backbone atoms (N, C and O) from the Cα positions, which is the primary function of many methods developed specifically for all-atom reconstruction, such as SABBAC, BBQ, PULCHRA and REMO. All of these methods depend on backbone fragments cut from experimental structures. For example, the backbone isomer library built by REMO contains 528,798 fragments of four consecutive residues collected from 2,561 protein chains in the PDB. The second step rebuilds the side chain of every residue.

5. Structure refinement

Although the previous steps yield a complete structure of the target protein, its quality is usually not very good, which may be due to defects in the force field, the conformational search or the all-atom reconstruction. Structure selection by clustering can also introduce local structural problems if the cluster centroid structures are used.⁵⁴ It is therefore almost routine to refine the structure further after all-atom reconstruction. Because structural problems in the reduced model directly affect the quality of the final all-atom structure, some methods combine all-atom reconstruction and refinement.⁵⁵ They refine the reduced model (such as the backbone structure) and the all-atom structure separately, following the reconstruction schedule.

Structure refinement also requires a force field to drive a molecular dynamics or Monte Carlo simulation, but it differs considerably from the earlier conformational search. The conformational search in structure assembly simulations aims to determine the backbone structure of the target protein, sacrificing structural detail for search efficiency. The main purpose of refinement, by contrast, is to improve the quality of the all-atom structure (especially the local structure), making only small changes to the backbone conformation.

Two Categories of Protein Structure Prediction Methods:

1. Template-based methods

For most target proteins, a suitable structural template can be identified in the PDB by sequence alignment or threading. Because conformational information from a template is much more reliable than that from other sources (especially when the target and the template are highly homologous), template-based methods are generally more accurate than other methods, which makes them highly popular in practical applications. For a new protein, this is the approach to try first.

2. Template-free methods

Most current structure prediction methods rely on information from experimental structures (most directly, through structural templates), which does little to help us explore and understand the underlying principles of protein folding. The development of template-free methods is driven not only by practical need (a satisfactory template cannot be found in the PDB for every target protein) but also by the basic scientific problem of the protein folding code. Although template-free methods also commonly exploit information from known structures, their development reflects the theoretical and technical level of protein structure prediction better than that of template-based methods does.
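The choice between the two categories can be sketched as a simple decision on how similar the best available template is to the target. The code below is only an illustration: it assumes the target and template have already been aligned (gaps written as "-"), counts identity over the columns where neither sequence has a gap, and uses a 30% cutoff that is a common rule of thumb rather than a value given in this answer. The function names are made up for the example.

```python
GAP = "-"


def percent_identity(aligned_target: str, aligned_template: str) -> float:
    """Percent identity over alignment columns that contain no gap."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_target.upper(), aligned_template.upper()):
        if a == GAP or b == GAP:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def suggest_method(best_identity_pct: float,
                   threshold_pct: float = 30.0) -> str:
    """Suggest a prediction strategy (threshold_pct is an assumed cutoff)."""
    if best_identity_pct >= threshold_pct:
        return "template-based: build the model on the aligned template"
    return ("try threading; if no fold is recognised, "
            "fall back to template-free (ab initio)")


if __name__ == "__main__":
    identity = percent_identity("MKT-AYIAK", "MKTLAYLAR")
    print(round(identity, 1), suggest_method(identity))
```

In practice the identity would come from a database search against PDB sequences, and alignment coverage and threading scores would also feed into the decision.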

venereology answered 3 years ago
