1. You are trying to determine the structure of the primase complex that is ~300kD. What technique might you use? Why?
2. You want a molecular picture of Cas9 in complex with its targeting RNA. This protein is 165kD. What technique would you use? Why?
3. A new type of bacterium has been discovered living in a volcanic crater. Genome studies suggest that this bacterium feeds on iron and sulfur. A gene has been identified that has some similarity to an iron-processing enzyme in endolithic bacteria. The gene in question has approximately 20% protein sequence identity with this known protein. How would you determine the structure of this new protein?
Answer 1)
DnaG is a bacterial DNA primase encoded by the dnaG gene. DnaG, like any other DNA primase, synthesizes short RNA oligonucleotides during DNA replication. These oligonucleotides are known as primers because they act as a starting point for DNA synthesis. DnaG catalyzes the synthesis of oligonucleotides that are 10 to 60 nucleotides long. According to proteolysis studies, the DnaG primase is a 581-residue monomeric protein with three functional domains. The N-terminal zinc-binding domain (residues 1-110), in which a zinc ion is tetrahedrally coordinated by one histidine and three cysteine residues, plays a role in recognizing sequence-specific DNA binding sites. The central domain (residues 111-433) has RNA polymerase activity and is the site of RNA primer synthesis. The C-terminal domain (residues 434-581) is responsible for the noncovalent binding of DnaG to the DnaB helicase protein.
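The domain boundaries quoted above can be written down as a small lookup table. This is only an illustrative sketch: the residue ranges are the ones given in this answer, while the names `DNAG_DOMAINS` and `dnag_domain` are hypothetical and not part of any library.

```python
# Domain boundaries (1-based, inclusive) exactly as quoted in the answer above.
# DNAG_DOMAINS and dnag_domain are illustrative names, not a library API.
DNAG_DOMAINS = [
    (1, 110, "N-terminal zinc-binding domain"),
    (111, 433, "central RNA polymerase domain"),
    (434, 581, "C-terminal DnaB-binding domain"),
]


def dnag_domain(residue: int) -> str:
    """Return the name of the DnaG domain that contains the given residue."""
    for start, end, name in DNAG_DOMAINS:
        if start <= residue <= end:
            return name
    raise ValueError(f"residue {residue} is outside the 581-residue protein")


if __name__ == "__main__":
    # Print the domain assignment for one residue from each region.
    for residue in (40, 250, 500):
        print(residue, dnag_domain(residue))
```

A table like this is handy when designing truncated constructs, for example when expressing a single domain separately for crystallization trials.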
Protein structure determination is crucial for a broad range of applications, from fundamental biological research to next-generation drug development. Proteins required for analysis are first produced in large quantity using an expression plasmid, either in E. coli or in reconstituted living systems. Genetic engineering and protein engineering techniques can be used to achieve expression and easy purification. The target protein is purified from the mixture by exploiting the different physicochemical properties of proteins in column chromatography (ion-exchange, affinity or hydrophobic column chromatography), and its identity is confirmed by SDS-PAGE. It is then used for 3D structure analysis.
The protein in this question is the primase complex, with a molecular weight of about 300 kD. For structure determination we can use crystallization followed by X-ray diffraction. In this method the protein is first crystallized. A large number of conditions must be met in order to yield high-quality crystals, so automated crystallization robot systems are an important development that makes the experiments more efficient. A high-quality crystal, with the protein packed in a regular array, is then irradiated with X-rays. The electrons surrounding the atoms that make up the protein scatter the X-rays, and this scattering produces a diffraction image. High-quality crystals and intense, low-emittance X-ray beams are required to obtain a high-precision analysis. Since primase is a multi-domain protein, X-ray diffraction lets us determine its structure experimentally, including when it is bound to DNA and other proteins. In addition, small-angle X-ray scattering (SAXS) of biological materials has been evolving rapidly and promises to move structural analysis to a new level.
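The scattering step described above is usually summarised by Bragg's law, n·λ = 2·d·sin(θ), which relates the X-ray wavelength λ and the scattering angle θ to the spacing d between lattice planes in the crystal. The answer does not state this relation explicitly, so the following is an added sketch; the function name `bragg_d_spacing` and the example values are assumptions chosen for illustration.

```python
import math


def bragg_d_spacing(wavelength_angstrom: float, theta_deg: float, order: int = 1) -> float:
    """Lattice-plane spacing d (angstroms) from Bragg's law: n * lambda = 2 * d * sin(theta).

    theta_deg is the Bragg angle, i.e. half of the total scattering angle 2*theta.
    """
    if not 0.0 < theta_deg < 90.0:
        raise ValueError("theta must lie strictly between 0 and 90 degrees")
    if wavelength_angstrom <= 0.0 or order < 1:
        raise ValueError("wavelength must be positive and order must be >= 1")
    return order * wavelength_angstrom / (2.0 * math.sin(math.radians(theta_deg)))


if __name__ == "__main__":
    # Illustrative values only: a 1.0 angstrom beam observed at several Bragg angles.
    for theta in (5.0, 15.0, 30.0):
        print(f"theta = {theta:5.1f} deg -> d = {bragg_d_spacing(1.0, theta):.3f} angstrom")
```

Spots recorded at larger angles correspond to smaller d-spacings, which is why well-ordered crystals that diffract to high angles give higher-resolution structures.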
Answer 2)
The protein in the second question, Cas9, is an endonuclease. Cas9 (CRISPR associated protein 9) is an RNA-guided DNA endonuclease enzyme associated with the CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) adaptive immunity system in Streptococcus pyogenes, among other bacteria. S. pyogenes utilizes Cas9 to memorize and later interrogate and cleave foreign DNA, such as invading bacteriophage DNA or plasmid DNA.
Here we can use combined SAXS/NMR data for structural analysis. NMR has significant advantages: it can study proteins in solution near native physiological conditions, it can obtain dynamic information for different regions and domains of a protein, and it gives access to functionally important information on protein flexibility, intrinsically unstructured regions, partial protein folding, and in some cases multiple accessible protein conformations. However, typical NMR structures are not as accurate as the best protein structures from X-ray crystallography, and solution NMR structure determination currently has an upper size limit of roughly 50–70 kDa. Approaching this limit already requires the use of all modern NMR techniques, such as ultra-high-field 800-1000 MHz magnets, isotopic labeling schemes, advanced pulse sequences and NMR electronics, and the highest-sensitivity CryoProbes™. Even with all these capabilities, solution NMR often cannot determine the exact global structure of larger molecular assemblies or multi-domain proteins. Solution-structure determination by NMR uses interatomic distances determined from the Nuclear Overhauser Effect (NOE) and orientational restraints determined from Residual Dipolar Couplings (RDCs). In theory, a complete set of NMR measurements can be used to uniquely determine the structures of labeled proteins. In practice, however, many NMR measurements provide sparse RDC datasets that cannot be used to uniquely determine a structure, especially for larger proteins.
SAXS is an ideal complementary technique that can efficiently compensate for incomplete NMR datasets of biological macromolecules. A number of researchers have shown that SAXS data can improve the quality and accuracy of NMR structures and can potentially extend the capability of NMR to larger macromolecules. (references: Structural characterization of proteins and complexes using small-angle X-ray solution scattering, Haydyn D. T. Mertens and Dmitri I. Svergun, European Molecular Biology Laboratory, Hamburg Outstation, c/o DESY, Notkestrasse 85, 22603 Hamburg, Germany; Structural analysis of flexible proteins in solution by small angle X-ray scattering combined with crystallography, Susan E. Tsutakawa, Greg L. Hura, Ken A. Frankel, Priscilla K. Cooper, and John A. Tainer)
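One of the simplest quantities extracted from SAXS data is the radius of gyration Rg, obtained from the Guinier approximation ln I(q) = ln I(0) - (Rg^2 / 3) q^2, which holds at small scattering vectors (roughly q·Rg < 1.3 for globular particles). The answer does not describe this analysis, so the code below is an added sketch under that assumption; `guinier_fit` and `synthetic_guinier` are hypothetical helper names, and the data are synthetic rather than measured.

```python
import math


def guinier_fit(q, intensity):
    """Fit ln I(q) = ln I(0) - (Rg^2 / 3) * q^2 by ordinary least squares.

    q and intensity are equal-length sequences restricted to the low-q
    (Guinier) region. Returns (Rg, I0).
    """
    if len(q) != len(intensity) or len(q) < 2:
        raise ValueError("need at least two matching (q, I) points")
    x = [qi * qi for qi in q]               # abscissa: q^2
    y = [math.log(i) for i in intensity]    # ordinate: ln I
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    if slope >= 0.0:
        raise ValueError("Guinier slope must be negative for a physical Rg")
    return math.sqrt(-3.0 * slope), math.exp(intercept)


def synthetic_guinier(rg, i0, q_values):
    """Noise-free intensities that follow the Guinier law exactly (test data only)."""
    return [i0 * math.exp(-((qv * rg) ** 2) / 3.0) for qv in q_values]


if __name__ == "__main__":
    # Synthetic example: Rg = 40 angstrom, q in inverse angstroms, max q*Rg below 1.3.
    q_values = [0.005 + 0.001 * k for k in range(20)]
    data = synthetic_guinier(40.0, 100.0, q_values)
    rg, i0 = guinier_fit(q_values, data)
    print(f"Rg = {rg:.2f} angstrom, I(0) = {i0:.2f}")
```

With real data the fit range has to be chosen carefully, and an Rg that drifts as the range changes is a common sign of aggregation or interparticle interference.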
Answer 3)
For the protein in question 3 we can use bioinformatics techniques and online structure prediction databases to predict the 3D structure of the protein. Protein structure prediction using bioinformatics can involve sequence similarity searches, multiple sequence alignments, identification and characterization of domains, secondary structure prediction, solvent accessibility prediction, automatic protein fold recognition, constructing three-dimensional models to atomic detail, and model validation. Not all protein structure prediction projects use all of these techniques. A central part of a typical protein structure prediction is the identification of a suitable structural target from which to extrapolate three-dimensional information for a query sequence. Examples of online resources include ExPASy (Advanced Protein Secondary Structure Prediction), the SWISS-MODEL Repository, the STRING database, IntAct, DIP, and MINT. (reference: Bioinformatics methods to predict protein structure and function: A practical approach, Mol Biotechnol. 2003 Feb;23(2):139-66, Edwards YJ, Cottage A)
The observation that similar protein sequences fold into similar three-dimensional structures provides a basis for the methods which predict structural features of a novel protein based on the similarity between its sequence and sequences of known protein structures. Similarity over entire sequence or large sequence fragment(s) enables prediction and modeling of entire structural domains while statistics derived from distributions of local features of known protein structures make it possible to predict such features in proteins with unknown structures. The accuracy of models of protein structures is sufficient for many practical purposes such as analysis of point mutation effects, enzymatic reactions, interaction interfaces of protein complexes, and active sites. Protein models are also used for phasing of crystallographic data and, in some cases, for drug design. By using models one can avoid the costly and time-consuming process of experimental structure determination.
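The sequence identity figure quoted in question 3 (about 20%) is computed from a pairwise alignment. The sketch below shows one common way to do that, assuming the two sequences have already been aligned and that gapped columns are excluded from the denominator; other conventions exist, and `percent_identity` is a hypothetical helper, not a function from any bioinformatics package. The toy sequences are invented for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: columns where either sequence has a gap are skipped, so the
    denominator is the number of aligned residue pairs.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Toy alignment, not real protein sequences.
    query = "MKT-AYIAKQR"
    known = "MRTLAYLSKQ-"
    print(f"{percent_identity(query, known):.1f}% identity")
```

In practice the alignment itself would come from a dedicated tool, and the identity value would then guide the choice between template-based modeling and experimental structure determination.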