Q2.
(a) What are the differences between an energy based ligand
design method such as GRID and a
knowledge based one like LUDI? What are the requirements of a
knowledge based ligand design
method?
(b) Interpret the following QSAR equation: $\log(1/C) = k_1\pi - k_2\pi^2 + k_3\sigma$. How are factorial design
methods used in QSAR compound selection? Briefly outline how
multiple linear regression analysis is
used in the derivation of a QSAR equation. How is cross-validation
used for checking the quality of a
regression based QSAR model?
Q3.
(a) Use the example of molecular docking of the antibiotic
netropsin to DNA to distinguish
quantitatively the differences between steepest descent and
conjugate gradient methods for initial
refinement and stringent minimization. What qualitative conclusions
can be drawn about the efficacy
of these two minimization techniques with respect to this docking
experiment?
(b) Using two
examples, explain the thermodynamic differences between the
Molecular Dynamics and Monte Carlo
methods. What are the advantages to choosing periodic boundary
conditions in ANY molecular
simulation of a macromolecule? Use a diagram to plot 4 such
periodic cell shapes. What important
class of applied molecular simulations have benefitted from the
usage of periodic boundary
Answer 2 a)
Knowledge based:
Knowledge-based scoring functions use statistical analysis of the crystal structures of ligand-protein complexes to obtain the interatomic contact frequencies and/or distances between the ligand and the protein. They rest on the assumption that the more favorable an interaction is, the more frequently it occurs. These frequency distributions are then converted into pairwise atom-type potentials. The score is calculated by favoring preferred contacts and penalizing repulsive interactions between each atom in the ligand and the protein within a given cutoff.
The appeal of knowledge-based functions is their computational simplicity, which can be exploited to screen large compound databases. They can also model some uncommon interactions, such as sulphur-aromatic or cation-π, which are often poorly handled by empirical approaches.
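The conversion of contact frequencies into pairwise potentials can be sketched as follows. This is a minimal illustration only, assuming an inverse-Boltzmann form (potential = -kT ln(observed/reference)), which is one common choice and is not stated in the answer above. The function names, the kT value and the toy contact lists are all hypothetical, and the distance dependence of real potentials is left out for brevity.

```python
import math
from collections import Counter

KT_KCAL_PER_MOL = 0.59  # approximate kT near room temperature (assumed value)

def pair_potentials(observed, reference, kT=KT_KCAL_PER_MOL):
    """Turn contact counts into atom-type pair potentials.

    observed  : (ligand_type, protein_type) contacts seen in known complexes
    reference : contacts expected if there were no specific interaction
    Pairs seen more often than expected get a negative (favorable) value.
    """
    obs, ref = Counter(observed), Counter(reference)
    n_obs, n_ref = sum(obs.values()), sum(ref.values())
    potentials = {}
    for pair, count in obs.items():
        if ref[pair] == 0:
            continue  # no reference statistics for this pair type
        ratio = (count / n_obs) / (ref[pair] / n_ref)
        potentials[pair] = -kT * math.log(ratio)
    return potentials

def score(contacts, potentials):
    """Sum the pair potentials over all ligand-protein contacts within the cutoff."""
    return sum(potentials.get(pair, 0.0) for pair in contacts)

# Toy data (invented): N...O contacts are over-represented, C...C under-represented.
observed = [("N", "O")] * 6 + [("C", "C")] * 2 + [("C", "O")] * 2
reference = [("N", "O")] * 2 + [("C", "C")] * 4 + [("C", "O")] * 4
potentials = pair_potentials(observed, reference)
pose_score = score([("N", "O"), ("N", "O"), ("C", "C")], potentials)
```

A lower `pose_score` means the pose contains more of the contacts that are statistically preferred in the toy data.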
Energy based:
One of the most widely used software tools for structure-based de novo ligand design is the energy-based GRID program. GRID can be used to create a 3D grid over the binding site of the design target. Various probe groups can then be placed at the vertices of the grid, and the interaction energy between the probe and specific chemical groups on the protein target can be calculated using empirical energy functions. The grids produced for each probe, with an energy value assigned at each vertex, can then be used to identify the most favorable locations in which to position particular functional groups for ligand design. These functional groups can then be connected together to form a molecular scaffold.
The scaffold can serve as a template for the creation of real compounds through the attachment of additional chemical substituents that have the suitable sizes and chemical properties to fill the extra empty space in and around the active site.
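The idea of placing a probe at every grid vertex and recording its interaction energy can be sketched as below. This is not the energy function used by the GRID program; it assumes a simple Lennard-Jones plus Coulomb form with invented parameters (`epsilon`, `r_min`, `dielectric`) and a single made-up protein atom, purely to illustrate how the energy map is built and searched.

```python
import numpy as np

def probe_energy_grid(atom_xyz, atom_charge, probe_charge, origin, spacing, shape,
                      epsilon=0.1, r_min=3.5, dielectric=4.0):
    """Energy (kcal/mol) of a single-point probe at each vertex of a 3D grid.

    Assumed functional form: 12-6 Lennard-Jones + Coulomb with a constant
    dielectric. All parameter values are illustrative only.
    """
    axes = [origin[i] + spacing * np.arange(shape[i]) for i in range(3)]
    gx, gy, gz = np.meshgrid(*axes, indexing="ij")
    points = np.stack([gx, gy, gz], axis=-1)            # (nx, ny, nz, 3)
    diff = points[..., None, :] - atom_xyz              # (nx, ny, nz, n_atoms, 3)
    r = np.maximum(np.linalg.norm(diff, axis=-1), 0.5)  # clamp to avoid r = 0
    lj = epsilon * ((r_min / r) ** 12 - 2.0 * (r_min / r) ** 6)
    coulomb = 332.06 * probe_charge * atom_charge / (dielectric * r)
    return points, (lj + coulomb).sum(axis=-1)          # sum over protein atoms

# Toy target: one partially negative atom at the origin, probed by a +1 charge.
atom_xyz = np.array([[0.0, 0.0, 0.0]])
atom_charge = np.array([-0.5])
points, energy = probe_energy_grid(atom_xyz, atom_charge, probe_charge=1.0,
                                   origin=(-5.0, -5.0, -5.0), spacing=1.0,
                                   shape=(11, 11, 11))

# Most favorable vertex for this probe.
best_index = np.unravel_index(np.argmin(energy), energy.shape)
best_point = points[best_index]
best_distance = float(np.linalg.norm(best_point - atom_xyz[0]))
```

Repeating the calculation with different probes gives one map per functional group, and the minima of each map suggest where that group could be placed.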
REQUIREMENTS
Structure-based drug design works when the structure of the target is known. Ligand-based drug design is used when the structure of the target is not known but there are known ligands with known potency.
Answer b)
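HANSCH ANALYSIS: Hansch proposed that drug action can be divided into two stages: 1) transport and 2) binding. Each of these stages depends on the physical and chemical properties of the drug. The general equation is
$\log(1/C) = k_1 P - k_2 P^2 + k_3\sigma + k_4 E_s + k_5$
where C is the concentration required for a defined biological response, P is the hydrophobicity term, $\sigma$ is the electronic (Hammett) substituent constant and $E_s$ is the Taft steric parameter.
To interpret the equation, look at the size and sign of each component. For example, a positive $k_1$ combined with a negative squared term means that activity rises with hydrophobicity up to an optimum and then falls.
A correlation coefficient r well below 0.9 indicates that the equation is not reliable. Accuracy depends on using enough analogues, on the accuracy of the data, and on the choice of parameters.
Applications: the equation is used to predict the activity of an as yet unsynthesized analogue.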
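As a short worked step, the optimum hydrophobicity implied by the parabolic terms of the Hansch equation can be derived by differentiating with respect to P. The derivation assumes $k_1 > 0$ and $k_2 > 0$ and holds the other descriptors fixed; the symbol $P_{\text{opt}}$ is introduced here for illustration.

```latex
\begin{aligned}
\log(1/C) &= k_1 P - k_2 P^2 + k_3\sigma + k_4 E_s + k_5 \\
\frac{\partial \log(1/C)}{\partial P} &= k_1 - 2 k_2 P = 0
  \quad\Longrightarrow\quad P_{\text{opt}} = \frac{k_1}{2 k_2} \\
\frac{\partial^2 \log(1/C)}{\partial P^2} &= -2 k_2 < 0 \quad (k_2 > 0)
\end{aligned}
```

The negative second derivative confirms that $P_{\text{opt}}$ is a maximum: analogues that are either less or more hydrophobic than this value are predicted to be less active.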
Determination coefficient (R²):
The multiple linear regression method calculates QSAR equations by performing standard multivariable regression calculations using multiple variables in a single equation. When you use multiple linear regression, you assume that the variables are independent (not correlated). Also, to minimize the possibility of chance correlations, the number of independent variables initially considered should not be more than one-fifth of the number of compounds in the training set (a warning message appears if this limit is exceeded). When the number of independent variables is greater than the number of observations (rows), multiple linear regression cannot be applied.
To select the multiple linear regression method
Select LINEAR from the Statistical Method popup on the Statistical Method Preferences panel or from the Method popup at the top of the study table.
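The multiple linear regression fit and the cross-validation check asked about in the question can be sketched as follows. This is a minimal illustration using NumPy least squares, not the procedure of the software mentioned above. The descriptor values and activities are synthetic, the helper names are hypothetical, and leave-one-out is assumed as the cross-validation scheme (other schemes, such as leaving out groups of compounds, are also used).

```python
import numpy as np

def fit_mlr(X, y):
    """Least-squares fit of y = X @ k + k0; returns [k_1, ..., k_m, k0]."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(coef, X):
    return np.column_stack([X, np.ones(len(X))]) @ coef

def r_squared(y, y_hat):
    """Determination coefficient of the fitted model."""
    return 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

def loo_q_squared(X, y):
    """Leave-one-out cross-validated q^2 = 1 - PRESS / SS_total."""
    n = len(y)
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i               # drop compound i
        coef = fit_mlr(X[keep], y[keep])       # refit without it
        press += (y[i] - predict(coef, X[i:i + 1])[0]) ** 2
    return 1.0 - press / np.sum((y - y.mean()) ** 2)

# Synthetic training set: 20 compounds, 3 descriptors (within the one-fifth rule).
rng = np.random.default_rng(0)
n_compounds = 20
X = np.column_stack([
    rng.uniform(-1.0, 2.0, n_compounds),   # hydrophobic descriptor (invented values)
    rng.uniform(-0.5, 0.8, n_compounds),   # electronic descriptor (invented values)
    rng.uniform(-2.0, 0.0, n_compounds),   # steric descriptor (invented values)
])
true_coef = np.array([0.9, 0.6, -0.4, 3.0])
y = predict(true_coef, X) + rng.normal(0.0, 0.05, n_compounds)  # log(1/C) + noise

coef = fit_mlr(X, y)
r2 = r_squared(y, predict(coef, X))
q2 = loo_q_squared(X, y)
```

Here `r2` measures how well the equation reproduces the training data, while `q2` measures how well it predicts compounds that were left out of the fit. A `q2` much lower than `r2` suggests the model is over-fitted or rests on a chance correlation.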