In: Biology
Describe (i) how the process of molecular evolution generates protein
families, and (ii) how the amino acid sequence alignments based on sequence similarities for such
members of a protein family can be used to identify conserved regions of a protein and construct a
phylogenetic tree of such members of a protein family. (iii) Will amino acids at the active site of an
enzyme tend to be conserved in a protein family? Why?
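(i) Molecular evolution is the process of change in the sequence composition of cellular molecules such as DNA, RNA, and proteins across generations. Protein evolution describes the changes over time in protein shape, function, and composition, and is studied by comparing the sequences and structures of proteins from many organisms representing distinct evolutionary clades. Protein families arise when an ancestral gene is copied, either by gene duplication within a genome or by speciation, and the copies then accumulate independent mutations. The resulting proteins diverge in sequence, and sometimes in function, but retain recognizable similarity because of their common origin. In this way the process of molecular evolution generates protein families.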
(ii) The evolutionary rate at a particular amino acid residue, which reflects the intensity of selection on that site, may vary during evolution. This means that during sequence evolution an amino acid residue can change from highly conserved to highly variable, and vice versa. For example, after gene duplication a site may become more conserved in one gene copy as a consequence of a newly acquired function.
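If the sequences or structures of two proteins are similar, indicating that the proteins diverged from a common origin, they are called homologous proteins. Homologs that diverged through speciation and are found in different species are called orthologs, whereas homologs that arose by gene duplication, typically within the genome of a single species, are called paralogs. When the amino acid sequences of family members are aligned, columns in which the same or chemically similar residues occur in all or most members mark the conserved regions of the protein. Phylogenetic trees of proteins can be constructed by comparing sequence identities among the aligned proteins: pairs with higher identity are inferred to have diverged more recently and are placed closer together in the tree. Site-specific rate changes are therefore related to protein family evolution.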
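The tree-building idea described above can be illustrated with a minimal Python sketch. It assumes a small, already aligned set of sequences. The names seq_A to seq_D and the sequences themselves are hypothetical toy data, not real proteins. The sketch computes the fraction of differing positions for each pair and then groups the sequences by average-linkage (UPGMA) clustering, one of the simplest distance-based methods. Real analyses usually rely on dedicated alignment and phylogenetics software and on more robust methods such as neighbor joining or maximum likelihood.

```python
from itertools import combinations

# Hypothetical, pre-aligned toy sequences of equal length (not real proteins).
alignment = {
    "seq_A": "MKTAHSGLVE",
    "seq_B": "MKTAHSGLIE",
    "seq_C": "MRTGHSALID",
    "seq_D": "MRSGHSAVLD",
}


def p_distance(a, b):
    """Fraction of aligned, ungapped positions at which two sequences differ."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    return sum(x != y for x, y in pairs) / len(pairs)


def upgma(seqs):
    """Cluster sequences by average linkage; return a nested, Newick-like string."""
    # Each cluster label maps to the tuple of original sequence names it contains.
    clusters = {name: (name,) for name in seqs}
    # Pairwise distances between the original sequences.
    dist = {
        frozenset((a, b)): p_distance(seqs[a], seqs[b])
        for a, b in combinations(seqs, 2)
    }

    def average_distance(c1, c2):
        members1, members2 = clusters[c1], clusters[c2]
        total = sum(dist[frozenset((m, n))] for m in members1 for n in members2)
        return total / (len(members1) * len(members2))

    while len(clusters) > 1:
        # Merge the two clusters with the smallest average distance.
        a, b = min(
            combinations(sorted(clusters), 2),
            key=lambda pair: average_distance(*pair),
        )
        merged_label = f"({a},{b})"
        clusters[merged_label] = clusters.pop(a) + clusters.pop(b)
    return next(iter(clusters))


print(upgma(alignment))  # prints ((seq_A,seq_B),(seq_C,seq_D))
```

In this toy data seq_A and seq_B differ at only one of ten positions, so they are joined first, which mirrors the reasoning that higher identity implies more recent divergence. UPGMA assumes a roughly constant rate of change in all lineages, which is often not true for real protein families.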
(iii) Yes. The amino acids at the active site of an enzyme tend to be conserved in a protein family because they take part directly in substrate binding and catalysis, so most substitutions at these positions impair or abolish function and are removed by natural selection. Conserved sequences are similar or identical sequences in nucleic acids (DNA and RNA) or proteins across species (orthologous sequences) or within a genome (paralogous sequences). Conservation indicates that a sequence has been maintained by natural selection, which acts to conserve and optimise protein sequences essential to life. The conservation of a sequence may then be inferred by detecting highly similar homologs over a broad phylogenetic range. Conserved sequences are also helpful when searching for more distantly related sequences. Homology search tools may take an individual nucleic acid or protein sequence as input, or use statistical models generated from multiple sequence alignments of known related sequences.
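As a rough illustration of how conserved positions can be read off an alignment, the following Python sketch scores each alignment column by its Shannon entropy and reports the columns in which every sequence has the same residue. The alignment is hypothetical toy data, and the function names and the max_entropy parameter are assumptions made for this example. A fully conserved column is only a candidate for a functionally important site such as an active-site residue; confirming that requires structural or experimental evidence.

```python
from collections import Counter
from math import log2

# Hypothetical, pre-aligned toy sequences of equal length (not real proteins).
alignment = [
    "MKTAHSGLVE",
    "MKTAHSGLIE",
    "MRTGHSALID",
    "MRSGHSAVLD",
]


def column_entropy(column):
    """Shannon entropy (in bits) of the residues in one alignment column.

    0.0 means every sequence has the same residue at this position.
    Gap characters, if present, are counted like any other symbol.
    """
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * log2(n / total) for n in counts.values())


def conserved_columns(seqs, max_entropy=0.0):
    """Return 0-based indices of columns whose entropy is at most max_entropy."""
    return [
        i
        for i, column in enumerate(zip(*seqs))
        if column_entropy(column) <= max_entropy
    ]


print(conserved_columns(alignment))  # prints [0, 4, 5]
```

Raising max_entropy relaxes the criterion so that columns with a few substitutions also count as conserved. A more careful score would additionally treat chemically similar residues as partial matches.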