ATGGTACTATCTCCTGAGTGCTGCAAGTTGTAACGGGCACCGCTGAGCCTGTTTCCCTTTGGAGCACTTC
TTATCTAGAAGCAGTGTTTAGTTTCTTCCAAACTGGGCCACTTCGTCCACCTACTCTGTTCTGAGTAAGG
AAACAGCCTCCAAGCATCAGCAGAGCCCAGATGAGCACGGGCCGCGGAGCCGCTTAGCAGTCTCCCGGGA
CCCAGCTCCGGAGGAGCCGCAAGCATGCACCCTGGGTTGTGGCTGCTCCTGGTTACGTTGTGCCTGACCG
AGGAACTGGCAGCAGCGGGAGAGAAGTCTTATGGAAAGCCATGTGGGGGCCAGGACTGCAGTGGGAGCTG
TCAGTGTTTTCCTGAGAAAGGAGCGAGAGGACGACCTGGACCAATTGGAATTCAAGGCCCAACAGGTCCT
CAAGGATTCACTGGCTCTACTGGTTTATCGGGATTGAAAGGAGAAAGGGGTTTCCCAGGCCTTCTGGGAC
CTTATGGACCAAAAGGAGATAAGGGTCCCATGGGAGTTCCTGGCTTTCTTGGCATCAATGGGATTCCGGG
CCACCCTGGACAACCAGGCCCCAGAGGCCCACCTGGTCTGGATGGCTGTAATGGAACTCAAGGAGCTGTT
GGATTTCCAGGCCCTGATGGCTATCCTGGGCTTCTCGGACCACCCGGGCTTCCTGGTCAGAAAGGATCAA
AAGGTGACCCTGTCCTTGCTCCAGGTAGTTTCAAAGGAATGAAGGGGGATCCTGGGCTGCCTGGACTGGA
TGGAATCACTGGCCCACAAGGAGCACCCGGATTTCCTGGAGCTGTAGGACCTGCAGGACCACCAGGATTA
CAAGGTCCTCCAGGGCCTCCTGGTCCTCTTGGTCCTGATGGGAATATGGGGCTAGGTTTTCAAGGAGAGA
AAGGAGTCAAGGGGGATGTTGGCCTCCCTGGCCCAGCAGGACCTCCACCATCTACTGGAGAGCTGGAATT
CATGGGATTCCCCAAAGGGAAGAAAGGATCCAAGGGTGAACCAGGGCCTAAGGGTTTTCCAGGCATAAGT
GGCCCTCCAGGCTTCCCGGGCCTTGGAACTACTGGAGAAAAGGGAGAAAAGGGAGAAAAGGGAATCCCTG
GTTTGCCAGGACCTAGGGGTCCCATGGGTTCAGAAGGAGTCCAAGGCCCTCCAGGGCAACAGGGCAAGAA
AGGGACCCTGGGATTTCCTGGGCTTAATGGATTCCAAGGAATTGAGGGTCAAAAGGGTGACATTGGCCTG
CCAGGCCCAGATGTTTTCATCGATATAGATGGTGCTGTGATCTCAGGTAATCCTGGAGATCCTGGTGTAC
CTGGCCTCCCAGGCCTTAAAGGAGATGAAGGCATCCAAGGCCTACGTGGCCCTTCTGGTGTCCCTGGATT
GCCAGCATTATCAGGTGTCCCAGGAGCCCTAGGGCCTCAGGGATTTCCAGGGCTGAAGGGGGACCAAGGA
AACCCAGGCCGTACCACAATTGGAGCAGCTGGCCTCCCTGGCAGAGATGGTTTGCCAGGCCCACCAGGTC
CACCAGGCCCACCTAGTCCAGAATTTGAGACTGAAACTCTACACAACAAAGAGTCAGGGTTCCCTGGTCT
CCGAGGAGAACAAGGTCCAAAAGGAAACCTAGGCCTCAAAGGAATAAAAGGAGACTCAGGTTTCTGTGCT
TGTGACGGTGGTGTTCCCAACACTGGACCACCCGGGGAACCAGGCCCACCTGGTCCATGGGGTCTCATAG
GCCTTCCAGGCCTTAAAGGAGCCAGAGGAGATCGAGGCTCTGGGGGTGCACAGGGCCCAGCAGGGGCTCC
AGGCTTAGTTGGGCCTCTGGGTCCTTCAGGACCCAAAGGAAAGAAGGGGGAACCAATTCTCAGTACAATC
CAAGGAATGCCAGGAGATCGGGGTGATTCTGGCTCCCAGGGCTTCCGTGGTGTAATAGGAGAACCAGGCA
AGGACGGAGTACCAGGTTTACCAGGTCTGCCAGGCCTTCCGGGTGATGGTGGACAGGGCTTCCCAGGTGA
AAAGGGGTTACCTGGACTTCCTGGTGAAAAAGGCCATCCTGGTCCACCTGGCCTCCCAGGAAATGGGTTA
CCAGGACTTCCTGGACCCCGTGGGCTTCCTGGAGATAAAGGCAAGGATGGATTACCGGGACAACAAGGCC
TTCCCGGATCTAAGGGAATCACCCTGCCCTGTATTATTCCTGGGTCATACGGTCCATCAGGATTTCCAGG
CACTCCCGGATTCCCAGGCCCTAAAGGGTCTCGAGGCCTCCCTGGGACCCCAGGCCAGCCTGGGTCAAGT
GGAAGTAAAGGAGAGCCAGGGAGTCCAGGATTGGTTCATCTTCCTGAATTACCAGGATTTCCTGGACCTC
GTGGGGAGAAGGGCTTGCCTGGGTTTCCTGGGCTCCCTGGAAAAGATGGCTTGCCTGGGATGATTGGCAG
TCCAGGCTTACCTGGTTCCAAGGGAGCCACTGGTGACATCTTTGGTGCTGAAAATGGTGCTCCGGGGGAA
CAAGGCCTACAAGGATTAACAGGGCACAAAGGATTTCTTGGAGACTCTGGCCTTCCAGGACTCAAGGGTG
TGCACGGGAAGCCTGGCTTACTAGGCCCCAAAGGTGAGCGGGGCAGCCCTGGGACACCAGGACAGGTGGG
ACAGCCAGGCACCCCAGGATCTAGTGGTCCATATGGCATCAAGGGCAAATCTGGGCTCCCAGGAGCACCA
GGCTTCCCAGGCATCTCAGGACATCCTGGAAAGAAAGGAACAAGAGGCAAGAAAGGTCCTCCTGGATCAA
TTGTAAAGAAAGGGCTGCCAGGGCTAAAAGGCCTTCCTGGAAATCCAGGCCTAGTAGGACTGAAAGGAAG
CCCAGGCTCTCCAGGGGTCGCTGGGTTGCCAGCCCTCTCTGGACCCAAGGGAGAGAAGGGGTCTGTTGGA
TTCGTAGGTTTTCCAGGAATACCAGGTCTGCCTGGTATTTCTGGAACAAGAGGATTAAAAGGAATTCCAG
GATCAACTGGAAAAATGGGACCATCTGGACGCGCTGGTACTCCTGGTGAAAAGGGAGACAGAGGCAATCC
GGGGCCAGTCGGAATACCTAGTCCAAGACGTCCAATGTCAAACCTTTGGCTCAAAGGAGACAAAGGCTCT
CAAGGCTCAGCCGGATCCAATGGATTTCCTGGGCCAAGAGGTGACAAAGGAGAGGCTGGTCGACCTGGAC
CACCAGGCCTACCTGGAGCTCCTGGCCTCCCAGGCATTATCAAAGGAGTTAGTGGAAAGCCAGGGCCCCC
TGGCTTCATGGGAATCCGGGGTTTACCTGGCCTGAAGGGGTCCTCTGGGATCACAGGTTTCCCAGGAATG
CCAGGAGAAAGTGGTTCACAAGGTATCAGAGGGTCGCCTGGACTCCCAGGAGCATCTGGTCTCCCAGGCC
TGAAAGGAGACAACGGCCAGACAGTTGAAATTTCCGGTAGCCCAGGACCCAAGGGACAGCCTGGCGAATC
TGGTTTTAAAGGCACAAAAGGAAGAGATGGACTAATAGGCAATATAGGCTTCCCTGGAAACAAAGGTGAA
GATGGAAAAGTTGGTGTTTCTGGAGATGTTGGCCTTCCTGGAGCTCCAGGATTTCCAGGAGTTGCCGGCA
TGAGAGGAGAACCAGGACTTCCAGGTTCTTCTGGTCACCAAGGGGCAATTGGGCCTCTAGGATCCCCCGG
ATTAATAGGACCCAAAGGCTTCCCTGGATTTCCTGGTTTACATGGACTGAATGGGCTTCCGGGCACCAAG
GGTACCCATGGCACTCCAGGACCTAGTATCACCGGTGTGCCTGGGCCTGCTGGTCTCCCTGGACCCAAAG
GAGAAAAAGGATATCCAGGAATTGGCATCGGAGCTCCAGGGAAGCCGGGCCTGAGAGGGCAAAAAGGTGA
TCGAGGTTTCCCAGGTCTCCAGGGCCCTGCTGGTCTCCCCGGTGCCCCAGGCATCTCCTTGCCCTCACTC
ATAGCAGGACAGCCTGGTGACCCCGGGCGACCAGGCCTAGATGGAGAACGAGGCCGCCCAGGCCCCGCTG
GACCCCCAGGTCCCCCTGGGCCATCCTCGAATCAAGGCGACACCGGAGACCCTGGCTTCCCTGGAATTCC
AGGTTTTTCTGGCCTCCCTGGAGAGCTAGGACTGAAAGGCATGAGAGGTGAGCCTGGCTTCATGGGGACT
CCAGGCAAGGTTGGGCCACCTGGAGACCCAGGATTTCCCGGAATGAAGGGGAAGGCAGGGGCAAGAGGCT
CTTCTGGCCTCCAAGGTGATCCTGGACAAACACCAACTGCAGAAGCTGTCCAGGTTCCTCCTGGACCCTT
GGGTCTACCAGGGATCGATGGCATCCCTGGCCTCACTGGGGACCCTGGGGCTCAAGGCCCTGTAGGCCTA
CAAGGCTCCAAAGGTTTACCTGGCATCCCCGGTAAAGATGGCCCCAGTGGGCTCCCAGGCCCACCTGGGG
CTCTTGGTGATCCTGGTCTGCCTGGACTGCAAGGCCCTCCAGGATTTGAAGGAGCTCCAGGGCAGCAAGG
CCCCTTCGGGATGCCTGGAATGCCTGGCCAGAGCATGAGAGTGGGCTACACGTTGGTAAAGCACAGCCAG
TCGGAACAGGTGCCCCCGTGTCCCATCGGGATGAGCCAGCTGTGGGTGGGGTACAGCTTACTGTTTGTGG
AGGGGCAAGAGAAAGCCCACAACCAGGACCTGGGCTTTGCTGGCTCCTGTCTGCCCCGCTTCAGCACCAT
GCCCTTCATCTACTGCAACATCAACGAGGTGTGCCACTATGCCAGGCGCAATGATAAATCTTACTGGCTC
TCCACTACCGCCCCTATCCCCATGATGCCCGTCAGCCAGACCCAGATTCCCCAGTACATCAGCCGCTGCT
CTGTGTGTGAGGCACCCTCGCAAGCCATTGCTGTGCACAGCCAGGACATCACCATCCCGCAGTGCCCCCT
GGGCTGGCGCAGCCTCTGGATTGGGTACTCTTTCCTCATGCACACTGCCGCTGGTGCCGAGGGTGGAGGC
CAGTCCCTGGTCTCACCTGGCTCCTGCCTAGAGGACTTTCGGGCCACTCCTTTCATCGAATGCAGTGGTG
CCCGAGGCACCTGCCACTACTTTGCAAACAAGTACAGTTTCTGGTTGACCACAGTGGAGGAGAGGCAGCA
GTTTGGGGAGTTGCCTGTGTCTGAAACGCTGAAAGCTGGGCAGCTCCACACTCGAGTCAGTCGCTGCCAG
GTGTGTATGAAAAGCCTGTAGGGTGGCACCTGCCACTCTGCCCCTTGCCCTCCCCTGCCCCTCACAACAG
TCACCTCACAAACCTGAATGGTCTGAAGAAGGAAGGCCTGAGCCCCTTTGCCTGTCAAGTTGTACATTGG
AGTCTCATTTGGGCTAGACTACCGGACACTCGTCACCCCAGCCCTCGGGTCCATAGAGATGAGCCCACCC
TGCTGAGATCTGCTGTCCTGTTTCTGTCAAGCTGGTGCTACTGTTTGATTTGGATGATTGTGTGACTATT
CATGGCTACCTCAGAAAGATTTGATGGGCCACAACTGTCTTAGACTGCTAGCTTTCTCCTTACCGTCTTG
ATCGGAAAGCTCTTCCGAATCGCTAATCAGTCATTTCTTCATGTACAGAGGTCAGCACACATTATTTGGC
TTAAACCAGAACCCAGTGTTTCCACACTTAAATTCTCTAACCGAATATTCATGGATGGCTCAAGTCTGCA
CAGAGCAAGTCCTCACTCTTCAAGGAGGCCCACTGTGTCTAGGCAGGCAAGAGAATTGAAATGAGGTGCC
ACCCAGTAGCCCAGAGTGAGCTTTAGCTCTAGAATGAGCAAGACTGGGCCCCACATGGCTTAGAGAGGCT
TGAAGGCCAGCAGCTGGGTTGGGGGTGGTGGTCATTAATGGCATATGGTCCTAGACAAACCATCTCCTCC
TTGCCGGCTCCCCCTCCAGCCAGAGACAGAGGATGTGGCCTGGTTCAAAGTAAAGCAGAGGATGCAACAA
ATGTGGCCAAGCTATCAAAGGAAATGAGAATGACAGCCTTTTTTCCTGGGCCAGAAGTAGAGGGGGTGGG
TGCGTAGGATGTGTGAGTTTTGCTTTTGACTCCAGGAACAAAAAGGTAAATCCCACATCCCAGTTTCTCA
GAAGTCCCTGTTTATTCCAATTGCCATCAGATGTGTGCAATGTGGCAAACTGAAGCTGCACAGTGTTGGT
TTCCTTGTATTCTGAGGATGTTAAAGACTTTGTTAAATGGTTATCCAATTGCTCTTTCACAGGTAGCCTA
TTAAACTATTTTAATATGTTTTTTTAAACCTCATAAAAATCTAGCACACTCTTCTCTTGAGCAGTTAGCA
GACCACCG
What are the different types of BLAST? What are their functions?
What is our protein?
Why did we pick this and not the others?
What web tool can we use to find the function of this protein?
How many disulfide bonds are there?
What program will you use to find the information below?
What is the protein's accession number?
What does PDB stand for?
What is the PDB number for collagen?
What web tool can you use for macromolecular visualization?
What method was used to create this molecular model?
What are the ligands?
How many water molecules are there?
Post a snip of the parts of collagen that are polar.
Name one application in which bioinformatics is useful:
The Basic Local Alignment Search Tool (BLAST) is an algorithm and program for comparing primary biological sequence information, such as the nucleotides of DNA or RNA sequences or the amino acids of proteins. The sequence you submit is called the query, and it is compared against the sequences stored in a database. BLAST offers several search programs:
blastn: It searches a nucleotide database with a nucleotide query, i.e. when you enter a DNA/nucleotide sequence as your query, it returns the matching reference sequences from the database. For highly similar nucleotide sequences there is a faster variant of blastn called megablast.
blastp: It searches a protein database with a protein query.
blastx: It searches a protein database with your nucleotide query after translating the query in all six reading frames.
tblastn: It searches a translated nucleotide database with your protein query. It does the opposite of blastx.
tblastx: It searches a translated nucleotide database with your translated nucleotide query.
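As a rough illustration of what "translated" means in the list above, the following Python sketch translates a nucleotide sequence in all six reading frames using the standard genetic code. This is conceptually what blastx, tblastn and tblastx do before comparing sequences at the protein level. The names `BLAST_PROGRAMS`, `translate`, `reverse_complement` and `six_frame_translations` are made up for this example and are not part of BLAST itself.

```python
# Query type and database type for each BLAST program (summary of the list above).
BLAST_PROGRAMS = {
    "blastn": ("nucleotide", "nucleotide"),
    "blastp": ("protein", "protein"),
    "blastx": ("translated nucleotide", "protein"),
    "tblastn": ("protein", "translated nucleotide"),
    "tblastx": ("translated nucleotide", "translated nucleotide"),
}

# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate DNA codon by codon; '*' is a stop, 'X' an unknown codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3))


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def six_frame_translations(dna):
    """Translate the three forward and three reverse reading frames."""
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


# First 18 bases of the sequence in the question, read in frame +1.
print(translate("ATGGTACTATCTCCTGAG"))  # MVLSPE
```

Translating a stretch from the middle of the sequence, for example `translate("GGACCAATTGGAATTCAAGGCCCAACAGGTCCT")`, gives `GPIGIQGPTGP`, with glycine at every third position. That pattern is typical of collagen, although whether this is the biologically correct reading frame is an assumption here.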
Our protein here is collagen. If you paste the DNA sequence above into BLAST, the results show sequence similarity to collagen IV from several species.
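Before pasting the sequence into BLAST, it can help to remove the line breaks and wrap it in FASTA format. The sketch below shows one possible way to do that in Python. The function names and the header text `query_sequence` are placeholders chosen for this example.

```python
def clean_dna(raw):
    """Remove whitespace and line breaks, upper-case, and check the alphabet."""
    seq = "".join(raw.split()).upper()
    unexpected = set(seq) - set("ACGTN")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return seq


def to_fasta(name, seq, width=70):
    """Format a sequence as a FASTA record with fixed-width lines."""
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"


# Only the first two lines of the sequence are used here, for brevity.
raw = """ATGGTACTATCTCCTGAGTGCTGCAAGTTGTAACGGGCACCGCTGAGCCTGTTTCCCTTTGGAGCACTTC
TTATCTAGAAGCAGTGTTTAGTTTCTTCCAAACTGGGCCACTTCGTCCACCTACTCTGTTCTGAGTAAGG"""
print(to_fasta("query_sequence", clean_dna(raw)))
```

The resulting text can be pasted into the query box of the BLAST web page, with blastn or blastx selected depending on whether you want nucleotide or protein hits.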
I think collagen was chosen here because it is a common protein that is very well represented in the databases.
UniProt is a web tool that provides the biological function of proteins, drawn from the research literature. UniProt shows that collagen has 5 disulfide bonds.
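The disulfide count itself should be taken from the UniProt annotation, but a quick sanity check is possible: each disulfide bond joins two cysteines, so a protein cannot have more disulfide bonds than half its number of cysteine residues. The Python sketch below computes that upper bound. The function name and the short test sequence are invented for illustration.

```python
def max_disulfide_bonds(protein):
    """Upper bound on disulfide bonds: each bond uses two cysteines (C)."""
    return protein.upper().count("C") // 2


# Made-up peptide with three cysteines, so at most one disulfide bond.
print(max_disulfide_bonds("MCAACGGC"))  # 1
```

This is only a ceiling. Which cysteines are actually paired has to come from experimental or annotated data such as the UniProt entry.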
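COL1A1 is the identifier given here for the protein. Strictly speaking, COL1A1 is a gene name (collagen type I alpha 1 chain); the accession number is the alphanumeric code shown at the top of the protein's database entry.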
The Protein Data Bank (PDB) is a database of three-dimensional structural data for proteins and nucleic acids. The data are submitted by biologists and biochemists from around the world and are typically obtained by X-ray crystallography, NMR spectroscopy or, increasingly, cryo-electron microscopy.
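Which of these methods was used for a given entry is recorded in the EXPDTA record of its PDB-format file. The Python sketch below reads that record. The sample lines are a hypothetical stand-in and were not copied from any real entry; in practice you would pass in the lines of the downloaded file.

```python
def experimental_methods(pdb_lines):
    """Return the techniques listed in the EXPDTA record(s) of a PDB-format file."""
    # The technique text occupies columns 11-79; several techniques are
    # separated by semicolons and may continue over more than one line.
    text = " ".join(
        line[10:79].strip() for line in pdb_lines if line.startswith("EXPDTA")
    )
    return [method.strip() for method in text.split(";") if method.strip()]


# Hypothetical sample lines, padded so the text starts in column 11.
sample = [
    "HEADER".ljust(10) + "HYPOTHETICAL EXAMPLE",
    "EXPDTA".ljust(10) + "X-RAY DIFFRACTION",
]
print(experimental_methods(sample))  # ['X-RAY DIFFRACTION']
```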
1BKV is the PDB number of collagen.
There are many macromolecular visualization tools, such as RasMol, Chime and MOLMOL. You can use these tools to examine the structure, chemical bonds and polarity of collagen.
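To know which residues to highlight as polar in a viewer, one option is to list them from the protein sequence first. The Python sketch below does this under an assumed classification: S, T, N, Q, Y and C are counted as polar uncharged, and D, E, K, R and H as charged. Textbooks differ on a few of these, so treat the set as an assumption. The names `POLAR` and `polar_residues` are invented for this example.

```python
# Assumed set of polar (including charged) amino acids, one-letter codes.
POLAR = set("STNQYCDEKRH")


def polar_residues(protein):
    """Return (position, residue) pairs for polar residues, numbered from 1."""
    return [
        (pos, aa)
        for pos, aa in enumerate(protein.upper(), start=1)
        if aa in POLAR
    ]


# Short made-up collagen-like peptide.
print(polar_residues("GPSGPK"))  # [(3, 'S'), (6, 'K')]
```

The returned positions can then be selected in the visualization tool and coloured or captured in a screenshot.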
A ligand is a small molecule that binds to a site on a protein through weak interactions such as ionic bonds, van der Waals interactions, hydrogen bonds and hydrophobic effects.
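In a PDB-format file, ligands and water molecules are stored as HETATM records, with water under the residue name HOH. The Python sketch below counts the water molecules and lists the other hetero groups. The sample records, including the glycerol group GOL, are hypothetical and do not come from 1BKV, and `hetatm_line` is a helper written only to build the example.

```python
def hetatm_line(serial, atom, res_name, chain, res_seq):
    """Build a minimal HETATM line in the fixed PDB column layout (example helper)."""
    return f"HETATM{serial:>5} {atom:<4} {res_name:>3} {chain}{res_seq:>4}"


def summarize_hetero_groups(pdb_lines):
    """Return (number of water molecules, sorted names of other hetero groups)."""
    waters = set()
    ligands = set()
    for line in pdb_lines:
        if not line.startswith("HETATM"):
            continue
        res_name = line[17:20].strip()                   # columns 18-20
        residue_id = (line[21], line[22:26].strip())     # chain ID, residue number
        if res_name == "HOH":
            waters.add(residue_id)  # one entry per water molecule
        else:
            ligands.add(res_name)
    return len(waters), sorted(ligands)


# Hypothetical records: one glycerol (two atoms shown) and two waters.
sample = [
    hetatm_line(301, "C1", "GOL", "A", 201),
    hetatm_line(302, "O1", "GOL", "A", 201),
    hetatm_line(303, "O", "HOH", "A", 301),
    hetatm_line(304, "O", "HOH", "B", 301),
]
print(summarize_hetero_groups(sample))  # (2, ['GOL'])
```

Modified amino acids inside the chain may also be stored as HETATM records, so the list can include residues that are not ligands in the binding sense.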
Drug design and molecular docking are two applications of bioinformatics.