Download the Perl code of "Example 1: Conceptual Translation" from "Resources for Undergraduate Bioinformatics Instruction" at Wright State University http://birg.cs.wright.edu/resources/index.shtml. The program performs the conceptual translation on three forward reading frames from mRNA nucleotides to amino acids. The mRNA sequence is provided as the command line argument. Modify it to add the following features:
a. (5pts) If no argument is provided in the command line, prompt
the user to input a nucleotide sequence. Note that the modified
code should still take the command line argument if present. (done)
b. (5pts) Accept lower case letters. (done)
c. (5pts) Accept a DNA sequence as the input. That is, thymine
(T or t), should also be valid in the input.
d. (10pts) Perform the conceptual translation on the three backward
reading frames in addition to on the forward reading frames.
Note that:
You are not allowed to modify the hash table provided in the code of Example 1.
The code is required to utilize the binding operator =~ whenever possible
instead of loops.
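As a rough illustration of the last note, the sketch below shows one way `=~` can stand in for explicit loops: `tr///` bound with `=~` covers the T-to-U substitution of part c, and a global match in list context splits a reading frame into codons. The helper names `normalize` and `codons` are hypothetical and are not part of Example 1.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: clean up the input and turn DNA letters into mRNA letters.
sub normalize {
    my ($seq) = @_;
    $seq =~ s/\s+//g;    # drop whitespace such as a trailing newline
    $seq = uc $seq;      # part b: accept lower case letters
    $seq =~ tr/T/U/;     # part c: treat thymine as uracil
    return $seq;
}

# Hypothetical helper: split one reading frame into 3-letter codons.
# $skip is the number of leading letters to skip (0, 1 or 2).
sub codons {
    my ( $seq, $skip ) = @_;
    return () if $skip > length $seq;
    my $frame = substr( $seq, $skip );
    # A global match in list context returns every complete codon;
    # a trailing fragment of one or two letters is ignored.
    return $frame =~ /(.{3})/g;
}

my $seq = normalize("atgTTtgga");
print "$seq\n";
print join( " ", codons( $seq, 0 ) ), "\n";
print join( " ", codons( $seq, 1 ) ), "\n";
```

For the sample input this prints `AUGUUUGGA`, then `AUG UUU GGA`, then `UGU UUG`. Because T is rewritten to U before any lookup, the hash table from Example 1 can stay untouched.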
#!/usr/local/bin/perl
# Example program 1. Perform conceptual translation from nucleotides
# to amino acids. Do this for three reading frames, skipping the
# first 0-2 nucleotides to produce the reading frames.
# Define some constants that we'll need later.
$minlength = 3;
$readingframes = 3;
$unknown = "UNK"; # If some nucleotides are unknown, print this.
# Define a hash to do matching/printing. This allows us to say things
# like: $nucleohash{ "UUU" } and receive "Phe".
%nucleohash = ( "UUU", "Phe", "UUC", "Phe", "UUA", "Leu", "UUG", "Leu",
"UCU", "Ser", "UCC", "Ser", "UCA", "Ser", "UCG", "Ser",
"UAU", "Tyr", "UAC", "Tyr", "UAA", "STP", "UAG", "STP",
"UGU", "Cys", "UGC", "Cys", "UGA", "STP", "UGG", "Trp",
"CUU", "Leu", "CUC", "Leu", "CUA", "Leu", "CUG", "Leu",
"CCU", "Pro", "CCC", "Pro", "CCA", "Pro", "CCG", "Pro",
"CAU", "His", "CAC", "His", "CAA", "Gln", "CAG", "Gln",
"CGU", "Arg", "CGC", "Arg", "CGA", "Arg", "CGG", "Arg",
"AUU", "Ile", "AUC", "Ile", "AUA", "Ile", "AUG", "Met",
"ACU", "Thr", "ACC", "Thr", "ACA", "Thr", "ACG", "Thr",
"AAU", "Asn", "AAC", "Asn", "AAA", "Lys", "AAG", "Lys",
"AGU", "Ser", "AGC", "Ser", "AGA", "Arg", "AGG", "Arg",
"GUU", "Val", "GUC", "Val", "GUA", "Val", "GUG", "Val",
"GCU", "Ala", "GCC", "Ala", "GCA", "Ala", "GCG", "Ala",
"GAU", "Asp", "GAC", "Asp", "GAA", "Glu", "GAG", "Glu",
"GGU", "Gly", "GGC", "Gly", "GGA", "Gly", "GGG", "Gly" );
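# Retrieve and check the command line parameter. If it is missing or
# too short, prompt the user for a sequence instead.
$input = defined( $ARGV[0] ) ? $ARGV[0] : "";
if ( length( $input ) < $minlength )
{
printf("Please enter the nucleotide string: ");
# Read from STDIN explicitly; a bare <> would treat a short argument as a file name.
$input = <STDIN>;
chomp( $input );
# printf( "$0: Place the nucleotide string on the command line.\n\n" );
# exit( 1 );
} # if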
$input = uc($input);
printf( "Nucleotide sequence: $input\n\n" );
# Run through all 3 possible reading frames, skipping the first letter
# or two for frames 1 and 2.
for ( $i = 0; $i < $readingframes; $i++ )
{
printf( "Reading frame $i: " );
# Find out how many 3-letter sequences remain, after skipping 0-2
# for the reading frame, and loop through all of these sequences.
$len = int( length( substr( $input, $i ) ) / 3 );
# printf( "length $len: \n" );
# The reversed sequence has the same length as the input, so a backward
# frame that skips $i letters holds the same number of codons.
$len2 = $len;
#printf( "length $len2: \n" );
for ( $j = 0; $j < $len; $j++ )
{
# Take the current 3-letter sequence, look up the corresponding
# amino acid. If it isn't in the hash table, it is unknown.
$nuc = substr( $input, $i + $j * 3, 3 );
print("nuc value: $nuc\n");
if ( defined( $nucleohash{ $nuc } ) )
{
#printf("defined");
$aa = $nucleohash{ $nuc };
} # if
else
{
#printf("undefined\n");
$aa = $unknown;
} # else
printf( "$aa " );
} # for j
for ( $j = 0; $j < $len2; $j++ )
{
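# Take the current 3-letter sequence from the reversed input (assuming
# "backward" means reading the sequence from its last letter) and look
# up the corresponding amino acid. If it isn't in the hash table, it is unknown.
$nuc = substr( scalar reverse( $input ), $i + $j * 3, 3 );
print("nuc value2: $nuc\n");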
if ( defined( $nucleohash{ $nuc } ) )
{
# printf("defined");
$aa = $nucleohash{ $nuc };
} # if
else
{
# printf("undefined\n");
$aa = $unknown;
} # else
printf( "$aa " );
} # for j
printf( "\n" );
} # for i
printf( "\n" );
Output:
Note: I did not get part c working, so c is not implemented in the code above. I'm sure you will be able to do it. Thanks!
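For parts c and d, one possible approach is sketched below. It is only a sketch under stated assumptions: the helper names `translate_frame` and `backward` are hypothetical, the sample sequence is made up, and "backward" is taken to mean the input read from its last letter. If the course intends the opposite strand instead, passing a true second argument to `backward` also complements the bases with `tr`. The hash table is copied from Example 1 unchanged.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Codon table reproduced unchanged from Example 1.
my %nucleohash = ( "UUU", "Phe", "UUC", "Phe", "UUA", "Leu", "UUG", "Leu",
"UCU", "Ser", "UCC", "Ser", "UCA", "Ser", "UCG", "Ser",
"UAU", "Tyr", "UAC", "Tyr", "UAA", "STP", "UAG", "STP",
"UGU", "Cys", "UGC", "Cys", "UGA", "STP", "UGG", "Trp",
"CUU", "Leu", "CUC", "Leu", "CUA", "Leu", "CUG", "Leu",
"CCU", "Pro", "CCC", "Pro", "CCA", "Pro", "CCG", "Pro",
"CAU", "His", "CAC", "His", "CAA", "Gln", "CAG", "Gln",
"CGU", "Arg", "CGC", "Arg", "CGA", "Arg", "CGG", "Arg",
"AUU", "Ile", "AUC", "Ile", "AUA", "Ile", "AUG", "Met",
"ACU", "Thr", "ACC", "Thr", "ACA", "Thr", "ACG", "Thr",
"AAU", "Asn", "AAC", "Asn", "AAA", "Lys", "AAG", "Lys",
"AGU", "Ser", "AGC", "Ser", "AGA", "Arg", "AGG", "Arg",
"GUU", "Val", "GUC", "Val", "GUA", "Val", "GUG", "Val",
"GCU", "Ala", "GCC", "Ala", "GCA", "Ala", "GCG", "Ala",
"GAU", "Asp", "GAC", "Asp", "GAA", "Glu", "GAG", "Glu",
"GGU", "Gly", "GGC", "Gly", "GGA", "Gly", "GGG", "Gly" );

# Hypothetical helper: translate one reading frame of an upper-case
# mRNA string, skipping the first $skip letters (0, 1 or 2).
sub translate_frame {
    my ( $seq, $skip ) = @_;
    return "" if $skip > length $seq;
    my $frame = substr( $seq, $skip );
    # The global match yields every complete codon; unknown codons map to UNK.
    my @aa = map { exists $nucleohash{$_} ? $nucleohash{$_} : "UNK" }
        $frame =~ /(.{3})/g;
    return join( " ", @aa );
}

# Hypothetical helper: build the sequence used for the backward frames.
# Assumption: plain reversal by default; a true $complement flag also
# swaps A<->U and C<->G to give the reverse complement.
sub backward {
    my ( $seq, $complement ) = @_;
    my $rev = scalar reverse $seq;
    $rev =~ tr/ACGU/UGCA/ if $complement;
    return $rev;
}

# Made-up sample input mixing lower case and a DNA letter (t).
my $seq = uc "augGCCtga";
$seq =~ tr/T/U/;    # part c: accept thymine by treating it as uracil
my $rev = backward( $seq, 0 );

for my $i ( 0 .. 2 ) {
    printf( "Forward frame %d: %s\n", $i, translate_frame( $seq, $i ) );
}
for my $i ( 0 .. 2 ) {
    printf( "Backward frame %d: %s\n", $i, translate_frame( $rev, $i ) );
}
```

With the sample input, forward frame 0 comes out as `Met Ala STP` and backward frame 0 (plain reversal) as `Ser Pro Val`. Keeping the reversal in its own helper means the choice between plain reversal and reverse complement is a one-argument change.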