Question

You shall implement a class named DnaSequence, which models a sequence of DNA. Requirements:

You must implement all public constructors and methods described in this documentation.

Your class must have only 1 field, and it must be a private char[].

No print statements are allowed anywhere in your class.


The constructors require you to discard any character that is not 'A', 'C', 'G', or 'T'. This is very easily accomplished using a regular expression in String.replaceAll:

// returns a String containing "ACG"
"A B C D E F G".replaceAll("[^ATCG]", "")
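As a minimal sketch of that filtering step, the helper below applies the same regular expression to both a String and a char[]. The class name SanitizeSketch and the method name sanitize are hypothetical and not part of the assignment. Note that the character class is case-sensitive, so lowercase letters are discarded too.

```java
final class SanitizeSketch {
    // Hypothetical helper: keeps only the uppercase characters A, C, G and T.
    static char[] sanitize(String raw) {
        return raw.replaceAll("[^ACGT]", "").toCharArray();
    }

    // Same idea for a char[]; new String(char[]) copies the characters,
    // so the caller's array is never modified.
    static char[] sanitize(char[] raw) {
        return sanitize(new String(raw));
    }

    public static void main(String[] args) {
        System.out.println(new String(sanitize("A B C D E F G")));   // ACG
        System.out.println(new String(sanitize("gattaca GATTACA"))); // GATTACA
    }
}
```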

Here is a screenshot of the top of a partially redacted version of my DnaSequence.java file:

Testing

You'll probably want to create another class with a main method that tests your DnaSequence class, e.g.:


import java.util.Arrays;

public class DnaSequenceTests {
    public static void main(String[] args) {
        DnaSequence seq = new DnaSequence(new char[] { 'G', 'A', 'T', 'T', 'A', 'C', 'A' });
        assert "GATTACA".equals(seq.toString());
        assert seq.baseAt(2) == 'T';
        assert seq.gcContent() == 2.0 / seq.length();
        seq.complement(0);
        assert seq.equals(new DnaSequence("CATTACA"));
        assert Arrays.equals(seq.nucleotideCounts(), new int[] { 3, 2, 0, 2 });
    }
}

Class DnaSequence

java.lang.Object

DnaSequence

public class DnaSequence
extends Object

Represents a simple DNA sequence.

Author:

A hardworking student for CS 12J

  • Field Summary

    Fields
    Modifier and Type Field Description
    private char[] dna

    The actual sequence of nucleotides is stored in this array, invisible to the outside world, only directly accessible within this class.

  • Constructor Summary

    Constructors
    Constructor Description
    DnaSequence(char[] dna)

    Constructs a new DNA sequence from the contents of a char[].

    DnaSequence(String dna)

    Constructs a new DNA sequence from the contents of a String.

  • Method Summary

    Modifier and Type Method Description
    char baseAt(int index)

    Returns the nucleotide base char value at the specified index.

    void complement(int index)

    Mutator method that flips one base of this sequence to its complement.

    boolean equals(DnaSequence that)

    Compares this sequence to another.

    double gcContent()

    Calculates and returns the GC-content of this DNA sequence.

    int hammingDistance(DnaSequence that)

    Calculates and returns the Hamming distance between this DNA sequence and another.

    int length()

    Returns the length of this DNA sequence.

    boolean[] mutationPoints(DnaSequence that)

    Calculates and returns where two DNA sequences of equal lengths differ.

    int[] nucleotideCounts()

    Calculates and returns the number of times each type of nucleotide occurs in this DNA sequence.

    DnaSequence reverseComplement()

    Calculates and returns the reverse complement of this DNA sequence as a new DnaSequence.

    String toString()

    Returns a string representation of this sequence (for example, "ATCCGTGGACT").

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
  • Field Details

    • dna

      private char[] dna

      The actual sequence of nucleotides is stored in this array, invisible to the outside world, only directly accessible within this class.

  • Constructor Details

    • DnaSequence

      public DnaSequence(String dna)

      Constructs a new DNA sequence from the contents of a String. Discards any invalid characters in the string, i.e. any character that is not 'A', 'T', 'C', or 'G'.

      Parameters:

      dna - a string containing characters that represent a DNA sequence

    • DnaSequence

      public DnaSequence(char[] dna)

      Constructs a new DNA sequence from the contents of a char[]. Discards any invalid characters in the array, i.e. any character that is not 'A', 'T', 'C', or 'G'. Neither modifies nor retains a reference to the parameter array.

      Parameters:

      dna - an array containing char values representing a DNA sequence
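The "does not modify nor retain a reference" requirement can be illustrated with a small sketch. The class CopySketch below is a hypothetical stand-in for DnaSequence, reduced to one constructor and toString; it assumes that routing the input through a String is an acceptable way to obtain an independent copy.

```java
final class CopySketch {
    private final char[] dna;

    CopySketch(char[] input) {
        // new String(input) copies the characters, and toCharArray() allocates
        // a fresh array, so this.dna never aliases the caller's array.
        this.dna = new String(input).replaceAll("[^ACGT]", "").toCharArray();
    }

    @Override
    public String toString() {
        return new String(dna);
    }

    public static void main(String[] args) {
        char[] input = { 'G', 'A', '?', 'T' };
        CopySketch seq = new CopySketch(input);
        input[0] = 'C';                         // caller mutates its own array afterwards
        System.out.println(seq);                // GAT
        System.out.println(new String(input));  // CA?T
    }
}
```

Had the constructor simply assigned the parameter to the field, the later write to input[0] would have changed the stored sequence as well.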

  • Method Details

    • length

      public int length()

      Returns the length of this DNA sequence. The length is equal to the number of nucleotides in the sequence.

      Returns:

      the length of this DNA sequence

    • toString

      public String toString()

      Returns a string representation of this sequence (for example, "ATCCGTGGACT").

      Overrides:

      toString in class Object

      Returns:

      a String containing all nucleotides in this sequence, in order

    • equals

      public boolean equals(DnaSequence that)

      Compares this sequence to another. The result is true if and only if this sequence represents the same sequence of nucleotides as the other.

      Parameters:

      that - the other sequence

      Returns:

      whether this sequence represents the same sequence of nucleotides as the other
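A content comparison of this kind can be sketched with java.util.Arrays.equals. The class EqualsSketch is hypothetical. One point worth noting: because the documented parameter type is DnaSequence rather than Object, this method overloads Object.equals instead of overriding it, which the second call in main demonstrates.

```java
import java.util.Arrays;

final class EqualsSketch {
    private final char[] dna;

    EqualsSketch(String dna) {
        this.dna = dna.toCharArray();
    }

    // Overload matching the documented signature; compares element by element.
    public boolean equals(EqualsSketch that) {
        return that != null && Arrays.equals(this.dna, that.dna);
    }

    public static void main(String[] args) {
        EqualsSketch a = new EqualsSketch("CATTACA");
        EqualsSketch b = new EqualsSketch("CATTACA");
        System.out.println(a.equals(b));          // true: the overload compares contents
        System.out.println(a.equals((Object) b)); // false: Object.equals compares references
    }
}
```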

    • baseAt

      public char baseAt(int index)

      Returns the nucleotide base char value at the specified index. An index ranges from 0 to this.length() - 1. The first base value of the sequence is at index 0, the next at index 1, and so on, as in array indexing.

      Parameters:

      index - the index of the base value

      Returns:

      the base value at the specified index of this sequence. The first base value is at index 0.

    • nucleotideCounts

      public int[] nucleotideCounts()

      Calculates and returns the number of times each type of nucleotide occurs in this DNA sequence.

      Returns:

      an int array of length 4, where indices 0, 1, 2 and 3 contain the number of 'A', 'C', 'G' and 'T' characters (respectively) in this sequence
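One possible way to fill the four-element result is a single pass with a switch, sketched below as a static helper on a plain char[]. CountSketch is a hypothetical name, and the helper assumes nothing about the input beyond what the spec states.

```java
final class CountSketch {
    // Index 0 counts 'A', 1 counts 'C', 2 counts 'G', 3 counts 'T'.
    static int[] nucleotideCounts(char[] dna) {
        int[] counts = new int[4];
        for (char base : dna) {
            switch (base) {
                case 'A': counts[0]++; break;
                case 'C': counts[1]++; break;
                case 'G': counts[2]++; break;
                case 'T': counts[3]++; break;
                default: break; // anything else is ignored
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] counts = nucleotideCounts("CATTACA".toCharArray());
        System.out.println(java.util.Arrays.toString(counts)); // [3, 2, 0, 2]
    }
}
```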

    • reverseComplement

      public DnaSequence reverseComplement()

      Calculates and returns the reverse complement of this DNA sequence as a new DnaSequence. In DNA sequences, 'A' and 'T' are complements of each other, as are 'C' and 'G'. The reverse complement is formed by reversing the symbols of a sequence, then taking the complement of each symbol (e.g., the reverse complement of "GTCA" is "TGAC").

      Returns:

      a new DnaSequence representing the reverse complement of this sequence
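The reverse and the complement can be computed in one pass by reading the input from the back. The sketch below uses a lookup through two aligned strings, which is just one option; it assumes the array holds only 'A', 'C', 'G' and 'T', as the constructors guarantee. ReverseComplementSketch and complementOf are hypothetical names.

```java
final class ReverseComplementSketch {
    // "ACGT" and "TGCA" are aligned so that position i of one is the
    // complement of position i of the other. Assumes a valid base.
    static char complementOf(char base) {
        return "TGCA".charAt("ACGT".indexOf(base));
    }

    static char[] reverseComplement(char[] dna) {
        char[] result = new char[dna.length];
        for (int i = 0; i < dna.length; i++) {
            // Read from the end of the input while writing from the start.
            result[i] = complementOf(dna[dna.length - 1 - i]);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(new String(reverseComplement("GTCA".toCharArray()))); // TGAC
    }
}
```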

    • gcContent

      public double gcContent()

      Calculates and returns the GC-content of this DNA sequence. The GC-content of a DNA sequence is the fraction of symbols in the sequence that are 'C' or 'G', expressed as a value between 0 and 1. For example, the GC-content of "AGCTATAG" is 0.375 (37.5%).

      Returns:

      the GC-content of this sequence, to double precision
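The worked example can be checked with a short sketch. The main pitfall is integer division, so the count is cast to double before dividing. GcContentSketch is a hypothetical name, and the behaviour for an empty sequence (this sketch yields NaN) is an assumption, since the documentation does not define it.

```java
final class GcContentSketch {
    static double gcContent(char[] dna) {
        int gc = 0;
        for (char base : dna) {
            if (base == 'G' || base == 'C') {
                gc++;
            }
        }
        // Cast before dividing: int / int would truncate toward zero.
        return (double) gc / dna.length;
    }

    public static void main(String[] args) {
        System.out.println(gcContent("AGCTATAG".toCharArray())); // 0.375
        System.out.println(3 / 8);                               // 0 (integer division)
    }
}
```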

    • hammingDistance

      public int hammingDistance(DnaSequence that)

      Calculates and returns the Hamming distance between this DNA sequence and another. The Hamming distance between two sequences is the number of points in the sequences where the corresponding symbols differ. For example, the Hamming distance between "ATTATGC" and "ATGATCC" is 2.

      Parameters:

      that - the other sequence

      Returns:

      the Hamming distance between this sequence and the other, or -1 if the two sequences are of unequal length
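A minimal sketch of the computation, written as a static helper over two char arrays (HammingSketch is a hypothetical name), reproduces the example from the description and the -1 result for unequal lengths.

```java
final class HammingSketch {
    static int hammingDistance(char[] a, char[] b) {
        if (a.length != b.length) {
            return -1; // documented result for sequences of unequal length
        }
        int distance = 0;
        for (int i = 0; i < a.length; i++) {
            if (a[i] != b[i]) {
                distance++;
            }
        }
        return distance;
    }

    public static void main(String[] args) {
        System.out.println(hammingDistance("ATTATGC".toCharArray(), "ATGATCC".toCharArray())); // 2
        System.out.println(hammingDistance("ATT".toCharArray(), "AT".toCharArray()));          // -1
    }
}
```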

    • mutationPoints

      public boolean[] mutationPoints(DnaSequence that)

      Calculates and returns where two DNA sequences of equal lengths differ. For example, given sequences "ATGT" and "GTGA", the result should be the array [true, false, false, true].

      Parameters:

      that - the other sequence

      Returns:

      an array of boolean values, of length equivalent to both sequences' lengths, containing true in each index where the two sequences differ, and false where they do not differ. If the two sequences are of unequal length, this method returns null.
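The same position-by-position comparison can store its result per index instead of counting. The sketch below (MutationSketch is a hypothetical name) assigns the comparison directly, which avoids an if/else that only sets true or false.

```java
import java.util.Arrays;

final class MutationSketch {
    static boolean[] mutationPoints(char[] a, char[] b) {
        if (a.length != b.length) {
            return null; // documented result for sequences of unequal length
        }
        boolean[] differs = new boolean[a.length];
        for (int i = 0; i < a.length; i++) {
            differs[i] = a[i] != b[i];
        }
        return differs;
    }

    public static void main(String[] args) {
        boolean[] points = mutationPoints("ATGT".toCharArray(), "GTGA".toCharArray());
        System.out.println(Arrays.toString(points)); // [true, false, false, true]
    }
}
```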

    • complement

      public void complement(int index)

      Mutator method that flips one base of this sequence to its complement. In DNA sequences, 'A' and 'T' are complements of each other, as are 'C' and 'G'. For example, if feline is a reference to a DnaSequence object representing the sequence "GATCAT", then subsequent to invocation feline.complement(0), the represented sequence shall be "CATCAT".

      Parameters:

      index - the index at which to perform the complement
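The feline example can be traced with a small sketch that flips one element of a char[] in place. ComplementSketch, complementOf and complementAt are hypothetical names; the real method would operate on the private dna field instead of taking an array parameter.

```java
final class ComplementSketch {
    static char complementOf(char base) {
        switch (base) {
            case 'A': return 'T';
            case 'T': return 'A';
            case 'C': return 'G';
            case 'G': return 'C';
            default:  return base; // not expected for a sanitized sequence
        }
    }

    // Mutates the array: only the element at the given index changes.
    static void complementAt(char[] dna, int index) {
        dna[index] = complementOf(dna[index]);
    }

    public static void main(String[] args) {
        char[] feline = "GATCAT".toCharArray();
        complementAt(feline, 0);
        System.out.println(new String(feline)); // CATCAT
    }
}
```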

Solutions

Expert Solution

The code for the above class is as follows:

import java.util.Arrays; 

class DnaSequence {
        private char[] dna;

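        // Going through a String discards invalid characters and yields a fresh
        // array, so no reference to the caller's array is retained.
        public DnaSequence(char[] dna) {
                this.dna = new String(dna).replaceAll("[^ATCG]", "").toCharArray();
        }

        // Keeps only the characters 'A', 'T', 'C' and 'G'.
        public DnaSequence(String dna) {
                this.dna = dna.replaceAll("[^ATCG]", "").toCharArray();
        }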

        //getters 
        public char baseAt(int index) {
                return dna[index];
        }

        public void complement(int index) {
                char c = baseAt(index);
                switch(c) {
                        case 'A':
                                dna[index] = 'T';
                                break;
                        case 'T':
                                dna[index] = 'A';
                                break;
                        case 'C':
                                dna[index] = 'G';
                                break;
                        case 'G':
                                dna[index] = 'C';
                                break;
                }
        } 


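        // Helper for reverseComplement(): returns the complement of array[index]
        // without modifying the array. Private because it is not part of the documented API.
        private static char complement(int index, char[] array) {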
                char c = array[index];
                char result;
                switch(c) {
                        case 'A':
                                result = 'T';
                                break;
                        case 'T':
                                result = 'A';
                                break;
                        case 'C':
                                result = 'G';
                                break;
                        case 'G':
                                result = 'C';
                                break;
                        default:
                                result = array[index];
                                break;
                }

                return result;
        }


        // Private so the backing array is never exposed outside this class.
        private char[] getDna() {
                return this.dna;
        }

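        public boolean equals(DnaSequence that) {
                // Compare the array contents; calling this.equals(that) here would recurse forever.
                return that != null && Arrays.equals(this.dna, that.dna);
        }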

        public double gcContent() {
                int count = 0;
                for(char c: dna ) {
                        if(c == 'C' || c == 'G') {
                                count++;
                        }
                }
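                // Floating-point division: the result is a fraction between 0 and 1 (e.g. 0.375), not a truncated percentage.
                return (double) count / dna.length;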
        }

        public int hammingDistance(DnaSequence that) {
                int length = 0;
                int hammingDistance  = 0;
                if(dna.length != that.length()) {
                        return -1;
                }else {
                        length = dna.length;
                }

                char[] anotherDna = that.getDna();

                for (int i = 0; i < length; i++) {
                        if (dna[i] != anotherDna[i]) {
                                hammingDistance++;
                        }
                }

                return hammingDistance;
        }

        public int length() {
                return dna.length;
        }

        public boolean[] mutationPoints(DnaSequence that) {
                char[] anotherDna = that.getDna();
                boolean[] result;
                if(dna.length != anotherDna.length) {
                        return null;
                } else {
                        result = new boolean[dna.length];
                }

                for(int i = 0; i< dna.length; i++) {
                        if(dna[i] == anotherDna[i]) {
                                result[i] = false;
                        } else {
                                result[i] = true;
                        }
                }
                return result;
        }

        public int[] nucleotideCounts() {
                int countA = 0;
                int countC = 0;
                int countG = 0;
                int countT = 0;
                int[] result = new int[4];
                for(int i =0; i< dna.length; i++ ) {
                        if(dna[i] == 'A') countA++;
                        else if(dna[i] == 'C') countC++;
                        else if(dna[i] ==  'G') countG++;
                        else if(dna[i] == 'T') countT++;
                }
                result[0] = countA;
                result[1] = countC;
                result[2] = countG;
                result[3] = countT;

                return result;
        }


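        // Helper for reverseComplement(): returns a reversed copy of the sequence.
        private char[] reverse(){
                char result[] = new char[dna.length];
                int count = 0;

                // Walk backwards from the last index down to 0.
                for(int i = dna.length-1; i >= 0; i--) {
                        result[count++] = dna[i];
                }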

                return result;
        }


        public DnaSequence reverseComplement() {

                char[] reverse = reverse();

                for(int i = 0; i < reverse.length; i++) {
                        reverse[i] = complement(i, reverse);
                }

                DnaSequence newDna = new DnaSequence(reverse);
                return newDna;
        }

        @Override
        public String toString() {
                return new String(dna);
        }

}

public class DnaSequenceTests { 
        public static void main(String[] args) { 

                DnaSequence seq = new DnaSequence(new char[] { 'G', 'A', 'T', 'T', 'A', 'C', 'A' }); 
                assert "GATTACA".equals(seq.toString()); 
                assert seq.baseAt(2) == 'T'; 
                assert seq.gcContent() == 2.0 / seq.length(); 
                seq.complement(0); 
                assert seq.equals(new DnaSequence("CATTACA")); 
                assert Arrays.equals(seq.nucleotideCounts(), new int[] { 3, 2, 0, 2 });

        } 
}

I ran the test cases above and received no errors; the output image is attached. Note that Java skips assert statements unless assertions are enabled, so run the tests with java -ea DnaSequenceTests.

If you have any doubts, you can ask in the comments section. Thank you.

