You shall implement a class named DnaSequence, which models a sequence of DNA. Requirements:
You must implement all public constructors and methods described in this documentation.
Your class must have only 1 field, and it must be a private char[].
No print statements are allowed anywhere in your class.
The constructors require you to discard any character that is not
'A', 'C', 'G', or 'T'. This is very easily accomplished using a
regular expression in String.replaceAll:
// returns a String containing "ACG"
"A B C D E F G".replaceAll("[^ATCG]", "")
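As a minimal sketch of that filtering step, the same regular expression can be wrapped in a small helper. The class name `BaseFilterSketch` and the method `keepBases` are hypothetical and not part of the assignment. Note that the character class is case-sensitive, so lowercase letters are discarded as well.

```java
// Hypothetical helper, for illustration only.
final class BaseFilterSketch {
    // Removes every character that is not an uppercase A, T, C, or G.
    static String keepBases(String raw) {
        return raw.replaceAll("[^ATCG]", "");
    }

    public static void main(String[] args) {
        System.out.println(keepBases("A B C D E F G")); // ACG
        System.out.println(keepBases("gaTTaca"));       // TT
    }
}
```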
Here is a screenshot of the top of a partially redacted version of my DnaSequence.java file:
Testing
You'll probably want to create another class with a main method that tests your DnaSequence class, e.g.:
import java.util.Arrays;

public class DnaSequenceTests {
    public static void main(String[] args) {
        DnaSequence seq = new DnaSequence(new char[] { 'G', 'A', 'T', 'T', 'A', 'C', 'A' });
        assert "GATTACA".equals(seq.toString());
        assert seq.baseAt(2) == 'T';
        assert seq.gcContent() == 2.0 / seq.length();
        seq.complement(0);
        assert seq.equals(new DnaSequence("CATTACA"));
        assert Arrays.equals(seq.nucleotideCounts(), new int[] { 3, 2, 0, 2 });
    }
}
Class DnaSequence
java.lang.Object
DnaSequence
public class DnaSequence extends Object
Represents a simple DNA sequence.
Author:
A hardworking student for CS 12J, [email protected]
Field Summary
Modifier and Type | Field | Description |
---|---|---|
private char[] | dna | The actual sequence of nucleotides is stored in this array, invisible to the outside world, only directly accessible within this class. |
Constructor Summary
Constructor | Description |
---|---|
DnaSequence(char[] dna) | Constructs a new DNA sequence from the contents of a char[]. |
DnaSequence(String dna) | Constructs a new DNA sequence from the contents of a String. |
Method Summary
Modifier and Type | Method | Description |
---|---|---|
char | baseAt(int index) | Returns the nucleotide base char value at the specified index. |
void | complement(int index) | Mutator method that flips one base of this sequence to its complement. |
boolean | equals(DnaSequence that) | Compares this sequence to another. |
double | gcContent() | Calculates and returns the GC-content of this DNA sequence. |
int | hammingDistance(DnaSequence that) | Calculates and returns the Hamming distance between this DNA sequence and another. |
int | length() | Returns the length of this DNA sequence. |
boolean[] | mutationPoints(DnaSequence that) | Calculates and returns where two DNA sequences of equal lengths differ. |
int[] | nucleotideCounts() | Calculates and returns the number of times each type of nucleotide occurs in this DNA sequence. |
DnaSequence | reverseComplement() | Calculates and returns the reverse complement of this DNA sequence as a new DnaSequence. |
String | toString() | Returns a string representation of this sequence (for example, "ATCCGTGGACT"). |
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Field Details
dna
private char[] dna
The actual sequence of nucleotides is stored in this array, invisible to the outside world, only directly accessible within this class.
Constructor Details
DnaSequence
public DnaSequence(String dna)
Constructs a new DNA sequence from the contents of a String. Discards any invalid characters in the string, e.g. any character that is not 'A', 'T', 'C', or 'G'.
Parameters:
dna - a string containing characters that represent a DNA sequence
DnaSequence
public DnaSequence(char[] dna)
Constructs a new DNA sequence from the contents of a char[]. Discards any invalid characters in the array, i.e. any character that is not 'A', 'T', 'C', or 'G'. Neither modifies nor retains a reference to the parameter array.
Parameters:
dna - an array containing char values representing a DNA sequence
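The requirement that the constructor neither modify nor retain the parameter array can be illustrated with a short sketch. It relies on the fact that `new String(char[])` copies the characters and `toCharArray()` allocates a fresh array. The class `CopySketch` and its `text()` accessor are made up for this example and are not part of the assignment.

```java
// Hypothetical stand-in for the char[] constructor, for illustration only.
final class CopySketch {
    private char[] dna;

    CopySketch(char[] source) {
        // new String(char[]) copies the characters, and toCharArray() returns
        // a new array, so this object never aliases the caller's array.
        this.dna = new String(source).replaceAll("[^ATCG]", "").toCharArray();
    }

    String text() {
        return new String(dna);
    }

    public static void main(String[] args) {
        char[] input = { 'G', 'x', 'A', 'T' };
        CopySketch seq = new CopySketch(input);
        input[0] = 'C';                  // mutate the caller's array afterwards
        System.out.println(seq.text()); // GAT
    }
}
```

Because the stored array is a separate copy, the write to `input[0]` after construction does not change the sequence, and the invalid `'x'` is still present in the caller's array.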
Method Details
length
public int length()
Returns the length of this DNA sequence. The length is equal to the number of nucleotides in the sequence.
Returns:
the length of this DNA sequence
toString
public String toString()
Returns a string representation of this sequence (for example, "ATCCGTGGACT").
Overrides:
toString in class Object
Returns:
a String containing all nucleotides in this sequence, in order
equals
public boolean equals(DnaSequence that)
Compares this sequence to another. The result is true if and only if this sequence represents the same sequence of nucleotides as the other.
Parameters:
that - the other sequence
Returns:
whether this sequence represents the same sequence of nucleotides as the other
baseAt
public char baseAt(int index)
Returns the nucleotide base char value at the specified index. An index ranges from 0 to this.length() - 1. The first base value of the sequence is at index 0, the next at index 1, and so on, as in array indexing.
Parameters:
index - the index of the base value
Returns:
the base value at the specified index of this sequence. The first base value is at index 0.
nucleotideCounts
public int[] nucleotideCounts()
Calculates and returns the number of times each type of nucleotide occurs in this DNA sequence.
Returns:
an int array of length 4, where indices 0, 1, 2 and 3 contain the number of 'A', 'C', 'G' and 'T' characters (respectively) in this sequence
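The index order described above (A, C, G, T) can be sketched as follows. `CountSketch` is a hypothetical helper that works on a plain String rather than on the class's private field.

```java
// Hypothetical helper, for illustration only.
final class CountSketch {
    // Returns counts in the order A, C, G, T.
    static int[] nucleotideCounts(String dna) {
        int[] counts = new int[4];
        for (char c : dna.toCharArray()) {
            switch (c) {
                case 'A': counts[0]++; break;
                case 'C': counts[1]++; break;
                case 'G': counts[2]++; break;
                case 'T': counts[3]++; break;
                default: break; // ignore anything that is not a base
            }
        }
        return counts;
    }
}
```

For "CATTACA" this yields {3, 2, 0, 2}, which matches the expected array in the sample test.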
reverseComplement
public DnaSequence reverseComplement()
Calculates and returns the reverse complement of this DNA sequence as a new DnaSequence. In DNA sequences, 'A' and 'T' are complements of each other, as are 'C' and 'G'. The reverse complement is formed by reversing the symbols of a sequence, then taking the complement of each symbol (e.g., the reverse complement of "GTCA" is "TGAC").
Returns:
a new DnaSequence representing the reverse complement of this sequence
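The reverse-then-complement rule above can be done in a single pass by reading the input from the back while writing the output from the front. The sketch below assumes a hypothetical helper class `ReverseComplementSketch` that operates on Strings.

```java
// Hypothetical helper, for illustration only.
final class ReverseComplementSketch {
    // A <-> T and C <-> G; any other character is returned unchanged.
    static char complementOf(char base) {
        switch (base) {
            case 'A': return 'T';
            case 'T': return 'A';
            case 'C': return 'G';
            case 'G': return 'C';
            default:  return base;
        }
    }

    static String reverseComplement(String dna) {
        char[] out = new char[dna.length()];
        for (int i = 0; i < out.length; i++) {
            // Position i of the result comes from position (length - 1 - i) of the input.
            out[i] = complementOf(dna.charAt(dna.length() - 1 - i));
        }
        return new String(out);
    }
}
```

With this sketch, `reverseComplement("GTCA")` gives "TGAC", and applying it twice returns the original string.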
gcContent
public double gcContent()
Calculates and returns the GC-content of this DNA sequence. The GC-content of a DNA sequence is the fraction of symbols in the sequence that are 'C' or 'G', returned as a value between 0 and 1 rather than as a percentage. For example, the GC-content of "AGCTATAG" is .375 (37.5%).
Returns:
the GC-content of this sequence, to double precision
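The arithmetic in the example can be sketched as below. The main pitfall is integer division: in int arithmetic, `3 * 100 / 8` evaluates to 37, so the count has to be converted to double before dividing. `GcContentSketch` is a hypothetical helper name.

```java
// Hypothetical helper, for illustration only.
final class GcContentSketch {
    static double gcContent(String dna) {
        int gc = 0;
        for (char c : dna.toCharArray()) {
            if (c == 'C' || c == 'G') {
                gc++;
            }
        }
        // Cast first so the division happens in floating point.
        return (double) gc / dna.length();
    }
}
```

For "AGCTATAG" there are three 'C' or 'G' symbols out of eight, so the result is 0.375.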
hammingDistance
public int hammingDistance(DnaSequence that)
Calculates and returns the Hamming distance between this DNA sequence and another. The Hamming distance between two sequences is the number of points in the sequences where the corresponding symbols differ. For example, the Hamming distance between "ATTATGC" and "ATGATCC" is 2.
Parameters:
that - the other sequence
Returns:
the Hamming distance between this sequence and the other, or -1 if the two sequences are of unequal length
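A minimal sketch of the Hamming distance as described, including the -1 result for unequal lengths. `HammingSketch` is a hypothetical helper that compares two Strings directly.

```java
// Hypothetical helper, for illustration only.
final class HammingSketch {
    static int hammingDistance(String a, String b) {
        if (a.length() != b.length()) {
            return -1; // sentinel for unequal lengths, as the documentation specifies
        }
        int distance = 0;
        for (int i = 0; i < a.length(); i++) {
            if (a.charAt(i) != b.charAt(i)) {
                distance++;
            }
        }
        return distance;
    }
}
```

"ATTATGC" and "ATGATCC" differ at indices 2 and 5, so the distance is 2.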
mutationPoints
public boolean[] mutationPoints(DnaSequence that)
Calculates and returns where two DNA sequences of equal lengths differ. For example, given sequences "ATGT" and "GTGA", the result should be array [true, false, false, true ].
Parameters:
that - the other sequence
Returns:
an array of boolean values, of length equal to both sequences' lengths, containing true at each index where the two sequences differ, and false where they do not. If the two sequences are of unequal length, this method returns null.
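The mutation-point array can be built with one comparison per index, as in this sketch. `MutationSketch` is a hypothetical helper working on Strings; counting the true entries of its result gives the Hamming distance of the same pair.

```java
// Hypothetical helper, for illustration only.
final class MutationSketch {
    static boolean[] mutationPoints(String a, String b) {
        if (a.length() != b.length()) {
            return null; // unequal lengths, as the documentation specifies
        }
        boolean[] differs = new boolean[a.length()];
        for (int i = 0; i < differs.length; i++) {
            differs[i] = a.charAt(i) != b.charAt(i);
        }
        return differs;
    }
}
```

For "ATGT" and "GTGA" the result is [true, false, false, true].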
complement
public void complement(int index)
Mutator method that flips one base of this sequence to its complement. In DNA sequences, 'A' and 'T' are complements of each other, as are 'C' and 'G'. For example, if feline is a reference to a DnaSequence object representing the sequence "GATCAT", then subsequent to invocation feline.complement(0), the represented sequence shall be "CATCAT".
Parameters:
index - the index at which to perform the complement
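The feline example can be traced with a sketch that mutates a char[] in place. `ComplementSketch` and `complementAt` are hypothetical names; the real method would act on the class's private array instead of taking one as a parameter.

```java
// Hypothetical helper, for illustration only.
final class ComplementSketch {
    // Replaces the base at the given index with its complement, in place.
    static void complementAt(char[] dna, int index) {
        switch (dna[index]) {
            case 'A': dna[index] = 'T'; break;
            case 'T': dna[index] = 'A'; break;
            case 'C': dna[index] = 'G'; break;
            case 'G': dna[index] = 'C'; break;
            default: break; // leave anything else untouched
        }
    }

    public static void main(String[] args) {
        char[] feline = "GATCAT".toCharArray();
        complementAt(feline, 0);
        System.out.println(new String(feline)); // CATCAT
    }
}
```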
The code for the above class is as follows:
// DnaSequence.java
import java.util.Arrays;

public class DnaSequence {
    // The only field: the nucleotides of this sequence.
    private char[] dna;

    // new String(dna) copies the characters, so the caller's array is
    // neither modified nor retained.
    public DnaSequence(char[] dna) {
        this.dna = new String(dna).replaceAll("[^ATCG]", "").toCharArray();
    }

    public DnaSequence(String dna) {
        this.dna = dna.replaceAll("[^ATCG]", "").toCharArray();
    }

    public char baseAt(int index) {
        return dna[index];
    }

    // Flips the base at the given index to its complement, in place.
    public void complement(int index) {
        dna[index] = complementOf(dna[index]);
    }

    // Private helper: returns the complement of a single base.
    private static char complementOf(char c) {
        switch (c) {
            case 'A': return 'T';
            case 'T': return 'A';
            case 'C': return 'G';
            case 'G': return 'C';
            default:  return c;
        }
    }

    public boolean equals(DnaSequence that) {
        // Compare array contents; == on arrays would only compare references.
        return that != null && Arrays.equals(this.dna, that.dna);
    }

    public double gcContent() {
        int count = 0;
        for (char c : dna) {
            if (c == 'C' || c == 'G') {
                count++;
            }
        }
        // Cast before dividing so the result is a fraction (e.g. 0.375)
        // instead of a truncated integer.
        return (double) count / dna.length;
    }

    public int hammingDistance(DnaSequence that) {
        if (dna.length != that.dna.length) {
            return -1;
        }
        int hammingDistance = 0;
        for (int i = 0; i < dna.length; i++) {
            if (dna[i] != that.dna[i]) {
                hammingDistance++;
            }
        }
        return hammingDistance;
    }

    public int length() {
        return dna.length;
    }

    public boolean[] mutationPoints(DnaSequence that) {
        if (dna.length != that.dna.length) {
            return null;
        }
        boolean[] result = new boolean[dna.length];
        for (int i = 0; i < dna.length; i++) {
            result[i] = dna[i] != that.dna[i];
        }
        return result;
    }

    public int[] nucleotideCounts() {
        int[] result = new int[4]; // order: A, C, G, T
        for (char c : dna) {
            if (c == 'A') result[0]++;
            else if (c == 'C') result[1]++;
            else if (c == 'G') result[2]++;
            else if (c == 'T') result[3]++;
        }
        return result;
    }

    public DnaSequence reverseComplement() {
        char[] result = new char[dna.length];
        for (int i = 0; i < dna.length; i++) {
            // Read from the back of this sequence while filling from the front.
            result[i] = complementOf(dna[dna.length - 1 - i]);
        }
        return new DnaSequence(result);
    }

    @Override
    public String toString() {
        return new String(dna);
    }
}

// DnaSequenceTests.java
// Run with "java -ea DnaSequenceTests"; assert statements are skipped
// unless assertions are enabled.
import java.util.Arrays;

public class DnaSequenceTests {
    public static void main(String[] args) {
        DnaSequence seq = new DnaSequence(new char[] { 'G', 'A', 'T', 'T', 'A', 'C', 'A' });
        assert "GATTACA".equals(seq.toString());
        assert seq.baseAt(2) == 'T';
        assert seq.gcContent() == 2.0 / seq.length();
        seq.complement(0);
        assert seq.equals(new DnaSequence("CATTACA"));
        assert Arrays.equals(seq.nucleotideCounts(), new int[] { 3, 2, 0, 2 });
    }
}
To confirm that the test cases run without errors, I am attaching the output image.
If you have any doubts, you can ask in the comments section. Thank you.