8. Write in R code.
a) Simulate a string of 10,000 characters drawn uniformly and independently from the set {A,C,G,T}. [HINT: sample]
b) Create a frequency table of the string. [HINT: table]
c) Write a function to create a contingency table of adjacent k-tuples. [ONLY USE FOR LOOPS AND PASTE(,collapse=""); DO NOT USE EMBED, SUBSTR, or DO.CALL] For example, with k=3 and the string "CAGACAAAAC", you would want to produce the following table:
AAA AAC ACA AGA CAA CAG GAC
  2   1   1   1   1   1   1
Answer:
R code with comments for parts a) to c):
#set the random seed for repeatability
set.seed(123)
#set the alphabet to sample from
x<-c("A","C","G","T")
#set the number of simulations
R<-10000
#part a)
#simulate a string of R characters
s<-sample(x,size=R,replace=TRUE)
#part b)
#create a frequency table
table(s)
#part c)
getTable<-function(y,k){
#number of adjacent k-tuples (0 if y is shorter than k)
n<-max(length(y)-k+1,0)
#preallocate the result instead of growing it with rbind
result<-character(n)
#slide a window of width k along the string, one position at a time
for (i in seq_len(n)){
#get k characters and concatenate them
result[i]<-paste(y[i:(i+k-1)],collapse="")
}
#create the contingency table
out<-table(result)
return(out)
}
#test the function on the example string "CAGACAAAAC"
s1<-c("C","A","G","A","C","A","A","A","A","C")
getTable(s1,3)
#test the function on the string from part a)
getTable(s,3)