2. Make a data frame consisting of 20 rows and 10 columns. Each
column j should consist of 20 values from a normal distribution
with mean (j-1) and standard deviation 0.5j. For example, the third
column should be normal(mean=2, sd=1.5). Using this data frame, do
each of the following (using code, of course):
a. Find the mean and standard deviation for each column.
b. Write code that counts the number of columns for which the
sample mean and sample standard deviation are within 20% of the
values used to generate the data.
c. Write code that writes the columns from part b to a new data
frame.
d. For each value in the new data frame, subtract its column mean
and divide by the column standard deviation.
Solution using R and Python
Solution using R
# Create matrix of order 20 by 10
M=matrix(rep(0,20*10),nrow=20,ncol=10)
for(j in 1:10){
M[,j]=rnorm(20,mean=j-1,sd=j*0.5)
}
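The question asks for a data frame, while the code above builds a matrix. A minimal sketch of the same construction as a data frame follows. The seed, the name df, and the column names col1 to col10 are assumptions made for illustration and are not part of the original solution.

```r
# Assumed seed, used only so the sketch is reproducible
set.seed(1)

n_rows <- 20
n_cols <- 10

# Column j holds 20 draws from normal(mean = j - 1, sd = 0.5 * j)
df <- as.data.frame(
  sapply(1:n_cols, function(j) rnorm(n_rows, mean = j - 1, sd = 0.5 * j))
)

# Hypothetical column names, chosen for readability
names(df) <- paste0("col", 1:n_cols)

dim(df)
```

The later steps work on df as well, because colMeans() and sapply(df, sd) accept data frames.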
a. Find the mean and standard deviation for each column.
#Find the mean of each column and store it in the 21st row
M1=rbind(M,apply(M,2,mean))
#Find the standard deviation of each column and store it in the 22nd row
M1=rbind(M1,apply(M,2,sd))
b. Write code that counts the number of columns for which the sample mean and sample standard deviation are within 20% of the values used to generate the data.
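count=0
for(j in 1:10){
# Values used to generate column j
true_mean=j-1
true_sd=j*0.5
# mean1 is 1 if the sample mean (row 21) is within 20% of the true mean
# (column 1 has true mean 0, so it can never pass this check)
mean1=0
if(M1[21,j]>0.8*true_mean && M1[21,j]<1.2*true_mean){
mean1=1
}
# sd1 is 1 if the sample sd (row 22) is within 20% of the true sd
sd1=0
if(M1[22,j]>0.8*true_sd && M1[22,j]<1.2*true_sd){
sd1=1
}
# Count the column only if both checks pass
count=count+mean1*sd1
}
count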
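The same check can be written without a loop. The sketch below is one possible vectorized version, not the original author's method. It regenerates its own data so that it runs on its own, and the names X, target_mean, target_sd, keep and n_within are assumptions. Because the first column has a target mean of 0, a strict "within 20%" test can never pass for it.

```r
# Assumed seed, used only so the sketch is reproducible
set.seed(1)

# 20 x 10 matrix; column j is normal(mean = j - 1, sd = 0.5 * j)
X <- sapply(1:10, function(j) rnorm(20, mean = j - 1, sd = 0.5 * j))

# Values used to generate each column
target_mean <- (1:10) - 1
target_sd <- 0.5 * (1:10)

# Sample statistics for each column
col_mean <- colMeans(X)
col_sd <- apply(X, 2, function(x) sd(x))

# TRUE where the sample value is within 20% of the generating value
mean_ok <- abs(col_mean - target_mean) < 0.2 * abs(target_mean)
sd_ok <- abs(col_sd - target_sd) < 0.2 * target_sd

# Columns that pass both checks, and how many there are
keep <- mean_ok & sd_ok
n_within <- sum(keep)
n_within
```

The logical vector keep can also be used to select the qualifying columns, for example X[, keep, drop = FALSE].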
c. Write code that writes the columns from part b to a new data frame.
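# Keep only the columns that passed both checks in part b
true_mean=(1:10)-1
true_sd=(1:10)*0.5
keep=abs(colMeans(M)-true_mean)<0.2*true_mean & abs(apply(M,2,function(x) sd(x))-true_sd)<0.2*true_sd
M2=as.data.frame(M[,keep,drop=FALSE])
d. For each value in the new data frame, subtract its column mean and divide by the column standard deviation.
# Loop over the columns actually present in M2 (there may be fewer than 10)
for(j in seq_len(ncol(M2))){
M2[,j]=(M2[,j]-mean(M2[,j]))/sd(M2[,j])
}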
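Base R's scale() does the same standardization in one call: by default it subtracts each column's mean and divides by each column's sample standard deviation. The sketch below uses a small three-column example with assumed names (Z_in, Z, Z_loop) and compares the result with the explicit loop.

```r
# Assumed seed, used only so the sketch is reproducible
set.seed(1)

# Small example: 20 rows, 3 columns generated as in the question
Z_in <- sapply(1:3, function(j) rnorm(20, mean = j - 1, sd = 0.5 * j))

# Standardize every column: (x - mean(x)) / sd(x)
Z <- scale(Z_in)

# Explicit loop, for comparison
Z_loop <- Z_in
for (j in seq_len(ncol(Z_loop))) {
  Z_loop[, j] <- (Z_loop[, j] - mean(Z_loop[, j])) / sd(Z_loop[, j])
}

# Both approaches should agree up to floating-point error
all.equal(as.vector(Z), as.vector(Z_loop))
```

scale() returns a matrix with attributes that record the centers and scales used. Wrap the result in as.data.frame() if a data frame is needed.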