Question

Forty architecture students were each asked to judge 5
different building structures. The response variable of interest
is the judge's overall satisfaction (SAT), where a higher score
is better. We wish to compare the mean satisfaction rating across
the five buildings, so the factor of interest is BLDG.

USE R OR SAS TO SOLVE THE PROBLEMS. PLEASE INCLUDE YOUR CODE TO GET THE ANSWER.

I am not sure why the data would be hard to understand. Just take the data and plug it into R or SAS to get the answers. SUBJ is one of the forty judges. BLDG is the building model they rated. SAT is their rating of that building.

(a) Why does it make sense to use the judge (denoted SUBJ in the
data set) as a blocking variable? Why should we treat this block
as a random effect?

(b) Analyze the data as a RBD, where SAT is the response, BLDG is
the treatment factor, and SUBJ is the block. Based on the appropriate
F-test, is there a significant difference in mean satisfaction rating
across the five buildings? NOTE: Use a 0.10 significance level.

(c) Based on the appropriate F-test, is there significant variation
among the judges? NOTE: Use a 0.10 significance level.

(d) Of particular interest to the investigators is whether the mean
satisfaction for building 1 differs significantly from the mean satisfaction
for the other four buildings. Use an ESTIMATE statement to test the
appropriate contrast here. NOTE: Use a 0.10 significance level.

data buildings;                                                                                                                           
input SUBJ BLDG SAT;                                                                 
cards;                                                                                                                                  
 1 1 2
 1 2 5
 1 3 6
 1 4 5
 1 5 7
 2 1 5
 2 2 6
 2 3 6
 2 4 7
 2 5 4
 3 1 4
 3 2 7
 3 3 3
 3 4 6
 3 5 7
 4 1 6
 4 2 4
 4 3 7
 4 4 5
 4 5 7
 5 1 2
 5 2 6
 5 3 4
 5 4 7
 5 5 5
 6 1 4
 6 2 6
 6 3 7
 6 4 5
 6 5 3
 7 1 7
 7 2 5
 7 3 5
 7 4 7
 7 5 4
 8 1 3
 8 2 7
 8 3 6
 8 4 7
 8 5 6 
 9 1 6
 9 2 7
 9 3 8
 9 4 6
 9 5 3
 10 1 5
 10 2 3
 10 3 3
 10 4 5
 10 5 6
 11 1 3
 11 2 6 
 11 3 4 
 11 4 4 
 11 5 3
 12 1 3
 12 2 6
 12 3 7
 12 4 5
 12 5 3
 13 1 4
 13 2 1
 13 3 7
 13 4 1
 13 5 6
 14 1 4
 14 2 6
 14 3 8
 14 4 5
 14 5 1
 15 1 4
 15 2 4
 15 3 4
 15 4 5
 15 5 5
 16 1 8
 16 2 5
 16 3 9
 16 4 9
 16 5 5
 17 1 5
 17 2 5
 17 3 6
 17 4 7
 17 5 5
 18 1 5
 18 2 4
 18 3 6
 18 4 6
 18 5 6
 19 1 2
 19 2 5
 19 3 6
 19 4 2
 19 5 8
 20 1 2
 20 2 8
 20 3 7
 20 4 8
 20 5 2
 21 1 8
 21 2 8
 21 3 8
 21 4 8
 21 5 3
 22 1 5
 22 2 4
 22 3 4
 22 4 3
 22 5 5
 23 1 6
 23 2 6
 23 3 6
 23 4 6
 23 5 4
 24 1 3
 24 2 5
 24 3 8
 24 4 5
 24 5 6
 25 1 6
 25 2 2
 25 3 5
 25 4 7
 25 5 6
 26 1 2
 26 2 7
 26 3 4
 26 4 7
 26 5 2
 27 1 7
 27 2 7
 27 3 7
 27 4 7
 27 5 7
 28 1 8
 28 2 5
 28 3 5
 28 4 6
 28 5 3
 29 1 2
 29 2 6
 29 3 7
 29 4 4
 29 5 5
 30 1 1
 30 2 5
 30 3 5
 30 4 6
 30 5 6
 31 1 9
 31 2 7
 31 3 8
 31 4 2
 31 5 8
 32 1 6
 32 2 9
 32 3 1
 32 4 8
 32 5 4
 33 1 2
 33 2 6
 33 3 8
 33 4 9
 33 5 8
 34 1 8
 34 2 4
 34 3 3
 34 4 3
 34 5 9
 35 1 2
 35 2 7
 35 3 2
 35 4 9
 35 5 2
 36 1 2
 36 2 9
 36 3 1
 36 4 8
 36 5 3
 37 1 7
 37 2 2
 37 3 3 
 37 4 3
 37 5 6
 38 1 3
 38 2 7
 38 3 3
 38 4 2
 38 5 2
 39 1 3
 39 2 3
 39 3 5
 39 4 3
 39 5 3
 40 1 9
 40 2 5
 40 3 8
 40 4 7
 40 5 8    
;
run; 

Solutions

Expert Solution

I have included all the answers in the R script below:

# After the data is loaded into R
head(buildings)

# Extract the response, and the subject and building as factors
SAT <- buildings[, "SAT"]
SUBJ <- factor(buildings[, "SUBJ"])
BLDG <- factor(buildings[, "BLDG"])

# (a) Why does it make sense to use the judge (denoted SUBJ in the
# data set) as a blocking variable? Why should we treat this block
# as a random effect?

# Let us ignore the blocks and just do a one-way CRD ANOVA
av.CRD <- aov(SAT ~ BLDG)
summary(av.CRD)
#>              Df Sum Sq Mean Sq F value Pr(>F)
#> BLDG          4   33.6   8.392   1.956  0.103
#> Residuals   195  836.7   4.291

# The p-value of the F-statistic for BLDG is 0.103. From here, we might be
# tempted to conclude that there is no significant difference in the SAT
# ratings of the 5 buildings. BUT THIS IS WRONG.
# Why is this wrong?
# Because we forgot to account for the variation among the 40 judges.
# Each judge rated all five buildings, and a judge's inherent bias
# (lenient or harsh) influences all of that judge's ratings. We can
# account for this variation by treating SUBJ as a blocking variable.
# The 40 judges are a sample from a larger population of possible judges,
# and we care about judge-to-judge variability in general rather than about
# these particular individuals, so SUBJ is treated as a random effect.
# Blocking is used to remove the effects of a few of the most important
# nuisance variables. Randomization is then used to reduce the contaminating
# effects of the remaining nuisance variables. For important nuisance
# variables, blocking gives a more sensitive test for the variables of
# interest than randomizing alone.
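The script assumes a data frame named `buildings` already exists, and it fits SUBJ as an ordinary term. As a minimal sketch of how the data might be read in and how SUBJ could instead be declared as its own error stratum in base R, the block below uses only the first three judges' rows from the data step in the question. The `Error(SUBJ)` formulation is a swapped-in alternative, not something the original script uses, and the numbers it prints for this 15-row subset will not match the full-data output.

```r
# Rows for judges 1-3, copied from the data step in the question.
# Paste all 200 rows in the same layout to work with the full data set.
txt <- "SUBJ BLDG SAT
1 1 2
1 2 5
1 3 6
1 4 5
1 5 7
2 1 5
2 2 6
2 3 6
2 4 7
2 5 4
3 1 4
3 2 7
3 3 3
3 4 6
3 5 7"

buildings <- read.table(text = txt, header = TRUE)
buildings$SUBJ <- factor(buildings$SUBJ)
buildings$BLDG <- factor(buildings$BLDG)

# SUBJ entered as an ordinary (fixed) block term, as in the script
fixed_fit <- aov(SAT ~ BLDG + SUBJ, data = buildings)

# SUBJ declared as a random block via its own error stratum
random_fit <- aov(SAT ~ BLDG + Error(SUBJ), data = buildings)

print(summary(fixed_fit))
print(summary(random_fit))
```

In a balanced design like this one, where every judge rates every building once, the F-test for BLDG in the within-judge stratum should agree with the one from the additive fit, so the choice mainly affects how the SUBJ line is interpreted.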
# (b) Analyze the data as a RBD, where SAT is the response, BLDG is
# the treatment factor, and SUBJ is the block. Based on the appropriate
# F-test, is there a significant difference in mean satisfaction rating
# across the five buildings?

# Now perform the analysis of variance
# SAT  = response
# BLDG = treatment factor
# SUBJ = blocking factor
av <- aov(SAT ~ BLDG + SUBJ)
summary(av)
#>              Df Sum Sq Mean Sq F value Pr(>F)
#> BLDG          4   33.6   8.393   2.032 0.0926 .
#> SUBJ         39  192.3   4.931   1.194 0.2236
#> Residuals   156  644.4   4.131
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

# Ans (b)
# The p-value of the F-statistic for BLDG is 0.0926. Since it is less
# than the significance level alpha = 0.10, we conclude that there is a
# significant difference in mean satisfaction rating across the five
# buildings.

# Ans (c)
# The p-value of the F-statistic for SUBJ is 0.2236. Since it is greater
# than the significance level alpha = 0.10, we do NOT find significant
# variation among the judges.
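The two F-tests in (b) and (c) can be checked by hand from the ANOVA table. The following is a small sketch that recomputes each F-statistic as a mean square divided by the residual mean square and looks up its upper-tail probability. The variable names are illustrative, and the inputs are the rounded values printed in the table, so the results should agree with the reported ones only up to rounding.

```r
# Mean squares and degrees of freedom copied from the RBD ANOVA table
ms_bldg <- 8.393; df_bldg <- 4
ms_subj <- 4.931; df_subj <- 39
ms_err  <- 4.131; df_err  <- 156
alpha   <- 0.10

# F = (mean square of the term) / (residual mean square)
f_bldg <- ms_bldg / ms_err
f_subj <- ms_subj / ms_err

# Upper-tail p-values from the F distribution
p_bldg <- pf(f_bldg, df_bldg, df_err, lower.tail = FALSE)
p_subj <- pf(f_subj, df_subj, df_err, lower.tail = FALSE)

# Critical values at the 0.10 level, for comparison with the F-statistics
crit_bldg <- qf(1 - alpha, df_bldg, df_err)
crit_subj <- qf(1 - alpha, df_subj, df_err)

print(data.frame(
  term     = c("BLDG", "SUBJ"),
  F        = c(f_bldg, f_subj),
  critical = c(crit_bldg, crit_subj),
  p_value  = c(p_bldg, p_subj),
  reject   = c(p_bldg < alpha, p_subj < alpha)
))
```

Comparing F with the critical value and comparing the p-value with alpha are the same decision rule, so the `reject` column should match the conclusions stated in Ans (b) and Ans (c).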
# (d) Testing whether the mean satisfaction for building 1
# differs significantly from the mean satisfaction
# for the other four buildings

# Choose the contrast coefficients (they must sum to 0)
contrasts(BLDG) <- matrix(c(1, -1/4, -1/4, -1/4, -1/4), nrow = 5, ncol = 1)

# Fit the linear model and read off the row for the first contrast.
# Its t-test gives the two-sided p-value for the contrast.
summary(lm(SAT ~ BLDG + SUBJ))$coef["BLDG1", ]
#>    Estimate  Std. Error     t value    Pr(>|t|)
#> -0.64500000  0.28743561 -2.24398082  0.02624094

# Note: the Estimate is on the scale of R's contrast coding, so it need not
# equal the raw difference (mean of building 1 minus the average of the
# other four). The t value and p-value do not depend on that scaling.

# Ans (d)
# Since the p-value = 0.0262 is well below the significance level
# alpha = 0.10, we conclude: YES, the mean satisfaction for building 1
# differs significantly from the mean satisfaction for the other four
# buildings. The negative estimate indicates that building 1 was rated
# lower on average.
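To see how the reported coefficient relates to the contrast itself, here is a hedged sketch that rebuilds the test from the printed numbers. It assumes R filled in the remaining contrast columns orthogonally to the supplied one, in which case the coefficient equals the contrast estimate divided by the sum of squared contrast coefficients (1.25 here). The close match between the recomputed and reported standard errors is consistent with that assumption, but the variable names and the scaling step are illustrative rather than taken from the original script.

```r
# Numbers copied from the R output in this answer
mse      <- 4.131         # residual mean square from the RBD ANOVA table
df_resid <- 156           # residual degrees of freedom
n_judges <- 40            # ratings per building (one per judge)
coef_est <- -0.64500000   # reported "BLDG1" coefficient
coef_se  <-  0.28743561   # reported standard error of that coefficient

# Contrast: building 1 versus the average of buildings 2-5
cc <- c(1, -1/4, -1/4, -1/4, -1/4)
stopifnot(abs(sum(cc)) < 1e-12)   # contrast coefficients must sum to zero

# Assumption: coefficient = (contrast estimate) / sum(cc^2)
scale_factor <- sum(cc^2)
contrast_est <- coef_est * scale_factor

# Standard error of a contrast of treatment means in a balanced RBD
contrast_se <- sqrt(mse * sum(cc^2) / n_judges)

# Two-sided t-test on the residual degrees of freedom
t_stat <- contrast_est / contrast_se
p_val  <- 2 * pt(-abs(t_stat), df_resid)

print(c(estimate = contrast_est, se = contrast_se, t = t_stat, p = p_val))

# Rescaled standard error, for comparison with the reported coef_se
print(c(recomputed = contrast_se / scale_factor, reported = coef_se))
```

Under that assumption, the estimated difference between building 1 and the average of the other four is the coefficient multiplied by 1.25, while the t-statistic and p-value stay the same as in the output above.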


