Question

Use the Program R:

The following data gives the weight for 8 corn cobs which were produced using an organic corn fertilizer:

212, 234, 259, 189, 245, 176, 203, 215

(a) For this sample of n = 8 observations, use R to obtain the mean, the median, the interquartile range and the standard deviation.

(b) Among the four statistics, which are measures of central tendency and which are measures of dispersion?

(c) Are there any outliers in this sample? If so, which values are outliers?

(d) With R, produce a quantile-quantile plot for these data. Is it reasonable to assume that the weight is normally distributed? Explain.

(e) Give a point estimate for the mean weight of the population and the standard deviation of the estimate.

Solutions

Expert Solution

Solution (a):


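corn_cobs <- c(212, 234, 259, 189, 245, 176, 203, 215)  # weights of the 8 corn cobs
mean(corn_cobs)      # sample mean
median(corn_cobs)    # sample median
sd(corn_cobs)        # sample standard deviation (n - 1 denominator)
fivenum(corn_cobs)   # minimum, lower hinge, median, upper hinge, maximum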

Output:

> mean(corn_cobs)
[1] 216.625
> median(corn_cobs)
[1] 213.5
> sd(corn_cobs)
[1] 28.09645
> fivenum(corn_cobs)
[1] 176.0 196.0 213.5 239.5 259.0

Mean = 216.625

Median = 213.5

Standard deviation = 28.09645

IQR = Q3 - Q1 = 239.5 - 196.0 = 43.5, taking Q1 and Q3 as the lower and upper hinges reported by fivenum()
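
The value 43.5 is the distance between the hinges from fivenum(). R also has a built-in IQR() function, which by default interpolates the quartiles (quantile type 7) and can therefore give a different number for a small sample. A minimal sketch comparing the two definitions (the variable names iqr_hinges and iqr_default are only illustrative):

```r
corn_cobs <- c(212, 234, 259, 189, 245, 176, 203, 215)

# Hinge-based IQR: upper hinge minus lower hinge from fivenum()
hinges <- fivenum(corn_cobs)
iqr_hinges <- hinges[4] - hinges[2]

# Quantile-based IQR: IQR() uses quantile(type = 7) by default
iqr_default <- IQR(corn_cobs)

print(iqr_hinges)   # 43.5
print(iqr_default)  # 37.25
```

Either value is a legitimate answer to part (a), provided the quartile definition is stated.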

Solution (b):

(b) Among the four statistics, which are measures of central tendency and which are measures of dispersion?

The mean and the median are measures of central tendency.

The standard deviation and the IQR are measures of dispersion.

Solution (c):

(c) Are there any outliers in this sample? If so, which values are outliers?

R code to produce the boxplot:


boxplot(corn_cobs, main = "Boxplot for corn_cobs")

Output:

The boxplot shows no points beyond the whiskers, so by the usual 1.5 x IQR rule there are no outliers in this sample.
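
The boxplot conclusion can also be checked numerically. The sketch below assumes the conventional rule that boxplot() applies by default: a value is an outlier if it lies more than 1.5 times the hinge spread beyond either hinge. The names lower_fence, upper_fence and outliers are illustrative only.

```r
corn_cobs <- c(212, 234, 259, 189, 245, 176, 203, 215)

# Hinges from fivenum(): the quartile definition boxplot() uses
hinges <- fivenum(corn_cobs)
lower_hinge <- hinges[2]
upper_hinge <- hinges[4]
spread <- upper_hinge - lower_hinge

# Fences at 1.5 times the hinge spread beyond each hinge
lower_fence <- lower_hinge - 1.5 * spread
upper_fence <- upper_hinge + 1.5 * spread

# Values outside the fences would be flagged as outliers
outliers <- corn_cobs[corn_cobs < lower_fence | corn_cobs > upper_fence]

print(c(lower_fence, upper_fence))  # 130.75 304.75
print(outliers)                     # numeric(0): nothing flagged

# boxplot.stats() lists the points boxplot() would draw individually
print(boxplot.stats(corn_cobs)$out)
```

All eight weights lie between 176 and 259, well inside the fences, which agrees with the boxplot.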

Solution (d):

qqnorm(corn_cobs)
qqline(corn_cobs)

Output:

In the Q-Q plot the points fall close to the reference line with no systematic curvature, so the plot gives no clear evidence against normality.

Test for normality:

shapiro.test(corn_cobs)

Shapiro-Wilk test output:

Shapiro-Wilk normality test

data: corn_cobs
W = 0.97845, p-value = 0.9548

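Since p = 0.9548 > 0.05, we fail to reject the null hypothesis of normality. The data show no significant deviation from a normal distribution, so it is reasonable to assume that the weight is normally distributed. With only n = 8 observations the test has limited power, so this is an absence of evidence against normality rather than proof of it.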
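
As an optional check that is not part of the original solution, the visual judgement of the Q-Q plot can be made less subjective by computing the correlation between the theoretical and sample quantiles, and the test decision can be written out explicitly. This is a minimal sketch; the 5% significance level and the names qq_cor, alpha and reject_normality are assumptions made for illustration.

```r
corn_cobs <- c(212, 234, 259, 189, 245, 176, 203, 215)

# Theoretical (x) and sample (y) quantiles, without drawing the plot
qq <- qqnorm(corn_cobs, plot.it = FALSE)

# Correlation between the two sets of quantiles; values near 1
# mean the Q-Q points lie close to a straight line
qq_cor <- cor(qq$x, qq$y)
print(qq_cor)

# Shapiro-Wilk test, compared against an assumed 5% significance level
alpha <- 0.05
sw <- shapiro.test(corn_cobs)
reject_normality <- sw$p.value < alpha
print(reject_normality)  # FALSE here, since the p-value is 0.9548
```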

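Solution (e):

(e) Give a point estimate for the mean weight of the population and the standard deviation of the estimate.

Point estimate for the population mean = sample mean = 216.625

The sample standard deviation s = 28.09645 estimates the population standard deviation, so the standard deviation of the estimate (the standard error of the mean) is s / sqrt(n) = 28.09645 / sqrt(8) ≈ 9.934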
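
Part (e) asks for the standard deviation of the estimate, which is usually read as the standard error of the sample mean, s / sqrt(n). A minimal sketch of that calculation in R, with point_estimate and std_error as illustrative names:

```r
corn_cobs <- c(212, 234, 259, 189, 245, 176, 203, 215)

n <- length(corn_cobs)

# Point estimate of the population mean
point_estimate <- mean(corn_cobs)

# Estimated standard deviation of the sample mean (standard error)
std_error <- sd(corn_cobs) / sqrt(n)

print(point_estimate)  # 216.625
print(std_error)       # about 9.934
```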

