Question

Using R:

The data set “Drink.csv” contains the amount of bio medication filled in a sample of 50 consecutive 2-liter bottles.
1) At the 0.01 level of significance, test whether the mean amount of medication is different from 2.0 liters using the critical value approach. What is the absolute value of the critical points?
2) Can you confirm your conclusion in part 1 using the p-value approach? Can you also replicate the p-value from t.test using the pt function?
3) Can you confirm your conclusion in part 1 using the confidence interval (CI) approach, based on part 2?

DRINK DATA:

Amount
2.109
2.086
2.066
2.075
2.065
2.057
2.052
2.044
2.036
2.038
2.031
2.029
2.025
2.029
2.023
2.02
2.015
2.014
2.013
2.014
2.012
2.012
2.012
2.01
2.005
2.003
1.999
1.996
1.997
1.992
1.994
1.986
1.984
1.981
1.973
1.975
1.971
1.969
1.966
1.967
1.963
1.957
1.951
1.951
1.947
1.941
1.941
1.938
1.908
1.894

Expert Solution

First, save the data as a CSV file (for example, from an Excel sheet). The R code here reads the file from a path on my computer.

Replace that path with the location where you saved the data set.
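If you would rather not depend on Excel or a machine-specific path, one option is to type the 50 values from the question into a vector and write the CSV file from R. This is only a sketch: the names `amount` and `csv_path` are my own, and the temporary directory is an assumption chosen so the example runs anywhere.

```r
# Enter the 50 fill amounts (liters) from the question by hand.
# `amount` and `csv_path` are illustrative names, not part of the original answer.
amount <- c(
  2.109, 2.086, 2.066, 2.075, 2.065, 2.057, 2.052, 2.044, 2.036, 2.038,
  2.031, 2.029, 2.025, 2.029, 2.023, 2.020, 2.015, 2.014, 2.013, 2.014,
  2.012, 2.012, 2.012, 2.010, 2.005, 2.003, 1.999, 1.996, 1.997, 1.992,
  1.994, 1.986, 1.984, 1.981, 1.973, 1.975, 1.971, 1.969, 1.966, 1.967,
  1.963, 1.957, 1.951, 1.951, 1.947, 1.941, 1.941, 1.938, 1.908, 1.894
)

# Write a one-column CSV named Drink.csv into R's temporary directory.
csv_path <- file.path(tempdir(), "Drink.csv")
write.csv(data.frame(Amount = amount), csv_path, row.names = FALSE)

# Read it back the same way the answer does, but with a portable path.
data <- read.csv(csv_path, header = TRUE)
str(data)          # should show 50 observations of one variable, Amount
mean(data$Amount)  # sample mean, for comparison with the t.test output
```

Using `data$Amount` instead of `attach(data)` avoids the "object is masked" messages that appear when the same data frame is attached more than once.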

Let X be the amount of medication in a bottle. We assume X ~ N(µ, σ²), where σ² is unknown.

We want to test H₀: µ = 2.0 against H₁: µ ≠ 2.0 at the 0.01 level of significance.

Because σ² is unknown, we use the one-sample Student's t-test for the mean.

HERE IS THE R CODE

data=read.csv("C:\\Users\\HP\\Desktop\\Drink.csv",header=TRUE) # read the data set (use your own path)
data
str(data) # inspect the data set; the single column is named Amount, as in the CSV file
attach(data) # attach the data frame so that Amount can be used directly

t.test(Amount,mu=2.0,conf.level=0.99) # two-sided one-sample t-test of H0: mu = 2.0

HERE IS THE RESULT

t.test(Amount,mu=2.0,conf.level=0.99)

One Sample t-test

data: Amount
t = 0.11424, df = 49, p-value = 0.9095
alternative hypothesis: true mean is not equal to 2
99 percent confidence interval:
1.98383 2.01761
sample estimates:
mean of x
2.00072

We reject the null hypothesis if the absolute value of the observed t statistic, |t|, is greater than the critical value.

Here the observed t is 0.11424, and the critical value is the upper 0.005 point (α/2 = 0.01/2) of the t distribution with 49 degrees of freedom.

WE CAN GET THE CRITICAL POINT BY USING THE CODE

qt(0.995,49)

RESULT IS

2.679952

Here |t| = 0.11424 is not greater than the critical value 2.679952, so there is not enough evidence to reject the null hypothesis. We fail to reject H₀ at the 0.01 level of significance: the data do not show that the mean amount differs from 2.0 liters.

The absolute value of the critical points is 2.679952 (the two critical points are ±2.679952).
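The critical value approach can also be worked through by hand, which makes it clear where the t statistic and the cutoff come from. The sketch below recomputes both from the 50 values listed in the question; the variable names (`amount`, `mu0`, `alpha`, `t_obs`, `t_crit`) are my own and are not part of the original answer.

```r
# The 50 fill amounts (liters) from the question, entered by hand.
amount <- c(
  2.109, 2.086, 2.066, 2.075, 2.065, 2.057, 2.052, 2.044, 2.036, 2.038,
  2.031, 2.029, 2.025, 2.029, 2.023, 2.020, 2.015, 2.014, 2.013, 2.014,
  2.012, 2.012, 2.012, 2.010, 2.005, 2.003, 1.999, 1.996, 1.997, 1.992,
  1.994, 1.986, 1.984, 1.981, 1.973, 1.975, 1.971, 1.969, 1.966, 1.967,
  1.963, 1.957, 1.951, 1.951, 1.947, 1.941, 1.941, 1.938, 1.908, 1.894
)

mu0   <- 2.0    # hypothesized mean under H0
alpha <- 0.01   # level of significance

n     <- length(amount)
se    <- sd(amount) / sqrt(n)         # estimated standard error of the mean
t_obs <- (mean(amount) - mu0) / se    # one-sample t statistic

t_crit <- qt(1 - alpha / 2, df = n - 1)  # upper alpha/2 point of t with n - 1 df
reject <- abs(t_obs) > t_crit            # two-sided decision rule

cat("t =", round(t_obs, 5),
    " critical value =", round(t_crit, 6),
    " reject H0:", reject, "\n")
```

Writing the cutoff as `qt(1 - alpha / 2, df = n - 1)` rather than `qt(0.995, 49)` keeps the code correct if the significance level or the sample size changes.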

2) Yes, the p-value approach confirms the conclusion. We reject the null hypothesis if the p-value is less than 0.01; here the p-value is 0.9095, which is far larger than 0.01, so we fail to reject the null hypothesis.

For this test the p-value is P_H₀(|t| > |observed t|) = 2 × P(t₄₉ > 0.11424).

Under H₀ the test statistic follows the t distribution with 49 degrees of freedom, and because that distribution is symmetric about 0, the one-tail probability is multiplied by 2.

WE CAN GET THE P VALUE USING THE R CODE

2*(1-pt(0.11424,49))

HERE IS THE RESULT

0.9095144

(This is the same as the p-value reported in the t.test result.)
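A slightly more general way to write the same calculation is sketched below. It takes the statistic and degrees of freedom from the t.test output (so `t_obs` is the rounded value 0.11424), and the names `t_obs`, `df`, `alpha` and `p_value` are my own. Taking `abs()` first means the formula still works when the observed t is negative, and `lower.tail = FALSE` asks `pt` for the upper-tail probability directly instead of computing `1 - pt(...)`.

```r
t_obs <- 0.11424   # observed t statistic, copied from the t.test output
df    <- 49        # degrees of freedom, n - 1 = 50 - 1
alpha <- 0.01      # level of significance

# Two-sided p-value: twice the upper-tail area beyond |t_obs|.
p_value <- 2 * pt(abs(t_obs), df, lower.tail = FALSE)
print(p_value)

# Decision rule for the p-value approach: reject H0 only if p_value < alpha.
print(p_value < alpha)
```

Without `abs()`, a negative statistic would give a value above 1 from `2*(1-pt(t, df))`, which is not a valid p-value.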

3)

The confidence interval approach also confirms the conclusion in part 1.

The 99% confidence interval reported by t.test is (1.98383, 2.01761): we are 99% confident that the true mean amount lies in this interval.

The hypothesized value 2.0 lies inside the interval, so we fail to reject H₀ at the 0.01 level of significance, in agreement with parts 1 and 2. This does not prove that the mean is exactly 2.0 liters; it only means the data are consistent with that value.
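To see where the interval comes from, it can be rebuilt by hand as sample mean ± critical value × standard error. The sketch below does this from the 50 values in the question and then checks whether 2.0 falls inside; the names `amount`, `mu0`, `conf`, `margin`, `ci` and `inside` are my own, not part of the original answer.

```r
# The 50 fill amounts (liters) from the question, entered by hand.
amount <- c(
  2.109, 2.086, 2.066, 2.075, 2.065, 2.057, 2.052, 2.044, 2.036, 2.038,
  2.031, 2.029, 2.025, 2.029, 2.023, 2.020, 2.015, 2.014, 2.013, 2.014,
  2.012, 2.012, 2.012, 2.010, 2.005, 2.003, 1.999, 1.996, 1.997, 1.992,
  1.994, 1.986, 1.984, 1.981, 1.973, 1.975, 1.971, 1.969, 1.966, 1.967,
  1.963, 1.957, 1.951, 1.951, 1.947, 1.941, 1.941, 1.938, 1.908, 1.894
)

mu0  <- 2.0    # hypothesized mean under H0
conf <- 0.99   # confidence level, matching alpha = 0.01

n      <- length(amount)
se     <- sd(amount) / sqrt(n)                     # standard error of the mean
margin <- qt(1 - (1 - conf) / 2, df = n - 1) * se  # half-width of the interval
ci     <- mean(amount) + c(-1, 1) * margin         # lower and upper limits

print(round(ci, 5))  # compare with the interval in the t.test output

# CI approach: fail to reject H0 when mu0 lies inside the interval.
inside <- ci[1] <= mu0 && mu0 <= ci[2]
print(inside)
```

The same interval is available without hand calculation as `t.test(amount, mu = 2.0, conf.level = 0.99)$conf.int`.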

Here is the complete R code

""

data=read.csv("C:\\Users\\HP\\Desktop\\Drink.csv",header=TRUE) # read the data set (use your own path)
data
str(data) # inspect the data set; the single column is named Amount, as in the CSV file
attach(data) # attach the data frame so that Amount can be used directly

t.test(Amount,mu=2.0,conf.level=0.99) # two-sided one-sample t-test of H0: mu = 2.0

qt(0.995,49) # critical point of the test: upper 0.005 point of t with 49 df

2*(1-pt(0.11424,49)) # two-sided p-value of the test

""

And here is the complete output

""

> data=read.csv("C:\\Users\\HP\\Desktop\\Drink.csv",header=T) # calling the data set
> str(data) # seeing the variable in the data set. You will see it as Amount.(As saved in the excel file)
'data.frame': 50 obs. of 1 variable:
$ Amount: num 2.11 2.09 2.07 2.08 2.06 ...
> attach(data) # attaching the data set to work on
The following object is masked from data (pos = 3):

Amount

The following object is masked from data (pos = 4):

Amount

>
> t.test(Amount,mu=2.0,conf.level=0.99)

One Sample t-test

data: Amount
t = 0.11424, df = 49, p-value = 0.9095
alternative hypothesis: true mean is not equal to 2
99 percent confidence interval:
1.98383 2.01761
sample estimates:
mean of x
2.00072

>
> qt(0.995,49) # getting the critical point of the test
[1] 2.679952
>
> 2*(1-pt(0.11424,49)) # getting the p value of the test
[1] 0.9095144
>

""


orchestra answered 2 years ago
