Fixed acidity - Volatile acidity - Citric acid - Residual sugar - Chlorides
7.4 0.7 0 1.9 0.076
7.8 0.88 0 2.6 0.098
7.8 0.76 0.04 2.3 0.092
11.2 0.28 0.56 1.9 0.075
7.4 0.7 0 1.9 0.076
7.4 0.66 0 1.8 0.075
7.9 0.6 0.06 1.6 0.069
7.3 0.65 0 1.2 0.065
7.8 0.58 0.02 2 0.073
7.5 0.5 0.36 6.1 0.071
6.7 0.58 0.08 1.8 0.097
7.5 0.5 0.36 6.1 0.071
5.6 0.615 0 1.6 0.089
7.8 0.61 0.29 1.6 0.114
8.9 0.62 0.18 3.8 0.176
8.9 0.62 0.19 3.9 0.17
8.5 0.28 0.56 1.8 0.092
8.1 0.56 0.28 1.7 0.368
7.4 0.59 0.08 4.4 0.086
7.9 0.32 0.51 1.8 0.341
8.9 0.22 0.48 1.8 0.077
7.6 0.39 0.31 2.3 0.082
7.9 0.43 0.21 1.6 0.106
8.5 0.49 0.11 2.3 0.084
6.9 0.4 0.14 2.4 0.085
6.3 0.39 0.16 1.4 0.08
1. For the data on 26 red wines given above, conduct the following analysis:
i. Provide the five-number summary, i.e. the minimum, 1st quartile, median, 3rd quartile, and maximum value, for fixed acidity. Arrange them in increasing order on a straight line, draw a box plot and interpret what it means.
ii. Calculate the correlation coefficient between fixed acidity and volatile acidity and between residual sugar and chlorides. Comment on the strength and direction of association for the two variable pairs.
iii. What can be stated about the cause-effect relationship between fixed acidity and volatile acidity, based on the correlation coefficient score?
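Here N = 26 (an even number).
i. The median and quartiles are positions in the ordered data, so first arrange the 26 fixed acidity values in increasing order:
5.6, 6.3, 6.7, 6.9, 7.3, 7.4, 7.4, 7.4, 7.4, 7.5, 7.5, 7.6, 7.8, 7.8, 7.8, 7.8, 7.9, 7.9, 7.9, 8.1, 8.5, 8.5, 8.9, 8.9, 8.9, 11.2
The minimum and maximum values of fixed acidity are 5.6 and 11.2 respectively.
Now we calculate the median of fixed acidity.
Formula,
Median = M = size of the (N+1)/2 th item
= size of the (26+1)/2 th item
= size of the 13.5th item
= average of the 13th and 14th items
= (7.8 + 7.8)/2 = 7.8
Median = 7.8 (the median is the 2nd quartile)
1st quartile = Q1 = size of the (N+1)/4 th item
= size of the 27/4 th item
= size of the 6.75th item
= size of 6th item + 0.75(size of 7th item - size of 6th item)
= 7.4 + 0.75(7.4 - 7.4)
Q1 = 7.4
3rd quartile = Q3 = size of the 3((N+1)/4) th item
= size of the 3(27/4) th item
= size of the 20.25th item
= size of 20th item + 0.25(size of 21st item - size of 20th item)
= 8.1 + 0.25(8.5 - 8.1)
Q3 = 8.2
The five-number summary in increasing order is therefore
5.6 (min), 7.4 (Q1), 7.8 (median), 8.2 (Q3), 11.2 (max)
with range = 11.2 - 5.6 = 5.6 and interquartile range = Q3 - Q1 = 0.8.
These five values can be entered in Excel to draw the box plot: the box runs from 7.4 to 8.2 with a line at the median 7.8, and the whiskers extend toward 5.6 and 11.2.
Interpretation: the middle 50% of the wines lie in a narrow band of fixed acidity (7.4 to 8.2), and the median sits in the middle of the box. The upper whisker (8.2 to 11.2) is longer than the lower one (5.6 to 7.4), so the distribution is skewed to the right. By the 1.5 x IQR rule the fences are 6.2 and 9.4, so the extreme values 5.6 and 11.2 would be flagged as outliers.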
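As a cross-check on the hand calculation, the (N+1)-position rule can be sketched in Python. This is a minimal sketch, not a definitive implementation: the function names quantile_n_plus_1 and five_number_summary are made up for illustration, and the data are sorted before any position is looked up.

```python
import math

# Fixed acidity values for the 26 wines, in the order given in the table.
fixed_acidity = [
    7.4, 7.8, 7.8, 11.2, 7.4, 7.4, 7.9, 7.3, 7.8, 7.5, 6.7, 7.5, 5.6,
    7.8, 8.9, 8.9, 8.5, 8.1, 7.4, 7.9, 8.9, 7.6, 7.9, 8.5, 6.9, 6.3,
]


def quantile_n_plus_1(values, p):
    """Value at 1-based position p*(N+1) in the sorted data, interpolating linearly."""
    ordered = sorted(values)          # positions only make sense on ordered data
    n = len(ordered)
    position = p * (n + 1)            # e.g. 0.25 * 27 = 6.75 for Q1
    lower = math.floor(position)      # whole part: the "6th item"
    fraction = position - lower       # fractional part: the 0.75
    if lower < 1:                     # position falls before the first item
        return ordered[0]
    if lower >= n:                    # position falls at or after the last item
        return ordered[-1]
    below = ordered[lower - 1]        # convert 1-based position to 0-based index
    above = ordered[lower]
    return below + fraction * (above - below)


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the (N+1)-position rule."""
    return (
        min(values),
        quantile_n_plus_1(values, 0.25),
        quantile_n_plus_1(values, 0.50),
        quantile_n_plus_1(values, 0.75),
        max(values),
    )


summary = five_number_summary(fixed_acidity)
print([round(v, 3) for v in summary])  # [5.6, 7.4, 7.8, 8.2, 11.2]
```

Other quartile conventions (for example Excel's QUARTILE.INC) interpolate at different positions and can give slightly different Q1 and Q3 for the same data.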
ii. Now we calculate the correlation coefficient between fixed acidity (taken as X) and volatile acidity (taken as Y) using Excel.
In Excel, =CORREL(X, Y), where X and Y are the two data ranges, gives the correlation coefficient between fixed acidity and volatile acidity:
r = -0.3277
The correlation is negative and weak to moderate: wines with higher fixed acidity tend to have somewhat lower volatile acidity, but the linear association is not strong.
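The same coefficient can be reproduced without Excel. The sketch below computes Pearson's r directly from its definition; the helper name pearson_r is an assumption introduced here for illustration.

```python
import math

# Columns copied from the table, in row order.
fixed_acidity = [
    7.4, 7.8, 7.8, 11.2, 7.4, 7.4, 7.9, 7.3, 7.8, 7.5, 6.7, 7.5, 5.6,
    7.8, 8.9, 8.9, 8.5, 8.1, 7.4, 7.9, 8.9, 7.6, 7.9, 8.5, 6.9, 6.3,
]
volatile_acidity = [
    0.7, 0.88, 0.76, 0.28, 0.7, 0.66, 0.6, 0.65, 0.58, 0.5, 0.58, 0.5, 0.615,
    0.61, 0.62, 0.62, 0.28, 0.56, 0.59, 0.32, 0.22, 0.39, 0.43, 0.49, 0.4, 0.39,
]


def pearson_r(xs, ys):
    """Pearson correlation: S_xy / sqrt(S_xx * S_yy), using deviations from the means."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


r = pearson_r(fixed_acidity, volatile_acidity)
print(round(r, 4))
```

The printed value should agree with the Excel result of about -0.3277 up to rounding.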
Now we calculate the correlation coefficient between residual sugar (taken as X) and chlorides (taken as Y) using Excel.
In Excel, =CORREL(X, Y), where X and Y are the two data ranges, gives the correlation coefficient between residual sugar and chlorides:
r = -0.08304
The sign is negative, but the value is very close to zero, so there is practically no linear association between residual sugar and chlorides in these 26 wines.
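For readers who want to see where this number comes from, the coefficient for residual sugar (x) and chlorides (y) can be worked from the column sums. The symbols S_xy, S_xx and S_yy are notation introduced here, and the intermediate figures are rounded.

```latex
\begin{aligned}
n &= 26, \quad \sum x_i = 63.6, \quad \sum y_i = 2.888 \\
S_{xy} &= \sum x_i y_i - \frac{\left(\sum x_i\right)\left(\sum y_i\right)}{n}
        = 6.8539 - \frac{63.6 \times 2.888}{26} \approx -0.21059 \\
S_{xx} &= \sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n}
        = 199.38 - \frac{63.6^2}{26} \approx 43.8046 \\
S_{yy} &= \sum y_i^2 - \frac{\left(\sum y_i\right)^2}{n}
        = 0.467604 - \frac{2.888^2}{26} \approx 0.14681 \\
r &= \frac{S_{xy}}{\sqrt{S_{xx}\, S_{yy}}}
   \approx \frac{-0.21059}{\sqrt{43.8046 \times 0.14681}} \approx -0.0830
\end{aligned}
```

The numerator is tiny relative to the denominator, which is why r is so close to zero.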
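iii. The correlation coefficient between fixed acidity and volatile acidity is -0.3277, which is a weak-to-moderate negative correlation, not a high-degree one: the two variables tend to change in opposite directions, but only loosely. Nothing can be stated about a cause-effect relationship from this score alone. A correlation coefficient measures only the strength and direction of linear association; it does not show that a change in fixed acidity causes a change in volatile acidity, or the reverse. The association could be due to a third factor, or to chance in a sample of only 26 wines.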