Question

  1. A medical statistician wanted to examine the relationship between the amount of sunshine (x) and incidence of skin cancer (y). As an experiment he found the number of skin cancers detected per 100,000 of population and the average daily sunshine in eight counties around the country. These data are shown below.

| Average Daily Sunshine (x) | Skin Cancer per 100,000 (y) |
| --- | --- |
| 5 | 7 |
| 7 | 11 |
| 6 | 9 |
| 7 | 12 |
| 8 | 15 |
| 6 | 10 |
| 4 | 7 |
| 3 | 5 |

  1. Find the least squares regression line.
  2. Draw a scatter diagram of the data and plot the least squares regression line on it.
  3. Calculate the coefficient of determination and interpret it.
  4. Calculate the coefficient of correlation. What does it tell you about the direction and strength of the relationship between the two variables?

Solutions

Expert Solution

Solution(a)
The least squares regression line can be written as
Y = a + bX
Number of counties n = 8
Here Y is the dependent variable, i.e. Skin Cancer per 100,000
X is the independent variable, i.e. Average Daily Sunshine
a is the intercept of the regression line
b is the slope of the regression line
The slope of the regression line can be calculated as
Slope = ((n*ΣXY)-(ΣX*ΣY))/((n*ΣX^2)-(ΣX)^2)

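| | X | Y | X^2 | Y^2 | XY |
| --- | --- | --- | --- | --- | --- |
| | 5 | 7 | 25 | 49 | 35 |
| | 7 | 11 | 49 | 121 | 77 |
| | 6 | 9 | 36 | 81 | 54 |
| | 7 | 12 | 49 | 144 | 84 |
| | 8 | 15 | 64 | 225 | 120 |
| | 6 | 10 | 36 | 100 | 60 |
| | 4 | 7 | 16 | 49 | 28 |
| | 3 | 5 | 9 | 25 | 15 |
| Sum | 46 | 76 | 284 | 794 | 473 |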


Slope = ((8*473)-(46*76))/((8*284)-(46*46)) = 288/156 = 1.846
The intercept of the regression line can be calculated as
Intercept = (ΣY - Slope*ΣX)/n = (76 - 1.846*46)/8 = -1.115
So the regression equation is
Y = -1.115 + 1.846*X
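
As a cross-check on the hand calculation, here is a minimal Python sketch that applies the same sums-based formulas to the raw data. The function name `least_squares_line` and the variable names are illustrative choices, not part of the original solution.

```python
# Data from the question: average daily sunshine (x) and
# skin cancers detected per 100,000 of population (y).
x = [5, 7, 6, 7, 8, 6, 4, 3]
y = [7, 11, 9, 12, 15, 10, 7, 5]


def least_squares_line(xs, ys):
    """Return (intercept, slope) from the sums-based formulas in Solution(a)."""
    n = len(xs)
    sum_x = sum(xs)                               # ΣX
    sum_y = sum(ys)                               # ΣY
    sum_xx = sum(v * v for v in xs)               # ΣX^2
    sum_xy = sum(a * b for a, b in zip(xs, ys))   # ΣXY
    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    intercept = (sum_y - slope * sum_x) / n
    return intercept, slope


intercept, slope = least_squares_line(x, y)
print(f"Y = {intercept:.3f} + {slope:.3f}*X")  # prints: Y = -1.115 + 1.846*X
```

Using the unrounded slope inside the intercept formula avoids carrying rounding error from 1.846 into the intercept.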
Solution(b)
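The scatter diagram is constructed by plotting the eight (X, Y) pairs, with Average Daily Sunshine on the horizontal axis and Skin Cancer per 100,000 on the vertical axis, and drawing the line Y = -1.115 + 1.846*X through them.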
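
To place the line on the diagram, it is enough to evaluate the fitted equation at the smallest and largest observed X and join the two points. The sketch below computes those endpoints along with the fitted value and residual for each county; the helper `predict` and the other names are assumptions made for illustration, and no plotting library is used.

```python
# Data from the question.
x = [5, 7, 6, 7, 8, 6, 4, 3]
y = [7, 11, 9, 12, 15, 10, 7, 5]

# Refit the line with the same formulas as Solution(a).
n = len(x)
sum_xy = sum(a * b for a, b in zip(x, y))
sum_xx = sum(a * a for a in x)
slope = (n * sum_xy - sum(x) * sum(y)) / (n * sum_xx - sum(x) ** 2)
intercept = (sum(y) - slope * sum(x)) / n


def predict(sunshine):
    """Fitted skin cancer rate for a given average daily sunshine value."""
    return intercept + slope * sunshine


# Two points are enough to draw a straight line across the observed x range.
line_endpoints = [(v, predict(v)) for v in (min(x), max(x))]

# Residual = observed y minus fitted y; these are the vertical gaps
# between each scatter point and the line.
residuals = [yi - predict(xi) for xi, yi in zip(x, y)]

for xi, yi, ri in zip(x, y, residuals):
    print(f"x={xi}  y={yi}  fitted={predict(xi):.3f}  residual={ri:+.3f}")
print("line endpoints:", [(v, round(p, 3)) for v, p in line_endpoints])
```

For a least squares line with an intercept, the residuals sum to zero (up to floating-point error), which is a quick sanity check that the line has been drawn in the right place.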

Solution(c)
The coefficient of determination can be calculated as
Coefficient of determination = (Correlation coefficient)^2
The correlation coefficient can be calculated as
Correlation coefficient = ((n*ΣXY)-(ΣX*ΣY))/sqrt(((n*ΣX^2)-(ΣX)^2)*((n*ΣY^2)-(ΣY)^2)) = ((8*473)-(46*76))/sqrt(((8*284)-(46*46))*((8*794)-(76*76))) = 288/sqrt(156*576) = 0.9608
Coefficient of determination = (0.9608)^2 = 0.9231
So about 92.31% of the variation in skin cancer incidence is explained by its linear relationship with Average Daily Sunshine.
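
The "variation explained" reading can be checked directly. For simple linear regression, the squared correlation equals 1 - SSE/SST, where SST is the total variation of Y about its mean and SSE is the variation left over around the fitted line. The sketch below computes it that way; the names `sst`, `sse` and `r_squared` are illustrative and do not appear in the original solution.

```python
# Data from the question.
x = [5, 7, 6, 7, 8, 6, 4, 3]
y = [7, 11, 9, 12, 15, 10, 7, 5]

# Fit the least squares line with the sums-based formulas.
n = len(x)
sum_xy = sum(a * b for a, b in zip(x, y))
sum_xx = sum(a * a for a in x)
slope = (n * sum_xy - sum(x) * sum(y)) / (n * sum_xx - sum(x) ** 2)
intercept = (sum(y) - slope * sum(x)) / n

mean_y = sum(y) / n

# Total sum of squares: variation of y about its mean.
sst = sum((yi - mean_y) ** 2 for yi in y)

# Error sum of squares: variation of y about the fitted line.
sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))

r_squared = 1 - sse / sst
print(f"SST = {sst:.3f}, SSE = {sse:.3f}, R^2 = {r_squared:.4f}")
```

Here SST is 72 and roughly 5.54 of it remains unexplained, which gives the same 0.9231 as squaring the correlation coefficient.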
Solution(d)
The correlation coefficient can be calculated as
Correlation coefficient = ((n*ΣXY)-(ΣX*ΣY))/sqrt(((n*ΣX^2)-(ΣX)^2)*((n*ΣY^2)-(ΣY)^2)) = ((8*473)-(46*76))/sqrt(((8*284)-(46*46))*((8*794)-(76*76))) = 288/sqrt(156*576) = 0.9608
The correlation coefficient is positive and close to 1, so skin cancer incidence and Average Daily Sunshine have a strong positive linear relationship: as one variable increases, the other tends to increase as well.
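
A short Python sketch of the same correlation formula, assuming only the data from the question; the function name `pearson_r` is an illustrative choice.

```python
from math import sqrt

# Data from the question.
x = [5, 7, 6, 7, 8, 6, 4, 3]
y = [7, 11, 9, 12, 15, 10, 7, 5]


def pearson_r(xs, ys):
    """Correlation coefficient from the sums-based formula in Solution(d)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xx = sum(v * v for v in xs)
    sum_yy = sum(v * v for v in ys)
    sum_xy = sum(a * b for a, b in zip(xs, ys))
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_xx - sum_x ** 2) * (n * sum_yy - sum_y ** 2))
    return numerator / denominator


r = pearson_r(x, y)
print(f"r = {r:.4f}, r^2 = {r * r:.4f}")  # prints: r = 0.9608, r^2 = 0.9231
```

The sign of r always matches the sign of the regression slope, since both share the numerator n*ΣXY - ΣX*ΣY.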

