Question

  1. Create a scenario for two variables that may be related. Identify a sample dataset (sample size n=10) and use it to calculate covariance and correlation values. Interpret the relationship.

Solutions

Expert Solution

Solution:

This solution assumes basic familiarity with the covariance and the correlation coefficient r of two (random) variables X & Y, and with how each is interpreted.

We take 10 sample X-values and use a simple linear function to define Y.

For illustration, we simulate 10 values from a U(2,4) distribution in R and use them to compute the covariance and correlation values in this problem. The R code is:

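X <- runif(10, 2, 4)  # generate 10 random numbers from a Uniform distribution U(2,4)
Y <- 2 * X + 5        # define Y as a linear function of X (slope 2, intercept 5)
sort(X, decreasing = FALSE)  # X values in increasing order
sort(Y, decreasing = FALSE)  # Y values in increasing order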

The outputs are:

Clearly, as X increases, Y also increases, which follows directly from the equation that defines Y in terms of X.

1. The correlation between X & Y assesses the strength of the linear relationship between X & Y.

Because r measures the strength of a linear relationship, Y has been defined here as a linear function of X.

(A non-linear function could also have been used.)

In this problem, the R code is:

corr <- cor(X, Y)

The value of r comes out to be 1, as expected, because Y is an exact increasing linear function of X.

Hence there is a perfect positive linear relationship between the values of X & Y, which is what the definition of Y implies.
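The correlation step can be sketched from the definition of r, as a check on cor(). This is only an illustration: the seed is an assumption (the original post did not set one), so the simulated values differ from those in the post, and Y_sq is a hypothetical non-linear alternative that is not part of the original solution.

```r
# Illustrative sketch; set.seed(1) is an assumption, not part of the original post.
set.seed(1)
X <- runif(10, 2, 4)   # 10 draws from U(2,4)
Y <- 2 * X + 5         # linear function of X, as in the solution

# Pearson r from its definition: sum of cross-deviations divided by
# the square root of the product of the sums of squared deviations.
dx <- X - mean(X)
dy <- Y - mean(Y)
r_manual  <- sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2))
r_builtin <- cor(X, Y)

# Hypothetical non-linear (but still increasing) alternative for comparison.
Y_sq <- X^2
r_sq <- cor(X, Y_sq)

print(c(manual = r_manual, builtin = r_builtin, squared = r_sq))
```

For the linear Y the two values of r should agree and equal 1 up to rounding. For Y_sq the correlation should fall slightly below 1, since r only captures the linear part of the relationship.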

2. The covariance between two variables X & Y describes the direction in which the two variables vary together.

a) If an increase (or decrease) in one variable tends to go with an increase (or decrease) in the other variable, the two variables have positive covariance.

b) If an increase in one variable tends to go with a decrease in the other variable, and vice versa, the two variables have negative covariance.

The sign of the covariance therefore gives the direction of the relationship. Its magnitude depends on the units of X & Y, so by itself it does not measure the strength of the relationship.
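The two cases above can be sketched in R. This is a minimal illustration under stated assumptions: the seed is arbitrary, and Y_down and the factor of 100 are hypothetical additions used only to show a negative covariance and the effect of a change of units.

```r
# Illustrative sketch; the seed, Y_down and the factor 100 are assumptions.
set.seed(1)
X <- runif(10, 2, 4)

Y_up   <-  2 * X + 5   # increases with X (the function used in the solution)
Y_down <- -2 * X + 5   # hypothetical counterpart that decreases with X

cov_up   <- cov(X, Y_up)     # positive: X and Y_up move together
cov_down <- cov(X, Y_down)   # negative: X and Y_down move in opposite directions

# Rescaling Y (for example, a change of units) rescales the covariance
# by the same factor but leaves the correlation unchanged.
cov_scaled <- cov(X, 100 * Y_up)
cor_scaled <- cor(X, 100 * Y_up)

print(c(cov_up = cov_up, cov_down = cov_down,
        cov_scaled = cov_scaled, cor_scaled = cor_scaled))
```

The rescaling lines show why the sign of the covariance is interpreted, while the correlation is used to judge strength.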

In this problem, the Y values increase as the X values increase, so the covariance should be positive. The output is:

The covariance is positive, 0.36, so Y and X tend to move in the same direction, i.e., as X increases, Y also increases.
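The post does not show the code for the covariance step. One way it might be computed is sketched below, using the sample covariance (divisor n - 1) and R's built-in cov(). The seed is an assumption, so the result will generally not equal the 0.36 reported for the original draw.

```r
# Illustrative sketch; set.seed(1) is an assumption, so the value differs from 0.36.
set.seed(1)
X <- runif(10, 2, 4)
Y <- 2 * X + 5
n <- length(X)

# Sample covariance from its definition (divisor n - 1).
cov_manual  <- sum((X - mean(X)) * (Y - mean(Y))) / (n - 1)
cov_builtin <- cov(X, Y)

# Because Y = 2X + 5, the covariance equals twice the sample variance of X.
two_var_x <- 2 * var(X)

# Dividing by both standard deviations recovers the correlation coefficient.
r_from_cov <- cov_builtin / (sd(X) * sd(Y))

print(c(manual = cov_manual, builtin = cov_builtin,
        two_var_x = two_var_x, r = r_from_cov))
```

The last line ties the two parts of the solution together: the correlation is the covariance scaled by the standard deviations of X and Y.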

The plot of the X & Y values gives a better idea:

It is clear that as X increases, so does Y. Moreover, the visualization shows that the relationship is linear, which is consistent with the correlation coefficient of X & Y being 1.

The R code for the above plot is:

plot(X, Y, main = "Plot of values of X & Y", xlab = "x-val", ylab = "y-val")

(Answer)

