What is bivariate data and how is it used in statistics?
Solution: Bivariate data is data in which each observation records the values of two variables, which may or may not be independent of each other. The two values can be treated as a pair belonging to a single observation. For example: hours of sleep and marks scored by a student, or a student's marks in Mathematics and marks in English.
In statistics, bivariate data can be used for the following purposes:
i) Finding whether there is any association between the two variables and how strong that association is.
ii) Predicting the value of the dependent (response) variable from a given value of the independent variable by performing regression.
iii) Testing hypotheses about whether there is any significant relationship between the two variables.
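As an illustration of point iii), one common choice (an assumption here, since no particular test is named above) is a t-test on the sample correlation coefficient r, using t = r * sqrt(n - 2) / sqrt(1 - r^2) with n - 2 degrees of freedom. The sketch below uses made-up values of r and n, and the function name is hypothetical.

```python
import math

def correlation_t_statistic(r, n):
    """t statistic for H0: no linear correlation, given sample r and n pairs."""
    if n < 3:
        raise ValueError("need at least 3 paired observations")
    if not -1 < r < 1:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

# Hypothetical inputs, chosen only for illustration
r = 0.6
n = 27
t = correlation_t_statistic(r, n)

# Two-sided 5% critical value for n - 2 = 25 degrees of freedom,
# taken from a standard t table
t_critical = 2.060
significant = abs(t) > t_critical
print(f"t = {t:.2f}, significant at 5%: {significant}")
```

If |t| exceeds the critical value, we reject the hypothesis of no linear relationship at the chosen significance level.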
For example: Let the age of an employee be variable X and the salary of the employee be variable Y. We compute the correlation coefficient between X and Y to find whether they are correlated and whether the correlation is positive or negative. We can also draw a scatterplot, which helps show whether the relationship is linear or not.
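A minimal sketch of that calculation, assuming the Pearson correlation coefficient is the one intended. The ages and salaries are made-up values for illustration only, and the function name is hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for paired observations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross products and squared deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, made up for illustration only
ages = [25, 30, 35, 40, 45]        # X: age in years
salaries = [30, 38, 45, 50, 62]    # Y: salary in thousands

r = pearson_r(ages, salaries)
direction = "positive" if r > 0 else "negative" if r < 0 else "none"
print(f"r = {r:.3f} ({direction})")
```

A value of r close to +1 or -1 suggests a strong linear association, while a value near 0 suggests little or no linear association.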
Another use is linear regression, where we choose one independent (explanatory) variable and one dependent (response) variable. Using a known value of the independent variable, we can predict the unknown value of the response variable.
We can fit a linear regression model of the form y_hat = b0 + b1*x, where y_hat is the predicted value of the response variable, x is the given value of the explanatory variable, b0 is the y-intercept and b1 is the slope.
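A minimal sketch of fitting this model, assuming the usual least-squares estimates b1 = Sxy / Sxx and b0 = mean(y) - b1 * mean(x). The data are made-up values for illustration only, and the function name is hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares estimates (b0, b1) for the model y_hat = b0 + b1*x."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("slope is undefined when all x values are equal")
    b1 = sxy / sxx                 # slope
    b0 = mean_y - b1 * mean_x      # y-intercept
    return b0, b1

# Hypothetical data, made up for illustration only
ages = [25, 30, 35, 40, 45]        # explanatory variable x
salaries = [30, 38, 45, 50, 62]    # response variable y, in thousands

b0, b1 = fit_line(ages, salaries)

# Predict the response for a new x value (here, age 50)
y_hat = b0 + b1 * 50
print(f"y_hat = {b0:.2f} + {b1:.2f}*x, prediction at x=50: {y_hat:.2f}")
```

Predictions are most trustworthy for x values within the range of the observed data; extrapolating far outside it assumes the linear pattern continues.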