In: Statistics and Probability
Hey, how do I solve questions such as "State which features are categorical" and "Which are the two most strongly correlated features? What is the numerical and/or statistical relationship between them?" for a dataset?
Answer:
"State which features are categorical"
Categorical features can only take on a limited, and usually fixed, number of possible values. For example, if a dataset is about information related to users, then you will typically find features like country, gender, age group, etc. Alternatively, if the data you're working with is related to products, you will find features like product type, manufacturer, seller and so on.
Hence, if the observations of a variable are labels drawn from a set of categories rather than measured quantities, that variable is a categorical feature.
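As a rough illustration of the idea above, here is a minimal Python sketch that flags features as categorical. The dataset, the function name is_categorical and the max_levels threshold are all made up for this example; the rule it applies (non-numeric values, or only a handful of distinct values) is only a heuristic, so you should still check each feature against what it actually means.

```python
# Hypothetical toy dataset: each key is a feature, each list holds its observations.
data = {
    "country": ["US", "IN", "US", "DE", "IN", "US"],
    "age_group": ["18-25", "26-35", "18-25", "36-45", "26-35", "18-25"],
    "height_cm": [170.2, 165.0, 180.5, 175.3, 160.8, 172.1],
    "income": [42000, 38000, 51000, 47000, 35000, 44000],
}


def is_categorical(values, max_levels=3):
    """Heuristic (an assumption, not a formal test): a feature looks
    categorical if its values are not all numeric, or if it takes only
    a few distinct values."""
    all_numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )
    return (not all_numeric) or len(set(values)) <= max_levels


# Collect the names of the features that the heuristic flags.
categorical = [name for name, values in data.items() if is_categorical(values)]
print(categorical)
```

Note that a feature stored as numbers (for example 0/1 codes for yes/no) can still be categorical, which is why the sketch also looks at the number of distinct values.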
"Which are the two most strongly correlated features?
The main requirement for computing a correlation is that both variables are quantitative (numeric).
To answer this, form every possible pair of quantitative variables and compute the correlation coefficient for each pair. The pair whose correlation has the largest absolute value (closest to +1 or -1) is the most strongly correlated pair of features. A strongly negative correlation counts as strong too.
You can also compute the correlation matrix in R (for example in RStudio, using the cor() function). The off-diagonal entry with the largest absolute value tells you which two features are most strongly correlated.
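The pairwise search described above can be sketched in a few lines of Python using only the standard library. The feature names and values below are invented for illustration, and pearson is a hand-written helper, not a library function. It computes the usual Pearson correlation coefficient.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical quantitative features (made-up values).
numeric = {
    "hours_studied": [1, 2, 3, 4, 5, 6],
    "exam_score": [52, 55, 61, 64, 70, 74],
    "shoe_size": [42, 38, 44, 40, 39, 43],
}

# Correlation for every unordered pair of features.
correlations = {
    (a, b): pearson(numeric[a], numeric[b])
    for a, b in combinations(numeric, 2)
}

# The strongest pair is the one with the largest absolute correlation.
strongest_pair = max(correlations, key=lambda pair: abs(correlations[pair]))
print(strongest_pair, round(correlations[strongest_pair], 3))
```

Comparing absolute values matters here: a pair with correlation -0.9 is more strongly related than a pair with +0.5.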
What is the numerical and/or statistical relationship between them?"
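To describe the statistical relationship between these two variables, fit a regression equation of one variable on the other. For a straight-line fit, y = a + bx, the fitted equation gives the numerical relationship between them: the slope b tells you how much y changes, on average, for a one-unit change in x.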
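For example, a simple least-squares line can be fitted by hand as in the sketch below. The data and the helper name fit_line are hypothetical, and the sketch assumes a straight-line relationship is a reasonable description of the two features.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # slope = covariance(x, y) / variance(x)
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sum(
        (a - mean_x) ** 2 for a in x
    )
    # The fitted line always passes through the point of means.
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical pair of strongly correlated features (made-up values).
hours_studied = [1, 2, 3, 4, 5, 6]
exam_score = [52, 55, 61, 64, 70, 74]

slope, intercept = fit_line(hours_studied, exam_score)
print(f"exam_score = {intercept:.2f} + {slope:.2f} * hours_studied")
```

With this made-up data the slope is about 4.5, which you would read as "each extra hour studied goes with roughly 4.5 more points, on average". In R, the equivalent fit is usually done with lm().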