With a new mass shooting nearly every day, it is not surprising that violent crime is on everyone’s minds. Using the FBI’s crime statistics from randomly selected U.S. cities, answer the following questions:
Violent Crime | Unemployment Rate (%)
1832 | 6.0
795 | 5.6
663 | 5.3
1792 | 7.6
282 | 4.1
598 | 5.5
1169 | 6.2
127 | 2.6
75 | 3.3
457 | 4.3
Perform a correlation. What is the correlation coefficient?
If there IS a correlation, what other variables may be acting as intervening variables (i.e., causing the change in crime rate and making it appear as if there is a relationship when one doesn’t exist)?
If there is NOT a relationship between unemployment and violent crime, can you think of any variables that might be acting as suppressors (i.e., dampening a relationship that might actually be there)?
Perform a regression to predict violent crime. Show your regression analysis.
Write out the prediction model.
What does the model tell you?
Are the model’s results accurate and reliable? How do you know?
What jumps out as a MAJOR issue with the crime data?
Correlation
The correlation coefficient is obtained in Excel using the function =CORREL() on the two columns, which gives r ≈ 0.884.
Since the correlation coefficient is between 0.8 and 1, there is a strong positive correlation between violent crime and unemployment rate.
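As a cross-check on the Excel result, the same coefficient can be computed by hand. The following is a minimal Python sketch, not the method used in the answer; the function name pearson_r and the variable names are chosen here for illustration, and the data are copied from the table in the question.

```python
import math

# Data copied from the table in the question.
violent_crime = [1832, 795, 663, 1792, 282, 598, 1169, 127, 75, 457]
unemployment = [6.0, 5.6, 5.3, 7.6, 4.1, 5.5, 6.2, 2.6, 3.3, 4.3]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(unemployment, violent_crime)
print(f"r = {r:.4f}")
```

The value printed should agree with what =CORREL() returns for the same two columns, since both use the standard Pearson formula.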
Intervening variables
Possible intervening variables include the GDP growth rate per capita and the inflation rate.
Regression Analysis
The regression model (prediction model) is defined as,
Y = b0 + b1 X
where the dependent variable Y = Violent Crime, the independent variable X = Unemployment Rate, b0 is the intercept, and b1 is the slope.
The regression analysis is done in Excel with the following steps:
Step 1: Enter the data values in Excel.
Step 2: DATA > Data Analysis > Regression > OK.
Step 3: Select Input Y Range: 'Violent Crime' column, Input X Range: 'Unemployment Rate' column, then OK.
Excel then produces the regression output.
The regression model is,
Violent Crime = -1124.69 + 376.97 × (Unemployment Rate)
For each 1-percentage-point increase in the unemployment rate, the predicted number of violent crimes increases by approximately 376.97.
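The slope and intercept can also be reproduced outside Excel with the ordinary least-squares formulas. This is a small Python sketch under that assumption; fit_line and predict are illustrative names introduced here, not part of the original answer, and the 5.0% unemployment rate in the usage line is a made-up input.

```python
# Data copied from the table in the question.
violent_crime = [1832, 795, 663, 1792, 282, 598, 1169, 127, 75, 457]
unemployment = [6.0, 5.6, 5.3, 7.6, 4.1, 5.5, 6.2, 2.6, 3.3, 4.3]


def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_line(unemployment, violent_crime)


def predict(rate):
    """Predicted violent crime for a given unemployment rate (in percent)."""
    return intercept + slope * rate


print(f"Violent Crime = {intercept:.2f} + {slope:.2f} * Unemployment Rate")
# Hypothetical input: a city with a 5.0% unemployment rate.
print(f"Predicted violent crime at 5.0%: {predict(5.0):.1f}")
```

Note that the intercept is negative, so the line predicts negative crime for very low unemployment rates; the model should only be read within the range of the observed data (roughly 2.6% to 7.6%).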
Accuracy and Reliability
The Significance F value of the regression model is 0.000687, which is less than 0.05, so at the 5% significance level the model is statistically significant; that is, unemployment rate is a significant predictor of violent crime.
The R square value tells how well the regression model fits the data. Here R square is 0.7815, which means the model explains 78.15% of the variance in violent crime, which is a good value.
Based on these two statistics, the model seems to be reliable.
The standard error of the estimate measures the accuracy of the predictions: it tells how far a typical observed value lies from the regression line. Here the standard error of the estimate is 313.79; a smaller value indicates more accurate predictions.
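The fit statistics quoted above can be checked from the residuals of the fitted line. The sketch below, in Python, assumes a simple one-predictor least-squares fit with n - 2 residual degrees of freedom; all variable names are illustrative. It computes R square, the standard error of the estimate, and the F statistic, but not the Significance F p-value, which would need an F-distribution function from a statistics library.

```python
import math

# Data copied from the table in the question.
violent_crime = [1832, 795, 663, 1792, 282, 598, 1169, 127, 75, 457]
unemployment = [6.0, 5.6, 5.3, 7.6, 4.1, 5.5, 6.2, 2.6, 3.3, 4.3]

n = len(unemployment)
mean_x = sum(unemployment) / n
mean_y = sum(violent_crime) / n

# Least-squares slope and intercept.
sxx = sum((x - mean_x) ** 2 for x in unemployment)
sxy = sum((x - mean_x) * (y - mean_y)
          for x, y in zip(unemployment, violent_crime))
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Residual and total sums of squares.
predicted = [intercept + slope * x for x in unemployment]
ss_total = sum((y - mean_y) ** 2 for y in violent_crime)
ss_resid = sum((y - p) ** 2 for y, p in zip(violent_crime, predicted))

r_squared = 1 - ss_resid / ss_total
# Standard error of the estimate: residual spread with n - 2 degrees of freedom.
std_error = math.sqrt(ss_resid / (n - 2))
# F statistic for one predictor: explained mean square / residual mean square.
f_stat = (ss_total - ss_resid) / (ss_resid / (n - 2))

print(f"R square       = {r_squared:.4f}")
print(f"Standard error = {std_error:.2f}")
print(f"F statistic    = {f_stat:.2f}")
```

With only 10 cities, these statistics rest on 8 residual degrees of freedom, so they are worth reading alongside the sample size rather than on their own.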
Major issue
From the correlation analysis we can observe that as the unemployment rate increases, the number of violent crimes also increases; however, this does not mean there is a causal relationship.