This case study concerns a bank's efforts to calculate credit risk scores. (These are the opposite of credit scores: the higher the value, the riskier the customer.) A loan officer at a bank needs to identify characteristics that indicate people who are likely to default on loans, and to use those characteristics to distinguish good credit risks from bad ones. The loan officer also needs to better quantify an individual's credit risk level.
Information on 700 past customers is given in the file along with data for the following variables:
Age: customer age in years.
Employment: years that the customer has been with his/her current employer
Address: years that the customer has lived at his/her current address
Income: household annual income (in $1,000)
Debt_to_Income: debt-to-income ratio (x100)
Risk_Score: credit risk score (the higher the score, the more risky)
Default_Indicator: an indicator of whether the customer had previously defaulted
Test the hypothesis that the average credit risk score of customers who have previously defaulted on their loans is higher than those who haven’t defaulted on their loans. (Divide the credit risk score into two groups those who previously defaulted and those who didn’t and then compare the two groups.)
You would like to build a regression model to predict the credit risk score based on all the other variables. What is the regression equation?
What is the predicted credit risk score of a customer aged 40 who has been with their employer for 10 years, has lived at the same address for 3 years, has an income of $200,000, has a debt-to-income ratio of 2.5, and has previously defaulted on a loan? Will the prediction be accurate? Explain.
Compute the coefficient of determination (R²) and fully interpret its meaning.
Interpret the meaning of the coefficients of the independent variables income and address.
Which independent variables are significant? Defend your answer.
Age in years | Years with current employer | Years at current address | Household income in thousands | Debt to income ratio (x100) | Previously Defaulted | Credit Risk Score |
41 | 17 | 12 | 176 | 9.3 | 1 | 808.3943274 |
27 | 10 | 6 | 31 | 17.3 | 0 | 198.2974762 |
40 | 15 | 14 | 55 | 5.5 | 0 | 10.0361081 |
41 | 15 | 14 | 120 | 2.9 | 0 | 22.13828376 |
24 | 2 | 0 | 28 | 17.3 | 1 | 781.5883142 |
41 | 5 | 5 | 25 | 10.2 | 0 | 216.7089415 |
39 | 20 | 9 | 67 | 30.6 | 0 | 185.9601084 |
43 | 12 | 11 | 38 | 3.6 | 0 | 14.70865349 |
24 | 3 | 4 | 19 | 24.4 | 1 | 748.0412036 |
36 | 0 | 13 | 25 | 19.7 | 0 | 815.0570131 |
27 | 0 | 1 | 16 | 1.7 | 0 | 350.309226 |
25 | 4 | 0 | 23 | 5.2 | 0 | 239.0539023 |
52 | 24 | 14 | 64 | 10 | 0 | 9.790173473 |
37 | 6 | 9 | 29 | 16.3 | 0 | 364.4940475 |
48 | 22 | 15 | 100 | 9.1 | 0 | 11.87390385 |
36 | 9 | 6 | 49 | 8.6 | 1 | 96.70407786 |
36 | 13 | 6 | 41 | 16.4 | 1 | 212.0503906 |
43 | 23 | 19 | 72 | 7.6 | 0 | 1.404870603 |
39 | 6 | 9 | 61 | 5.7 | 0 | 104.1453903 |
41 | 0 | 21 | 26 | 1.7 | 0 | 91.9180135 |
39 | 22 | 3 | 52 | 3.2 | 0 | 4.373536462 |
47 | 17 | 21 | 43 | 5.6 | 0 | 3.047352362 |
28 | 3 | 6 | 26 | 10 | 0 | 293.9321797 |
29 | 8 | 6 | 27 | 9.8 | 0 | 106.7996198 |
21 | 1 | 2 | 16 | 18 | 1 | 629.7774553 |
25 | 0 | 2 | 32 | 17.6 | 0 | 861.3134014 |
45 | 9 | 26 | 69 | 6.7 | 0 | 16.46115799 |
43 | 25 | 21 | 64 | 16.7 | 0 | 1.437993467 |
33 | 12 | 8 | 58 | 18.4 | 0 | 276.7066755 |
26 | 2 | 1 | 37 | 14.2 | 0 | 503.3218674 |
45 | 3 | 15 | 20 | 2.1 | 0 | 76.41958523 |
30 | 1 | 10 | 22 | 10.5 | 0 | 433.6994251 |
27 | 2 | 7 | 26 | 6 | 0 | 288.7388759 |
25 | 8 | 4 | 27 | 14.4 | 0 | 231.1006843 |
25 | 8 | 1 | 35 | 2.9 | 0 | 74.95719559 |
26 | 6 | 7 | 45 | 26 | 0 | 950.0535168 |
30 | 10 | 4 | 22 | 16.1 | 0 | 211.9564036 |
32 | 12 | 1 | 54 | 14.4 | 0 | 335.9969153 |
28 | 1 | 8 | 24 | 17.1 | 1 | 643.9032953 |
45 | 23 | 5 | 50 | 4.2 | 0 | 2.268753579 |
23 | 7 | 2 | 31 | 6.6 | 0 | 132.8782071 |
34 | 17 | 3 | 59 | 8 | 0 | 31.76854323 |
42 | 7 | 23 | 41 | 4.6 | 0 | 31.90347933 |
39 | 19 | 5 | 48 | 13.1 | 0 | 28.07933138 |
26 | 0 | 0 | 14 | 7.5 | 1 | 511.04996 |
21 | 0 | 1 | 16 | 6.8 | 0 | 453.6168743 |
35 | 13 | 15 | 35 | 4.5 | 0 | 10.78188877 |
47 | 4 | 2 | 26 | 10.4 | 0 | 281.6573725 |
23 | 0 | 2 | 21 | 11.4 | 1 | 621.7847698 |
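The group comparison asked for in the first question can be sketched as a one-sided Welch t-test. This is a minimal illustration, not the expected solution: the helper name `welch_t` is hypothetical, and the data are only the first ten rows of the table above rather than the full 700-customer file, so the resulting statistic is not the answer to the question.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for H1: mean(a) > mean(b), unequal variances."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


# Illustration only: risk scores from the first ten table rows,
# split by the "Previously Defaulted" column.
defaulted = [808.3943274, 781.5883142, 748.0412036]
not_defaulted = [198.2974762, 10.0361081, 22.13828376, 216.7089415,
                 185.9601084, 14.70865349, 815.0570131]

t_stat = welch_t(defaulted, not_defaulted)
print(f"Welch t = {t_stat:.3f}")
```

A large positive statistic supports the claim that defaulters have higher average risk scores; the p-value would come from a t distribution, for which a statistics package is usually used.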
In: Math
Two plots at Rothamsted Experimental Station were studied for production of wheat straw. For a random sample of years, the annual wheat straw production (in pounds) from one plot was as follows.
5.70 | 7.03 | 5.84 | 7.03 | 7.31 | 7.18 |
7.06 | 5.79 | 6.24 | 5.91 | 6.14 |
Use a calculator to verify that, for this plot, the sample
variance is s² ≈ 0.411.
Another random sample of years for a second plot gave the following
annual wheat production (in pounds).
6.19 | 7.59 | 7.66 | 8.15 | 7.22 | 5.58 | 5.47 | 5.86 |
Use a calculator to verify that the sample variance for this
plot is s² ≈ 1.117.
Test the claim that there is a difference (either way) in
the population variance of wheat straw production for these two
plots. Use a 5% level of significance.
(a) What is the level of significance?
State the null and alternate hypotheses.
H0: σ₁² = σ₂²; H1: σ₁² > σ₂²
H0: σ₁² > σ₂²; H1: σ₁² = σ₂²
H0: σ₂² = σ₁²; H1: σ₂² > σ₁²
H0: σ₁² = σ₂²; H1: σ₁² ≠ σ₂²
(b) Find the value of the sample F statistic. (Use
2 decimal places.)
What are the degrees of freedom?
dfN | |
dfD |
What assumptions are you making about the original distribution?
The populations follow dependent normal distributions. We have random samples from each population.
The populations follow independent normal distributions.
The populations follow independent chi-square distributions. We have random samples from each population.
The populations follow independent normal distributions. We have random samples from each population.
(c) Find or estimate the P-value of the sample
test statistic. (Use 4 decimal places.)
p-value > 0.200
0.100 < p-value < 0.200
0.050 < p-value < 0.100
0.020 < p-value < 0.050
0.002 < p-value < 0.020
p-value < 0.002
(d) Based on your answers in parts (a) to (c), will you
reject or fail to reject the null hypothesis?
At the α = 0.05 level, we reject the null hypothesis and conclude the data are not statistically significant.
At the α = 0.05 level, we reject the null hypothesis and conclude the data are statistically significant.
At the α = 0.05 level, we fail to reject the null hypothesis and conclude the data are not statistically significant.
At the α = 0.05 level, we fail to reject the null hypothesis and conclude the data are statistically significant.
(e) Interpret your conclusion in the context of the
application.
Fail to reject the null hypothesis, there is sufficient evidence that the variance in annual wheat production differs between the two plots.
Reject the null hypothesis, there is insufficient evidence that the variance in annual wheat production differs between the two plots.
Reject the null hypothesis, there is sufficient evidence that the variance in annual wheat production differs between the two plots.
Fail to reject the null hypothesis, there is insufficient evidence that the variance in annual wheat production differs between the two plots.
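The two sample variances and the F statistic for the wheat straw problem can be checked with a short sketch. It assumes the common textbook convention of placing the larger sample variance in the numerator; the variable names are illustrative.

```python
from statistics import variance

plot1 = [5.70, 7.03, 5.84, 7.03, 7.31, 7.18, 7.06, 5.79, 6.24, 5.91, 6.14]
plot2 = [6.19, 7.59, 7.66, 8.15, 7.22, 5.58, 5.47, 5.86]

s2_1 = variance(plot1)  # sample variance (n - 1 in the denominator)
s2_2 = variance(plot2)

# Convention assumed here: larger variance goes in the numerator.
if s2_1 >= s2_2:
    f_stat, df_n, df_d = s2_1 / s2_2, len(plot1) - 1, len(plot2) - 1
else:
    f_stat, df_n, df_d = s2_2 / s2_1, len(plot2) - 1, len(plot1) - 1

print(f"s1^2 = {s2_1:.3f}, s2^2 = {s2_2:.3f}")
print(f"F = {f_stat:.2f} with dfN = {df_n}, dfD = {df_d}")
```

The P-value range is then read from an F table (or computed with a statistics package), doubling the one-tail area because the alternative is two-sided.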
In: Math
A mixture of pulverized fuel ash and Portland cement to be used for grouting should have a compressive strength of more than 1300 kN/m². The mixture will not be used unless experimental evidence indicates conclusively that the strength specification has been met. Suppose compressive strength for specimens of this mixture is normally distributed with σ = 66. Let μ denote the true average compressive strength.
(a) What are the appropriate null and alternative hypotheses?
H0: μ = 1300; Ha: μ > 1300
H0: μ > 1300; Ha: μ = 1300
H0: μ = 1300; Ha: μ ≠ 1300
H0: μ < 1300; Ha: μ = 1300
H0: μ = 1300; Ha: μ < 1300
(b) Let X̄ denote the sample average compressive strength for n = 14 randomly selected specimens. Consider the test procedure with test statistic X̄ itself (not standardized). What is the probability distribution of the test statistic when H0 is true?
The test statistic has a gamma distribution.
The test statistic has a normal distribution.
The test statistic has a binomial distribution.
The test statistic has an exponential distribution.
If X̄ = 1340, find the P-value. (Round your answer to four decimal places.)
P-value =
Should H0 be rejected using a significance level of 0.01?
reject H0
do not reject H0
(c) What is the probability distribution of the test statistic when μ = 1350?
The test statistic has an exponential distribution.
The test statistic has a normal distribution.
The test statistic has a gamma distribution.
The test statistic has a binomial distribution.
State the mean and standard deviation of the test statistic. (Round your standard deviation to three decimal places.)
mean | kN/m² |
standard deviation | kN/m² |
For a test with α = 0.01, what is the probability that the mixture will be judged unsatisfactory when in fact μ = 1350 (a type II error)? (Round your answer to four decimal places.)
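The calculations in parts (b) and (c) can be sketched with the standard library's `NormalDist`. The sketch assumes the upper-tailed test H0: μ = 1300 against Ha: μ > 1300, which matches the requirement that the mixture be used only if the evidence shows strength above 1300; variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n = 1300, 66, 14
se = sigma / sqrt(n)  # standard deviation of the sample mean

# Under H0 the sample mean is Normal(mu0, se); upper-tailed P-value for 1340.
null_dist = NormalDist(mu0, se)
p_value = 1 - null_dist.cdf(1340)

# Rejection cutoff for alpha = 0.01, then the type II error at mu = 1350:
# the probability that the sample mean falls below the cutoff.
alpha = 0.01
cutoff = null_dist.inv_cdf(1 - alpha)
beta = NormalDist(1350, se).cdf(cutoff)

print(f"SE = {se:.3f}, P-value = {p_value:.4f}")
print(f"cutoff = {cutoff:.2f}, beta = {beta:.4f}")
```

Because the P-value is compared with α = 0.01, the decision in part (b) follows directly from the printed value.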
In: Math
Test the claim that the proportion of men who own cats is smaller than the proportion of women who own cats at the .10 significance level.
left tailed right tailed two tailed
test statistic
critical value
reject or accept the null
In: Math
A clinical trial was conducted to test the effectiveness of a drug for treating insomnia in older subjects. Before treatment, 20 subjects had a mean wake time of 101.0 min. After treatment, the 20 subjects had a mean wake time of 92.4 min and a standard deviation of 21.1 min. Assume that the 20 sample values appear to be from a normally distributed population and construct a 99% confidence interval estimate of the mean wake time for a population with drug treatments. What does the result suggest about the mean wake time of 101.0 min before the treatment? Does the drug appear to be effective?
Construct the 99% confidence interval estimate of the mean wake time for a population with the treatment.
___ min < μ < ___ min
The confidence interval [first blank] the mean wake time of 101.0 min before the treatment, so the means before and after the treatment [second blank]. This result suggests that the drug treatment [third blank] a significant effect.
First blank: DOES NOT INCLUDE or INCLUDE
Second blank: ARE DIFFERENT or COULD BE THE SAME
Third Blank: HAS or DOES NOT HAVE
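The 99% interval for the insomnia problem can be sketched as follows. The critical value 2.861 is the table value of t for 19 degrees of freedom and 0.005 in the upper tail; it is hard-coded here as an assumption because the Python standard library has no t quantile function.

```python
from math import sqrt

n, x_bar, s = 20, 92.4, 21.1
T_CRIT = 2.861  # assumed table value: t(0.005, df = 19)

margin = T_CRIT * s / sqrt(n)
lo, hi = x_bar - margin, x_bar + margin

print(f"{lo:.1f} min < mu < {hi:.1f} min")
print("Interval includes 101.0:", lo < 101.0 < hi)
```

Whether 101.0 lies inside the interval determines the choices for the three blanks.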
In: Math
Why is a sampling distribution a theoretical distribution?
In: Math
A staff psychologist at a police precinct developed a week-long
training course designed to improve on the job sensitivity of
police officers. The psychologist designs a study where some
policemen randomly get and complete the course. A month later the
psychologist records the number of domestic disputes the police
officers successfully resolved from their police reports. What can
the psychologist conclude with an α of 0.05? The success data are
below.
no course | course |
---|---|
11.2 12.5 10.6 12.7 8.3 15.6 12.1 | 14.8 16.3 14.3 17.4 11.2 16.5 15.4 |
a) What is the appropriate test statistic?
Options: na, z-test, One-Sample t-test, Independent-Samples t-test, Related-Samples t-test
b)
Condition 1 options: no course, police precinct, domestic disputes, completing the course, job sensitivity
Condition 2 options: no course, police precinct, domestic disputes, completing the course, job sensitivity
c) Compute the appropriate test statistic(s) to make a decision about H0.
(Hint: Make sure to write down the null and alternative hypotheses to help solve the problem.)
p-value = ___ ; Decision: Reject H0 or Fail to reject H0
d) Compute the corresponding effect size(s) and indicate magnitude(s).
If not appropriate, input and/or select "na" below.
d = ___ ; Magnitude: na, trivial effect, small effect, medium effect, large effect
r² = ___ ; Magnitude: na, trivial effect, small effect, medium effect, large effect
e) Make an interpretation based on the
results.
Participants that received training had significantly less resolved domestic disputes than those that did not receive training.
Participants that received training had significantly more resolved domestic disputes than those that did not receive training.
Participants that received training did not differ significantly on resolved domestic disputes than those that did not receive training.
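Since different officers make up the two groups, one reasonable reading is an independent-samples t-test with a pooled variance. The sketch below computes the t statistic, Cohen's d, and r² under that assumption; the p-value itself needs a t distribution (df = 12) from a table or a statistics package.

```python
from math import sqrt
from statistics import mean, variance

no_course = [11.2, 12.5, 10.6, 12.7, 8.3, 15.6, 12.1]
course = [14.8, 16.3, 14.3, 17.4, 11.2, 16.5, 15.4]

n1, n2 = len(no_course), len(course)
df = n1 + n2 - 2

# Pooled variance: weighted average of the two sample variances.
pooled_var = ((n1 - 1) * variance(no_course)
              + (n2 - 1) * variance(course)) / df
se = sqrt(pooled_var * (1 / n1 + 1 / n2))

diff = mean(course) - mean(no_course)
t_stat = diff / se
cohens_d = diff / sqrt(pooled_var)          # standardized mean difference
r_squared = t_stat ** 2 / (t_stat ** 2 + df)  # proportion of variance explained

print(f"t({df}) = {t_stat:.3f}, d = {cohens_d:.2f}, r^2 = {r_squared:.3f}")
```

The sign of `diff` shows which group resolved more disputes, which is what part (e) asks about once significance is established.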
In: Math
As part of a long-term study of individuals 65 years of age or older, sociologists and physicians at the Institute of Mental Health Research in Ottawa investigated the relationship between geographic location and depression. A random sample of individuals, all in reasonably good health, was selected from Victoria, BC; Edmonton, Alberta; Winnipeg, Manitoba; and Halifax, Nova Scotia. Each of the individuals sampled was given a standardized test to measure depression. The data collected has been provided to you; a higher test score indicates a higher level of depression. A second part of the study considered the relationship between geographic location and depression for individuals 65 years of age or older who had a chronic health condition such as arthritis, hypertension, and/or heart disease. A sample of individuals, all suffering from a chronic illness, was selected from Victoria, BC; Edmonton, Alberta; Winnipeg, Manitoba; and Halifax, Nova Scotia. Each of the individuals sampled was given a standardized test to measure depression. The data collected has been provided to you; a higher test score indicates a higher level of depression.
1) Use the appropriate descriptive statistics to summarize the data. Provide your preliminary observations.
2) Develop a 95% confidence interval for the average depression score among healthy people and the chronically ill. Discuss your findings.
3) Perform an analysis of variance to test for any significant differences due to location for healthy people. Use a 0.05 level of significance. Comment on your findings.
4) Perform an analysis of variance to test for any significant differences due to location for chronically ill. Use a 0.05 level of significance. Comment on your findings.
Please provide statistics of patients regardless of their health and null and alternative hypothesis for each test.
Healthy | | | | Chronically Ill | | | | |
Victoria | Edmonton | Winnipeg | Halifax | Victoria | Edmonton | Winnipeg | Halifax | |
6.00 | 5.54 | 7.56 | 4.38 | 5.04 | 4.40 | 12.61 | 9.03 | |
5.39 | 7.02 | 9.57 | 6.06 | 5.57 | 4.14 | 8.44 | 3.74 | |
6.47 | 7.56 | 5.87 | 5.41 | 6.76 | 5.00 | 8.67 | 1.45 | |
6.51 | 4.43 | 4.15 | 4.02 | 7.35 | 7.55 | 8.78 | 11.74 | |
1.58 | 8.28 | 10.01 | 4.90 | 10.71 | 8.75 | 6.49 | 4.77 | |
5.55 | 5.19 | 9.27 | 6.76 | 8.90 | 7.00 | 7.79 | 7.80 | |
1.32 | 6.92 | 10.29 | 8.28 | 6.90 | 7.56 | 8.61 | 6.22 | |
1.62 | 7.98 | 4.66 | 4.20 | 4.27 | 9.84 | 8.46 | 6.07 | |
5.30 | 4.23 | 6.59 | 4.80 | 5.25 | 7.76 | 6.80 | 7.85 | |
4.33 | 3.95 | 10.72 | 10.32 | 3.84 | 8.92 | 6.43 | 8.92 | |
3.58 | 3.18 | 6.28 | 6.35 | 7.51 | 10.50 | 8.26 | 4.77 | |
8.08 | 7.48 | 10.83 | 4.33 | 11.80 | 7.95 | 7.92 | 7.75 | |
7.08 | 6.56 | 5.55 | 9.58 | 14.46 | 6.41 | 10.14 | 6.05 | |
6.06 | 8.91 | 4.80 | 6.85 | 7.74 | 10.51 | 7.46 | 6.62 | |
6.46 | 4.34 | 10.48 | 3.86 | 6.12 | 12.23 | 7.77 | 0.21 | |
5.15 | 8.17 | 7.84 | 0.88 | 5.92 | 6.84 | 6.75 | 8.27 | |
4.92 | 3.88 | 6.20 | 1.91 | 10.21 | 4.85 | 11.23 | 7.30 | |
6.49 | 9.27 | 6.31 | 3.14 | 5.80 | 6.66 | 4.55 | 7.66 | |
4.89 | 2.92 | 9.60 | 3.88 | 10.01 | 8.22 | 3.76 | 6.52 | |
5.84 | 6.44 | 6.33 | 2.38 | 11.67 | 5.40 | 4.21 | 1.08 | |
7.13 | 4.81 | 6.60 | 9.83 | 0.95 | 5.94 | 11.39 | 8.24 | |
0.33 | 6.50 | 3.39 | 4.23 | 4.57 | 4.32 | 9.37 | 7.61 | |
8.03 | 7.64 | 5.37 | 1.07 | 5.15 | 3.94 | 7.19 | 9.12 | |
7.27 | 2.70 | 7.17 | 5.93 | 7.57 | 8.73 | 6.21 | 7.52 | |
2.62 | 3.50 | 3.87 | 5.93 | 8.12 | 7.08 | 8.59 | 4.90 | |
5.69 | 7.54 | 6.39 | 4.97 | 13.68 | 8.14 | 3.76 | 4.22 | |
4.30 | 6.47 | 11.04 | 5.06 | 9.28 | 11.81 | 9.76 | 9.31 | |
4.62 | 1.07 | 9.88 | 7.63 | 7.72 | 6.91 | 4.44 | 10.65 |
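The analysis of variance in questions 3 and 4 can be sketched with a small helper. The function name `one_way_anova` is hypothetical, and for brevity the usage below feeds it only the first five rows of the healthy columns, so the printed F is an illustration, not the answer for the full 28-row table.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Between-group sum of squares: group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: observations around their group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Illustration only: first five rows of the healthy columns.
victoria = [6.00, 5.39, 6.47, 6.51, 1.58]
edmonton = [5.54, 7.02, 7.56, 4.43, 8.28]
winnipeg = [7.56, 9.57, 5.87, 4.15, 10.01]
halifax = [4.38, 6.06, 5.41, 4.02, 4.90]

f_stat, df_b, df_w = one_way_anova([victoria, edmonton, winnipeg, halifax])
print(f"F({df_b}, {df_w}) = {f_stat:.3f}")
```

The null hypothesis in each test is that the four city means are equal; the alternative is that at least one differs. The same function applies unchanged to the chronically ill columns.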
In: Math
Perform a hypothesis test for a population mean: Nearly 30 years ago the mean height for women 20 years old and older was 63.7 inches. A recent random sample of 45 women who are 20 years old and older had a mean of 63.9 inches. Perform a hypothesis test on the following hypotheses: Null Hypothesis - the population mean is equal to 63.7 inches; Alternate Hypothesis - the population mean is greater than 63.7 inches. Use a level of significance of .10 or 10%. The standard deviation for the recent random sample of 45 women was .5 inches.
Please show work
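The work for this test can be sketched as a one-sample, right-tailed t-test. The critical value of about 1.301 is the table value of t for 44 degrees of freedom at the 0.10 level; it is hard-coded as an assumption since the standard library has no t quantile function.

```python
from math import sqrt

mu0, x_bar, s, n = 63.7, 63.9, 0.5, 45
T_CRIT = 1.301  # assumed table value: t(0.10, df = 44), right tail

se = s / sqrt(n)
t_stat = (x_bar - mu0) / se
reject_h0 = t_stat > T_CRIT

print(f"t = {t_stat:.3f}, critical value = {T_CRIT}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```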
In: Math
The following table shows ceremonial ranking and type of pottery sherd for a random sample of 434 sherds at an archaeological location.
Ceremonial Ranking | Cooking Jar Sherds | Decorated Jar Sherds (Noncooking) | Row Total |
A | 81 | 54 | 135 |
B | 94 | 51 | 145 |
C | 77 | 77 | 154 |
Column Total | 252 | 182 | 434 |
Use a chi-square test to determine if ceremonial ranking and pottery type are independent at the 0.05 level of significance.
(a) What is the level of significance?
State the null and alternate hypotheses.
H0: Ceremonial ranking and pottery type are not independent. H1: Ceremonial ranking and pottery type are not independent.
H0: Ceremonial ranking and pottery type are not independent. H1: Ceremonial ranking and pottery type are independent.
H0: Ceremonial ranking and pottery type are independent. H1: Ceremonial ranking and pottery type are not independent.
H0: Ceremonial ranking and pottery type are independent. H1: Ceremonial ranking and pottery type are independent.
(b) Find the value of the chi-square statistic for the sample.
(Round the expected frequencies to at least three decimal places.
Round the test statistic to three decimal places.)
Are all the expected frequencies greater than 5?
Yes
No
What sampling distribution will you use?
normal
uniform
binomial
Student's t
chi-square
What are the degrees of freedom?
(c) Find or estimate the P-value of the sample test
statistic. (Round your answer to three decimal places.)
p-value > 0.100
0.050 < p-value < 0.100
0.025 < p-value < 0.050
0.010 < p-value < 0.025
0.005 < p-value < 0.010
p-value < 0.005
(d) Based on your answers in parts (a) to (c), will you reject or
fail to reject the null hypothesis of independence?
Since the P-value > α, we fail to reject the null hypothesis.
Since the P-value > α, we reject the null hypothesis.
Since the P-value ≤ α, we reject the null hypothesis.
Since the P-value ≤ α, we fail to reject the null hypothesis.
(e) Interpret your conclusion in the context of the
application.
At the 5% level of significance, there is sufficient evidence to conclude that ceremonial ranking and pottery type are not independent.
At the 5% level of significance, there is insufficient evidence to conclude that ceremonial ranking and pottery type are not independent.
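The expected frequencies and chi-square statistic for the pottery table can be sketched with plain Python. The P-value line relies on a fact specific to 2 degrees of freedom, where the chi-square upper-tail probability equals exp(-x/2); for other degrees of freedom a table or statistics package would be needed.

```python
from math import exp

# Rows: ceremonial ranking A, B, C. Columns: cooking, decorated.
observed = [[81, 54], [94, 51], [77, 77]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count = (row total * column total) / grand total.
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi2 = sum((o - e) ** 2 / e
           for o_row, e_row in zip(observed, expected)
           for o, e in zip(o_row, e_row))
df = (len(row_totals) - 1) * (len(col_totals) - 1)

# Valid only because df == 2 here.
p_value = exp(-chi2 / 2)

print(f"chi-square = {chi2:.3f}, df = {df}, P-value = {p_value:.3f}")
```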
In: Math
A building contractor buys 70% of his cement from supplier A and 30% from supplier B. A total of 95% of the bags from A arrive undamaged, and a total of 90% of the bags from B arrive undamaged. Find the probability that a damaged bag is from supplier A.
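This is a direct application of Bayes' theorem. The sketch below uses exact fractions; the damage rates of 5% and 10% are simply the complements of the undamaged rates given in the problem.

```python
from fractions import Fraction

p_a, p_b = Fraction(70, 100), Fraction(30, 100)
p_damaged_given_a = 1 - Fraction(95, 100)  # 5% of A's bags are damaged
p_damaged_given_b = 1 - Fraction(90, 100)  # 10% of B's bags are damaged

# Total probability of a damaged bag, then Bayes' theorem.
p_damaged = p_a * p_damaged_given_a + p_b * p_damaged_given_b
p_a_given_damaged = p_a * p_damaged_given_a / p_damaged

print(p_a_given_damaged, float(p_a_given_damaged))
```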
In: Math
One of the major measures of the quality of service provided by any organization is the speed with which it responds to customer complaints. A large family-held department store selling furniture and flooring, including carpet, had undergone a major expansion in the past several years. In particular, the flooring department had expanded from 2 installation crews to an installation supervisor, a measurer, and 15 installation crews. Last year, there were 50 complaints concerning carpet installation. The following data, also in the file FURNITURE, represent the number of days between the receipt of a complaint and the resolution of the complaint:
54 5 35 137 31 27 152 2 123 81 74 27 11 19 126 110 110 29 61 35 94 31 26 5 12 4 165 32 29 28 29 26 25 1 14 13 13 10 5 27 4 52 30 22 36 26 20 23 33 68
a. Construct and interpret a 95% confidence interval estimate of the population mean number of days between the receipt of a complaint and the resolution of the complaint. Use Minitab.
b. What assumption must you make about the population distribution in order to construct the confidence interval estimate in (a)?
c. Do you think that the assumption needed in order to construct the confidence interval estimate in (a) is valid? Explain.
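Although the question asks for Minitab, the interval in part (a) can be cross-checked with a short Python sketch. The critical value 2.0096 is the table value of t for 49 degrees of freedom at 0.025 in each tail, hard-coded as an assumption.

```python
from math import sqrt
from statistics import mean, stdev

days = [54, 5, 35, 137, 31, 27, 152, 2, 123, 81,
        74, 27, 11, 19, 126, 110, 110, 29, 61, 35,
        94, 31, 26, 5, 12, 4, 165, 32, 29, 28,
        29, 26, 25, 1, 14, 13, 13, 10, 5, 27,
        4, 52, 30, 22, 36, 26, 20, 23, 33, 68]

T_CRIT = 2.0096  # assumed table value: t(0.025, df = 49)

n = len(days)
x_bar = mean(days)
s = stdev(days)  # sample standard deviation
margin = T_CRIT * s / sqrt(n)
lo, hi = x_bar - margin, x_bar + margin

print(f"n = {n}, mean = {x_bar:.2f}, s = {s:.3f}")
print(f"95% CI: {lo:.2f} to {hi:.2f} days")
```

For parts (b) and (c), note that the sample standard deviation is nearly as large as the mean while the data cannot go below zero, which points to right skew; with n = 50 the t interval is usually still considered reasonable.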
In: Math
Contrasting the Independent and Dependent Tests
In a brief paragraph, discuss your reasoning for any differences between the two tests in the seven-step process.
In: Math