Question

13a) Compute z-scores for the Sale Price variable. Do you note any outliers?

13b) Is there a relationship between Lot Size and the home's Age in years? What test do you perform and why? Now check for whether there is a difference in Lot Size for older versus younger homes (using a cutoff that makes sense). What test do you perform and why?

Home ID  Sale Price  Lot Size  Age  Central Air  Living Area  Full Baths  Half Baths  Bedrooms  Fireplaces
1        320000      0.61      6    Yes          2492         2           1           4         2
2        215000      0.63      21   Yes          1792         1           1           3         0
3        125000      0.35      22   No           1040         1           0           3         0
4        158900      0.21      3    No           1292         2           0           3         1
5        82000       0.17      21   No           1412         2           1           2         0
6        219200      1.08      30   No           1735         1           1           4         1
7        125000      0.16      133  No           852          1           1           2         0
8        110000      0.15      18   No           988          1           1           2         0
9        179000      0.6       29   No           2128         1           1           3         1
10       264900      0.55      16   Yes          1897         2           1           4         1
11       208000      0.12      3    No           1242         2           0           2         0
12       126000      0.74      47   No           1200         1           0           3         1
13       164700      0.16      25   Yes          1602         1           1           2         1
14       339000      0.68      38   No           2132         1           0           3         0
15       150000      0.08      68   No           1392         1           0           3         0

Solutions

Expert Solution

13a)

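We are given the sale price of each home and need to compute its z-score.

The z-score of an observation x is

$$z = \frac{x - \bar{x}}{s}$$

where \bar{x} is the mean and s is the standard deviation of the sale prices. First we compute the mean and the standard deviation.

Mean of the Sale Price variable:

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = 185766.7$$

Standard deviation of the Sale Price variable (population form, dividing by n):

$$s = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2} = 73107.45$$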

The z-scores for Sale Price are then:

Home ID  Sale Price  Z-score
1        320000      1.8361
2        215000      0.3999
3        125000      -0.8312
4        158900      -0.3675
5        82000       -1.4194
6        219000      0.4546
7        125000      -0.8312
8        110000      -1.0364
9        179000      -0.0926
10       264900      1.0824
11       208000      0.3041
12       126000      -0.8175
13       164700      -0.2882
14       339000      2.0960
15       150000      -0.4892

A common rule treats an observation as an outlier when its z-score lies more than 3 standard deviations from the mean, that is, when |z| > 3.

We do not observe any outliers in this data set. The largest z-score in absolute value is 2.0960 (home 14, sale price 339000), which is within 3 standard deviations of the mean.
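The z-score calculation above can be sketched in Python as a quick check. This is only an illustration under two assumptions: home 6 is entered as 219000, as in the z-score table (the question's data table lists 219200), and the standard deviation divides by n, which is the form that reproduces 73107.45. The variable names are made up for this sketch.

```python
from math import sqrt

# Sale prices in Home ID order. Assumption: home 6 is 219000, as in the
# z-score table of the solution (the question's table lists 219200).
sale_prices = [320000, 215000, 125000, 158900, 82000, 219000, 125000, 110000,
               179000, 264900, 208000, 126000, 164700, 339000, 150000]

n = len(sale_prices)
mean = sum(sale_prices) / n

# Population standard deviation (divide by n, not n - 1).
sd = sqrt(sum((x - mean) ** 2 for x in sale_prices) / n)

z_scores = [(x - mean) / sd for x in sale_prices]

# Flag any home whose z-score is more than 3 standard deviations from the mean.
outliers = [home_id
            for home_id, z in enumerate(z_scores, start=1)
            if abs(z) > 3]

for home_id, (price, z) in enumerate(zip(sale_prices, z_scores), start=1):
    print(f"{home_id:>2}  {price:>7}  {z:8.4f}")
print("Homes with |z| > 3:", outliers)
```

Using the sample standard deviation (dividing by n - 1) would shrink every z-score slightly, so it would not change the conclusion that no home exceeds the cutoff of 3.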

13b)

To find the relationship between Lot Size and the home's Age, we compute the correlation between the two variables.

x = Lot Size, y = Age

The correlation coefficient is computed with the formula

$$r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \, \sum (y_i - \bar{y})^2}} = -0.1418$$

The linear relationship between the two variables is very weak.

A scatter plot shows the same thing: the points are scattered without any clear pattern, indicating a very weak relationship between Lot Size and Age.
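As a rough check of the correlation, the Pearson coefficient can be computed directly from the Lot Size and Age columns. The helper name `pearson_r` is made up for this sketch and only the standard library is used.

```python
from math import sqrt

# Lot Size and Age in Home ID order, copied from the data table.
lot_size = [0.61, 0.63, 0.35, 0.21, 0.17, 1.08, 0.16, 0.15,
            0.6, 0.55, 0.12, 0.74, 0.16, 0.68, 0.08]
age = [6, 21, 22, 3, 21, 30, 133, 18, 29, 16, 3, 47, 25, 38, 68]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


r = pearson_r(lot_size, age)
print(f"r = {r:.4f}")
```

The result should agree with the value quoted above up to rounding in the fourth decimal place.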

To test the relationship between the two variables, we conduct a hypothesis test for the correlation coefficient. We use this test because the correlation coefficient measures the linear relationship between two quantitative variables. The test decides between a correlation of zero (no linear relationship between the variables) and a nonzero correlation (a linear relationship exists).

The hypotheses are

Null hypothesis: Correlation = 0

Alternative hypothesis: Correlation ≠ 0

The test statistic is

$$t = \frac{r\sqrt{n - 2}}{\sqrt{1 - r^2}} = -0.5168$$

Here degrees of freedom = n - 2 = 15 - 2 = 13.

The two-tailed critical value of t at the 0.05 level of significance with df = 13 is -2.160 or +2.160.

Decision rule: reject the null hypothesis if |t| > critical value.

Here |-0.5168| = 0.5168 < 2.160, so we fail to reject the null hypothesis.

The data do not provide evidence that the correlation differs from 0.

So we conclude that there may be no linear relationship between Lot Size and the home's Age.
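The test for the correlation coefficient can be sketched as follows. It assumes r = -0.1418 and n = 15 from the text, and it hard-codes the two-tailed 5% critical value for 13 degrees of freedom (2.160, from a standard t table) instead of computing it. Because r is rounded to four decimals here, the statistic may differ from the value above in the last digits.

```python
from math import sqrt

r = -0.1418   # correlation from the text, rounded to four decimals
n = 15        # number of homes

df = n - 2
# t statistic for testing whether the correlation is zero.
t_stat = r * sqrt(df) / sqrt(1 - r ** 2)

# Two-tailed critical value at the 0.05 level for df = 13, taken from a
# t table (hard-coded assumption, not computed here).
T_CRIT = 2.160

# Two-tailed rule: reject only when the statistic is large in absolute value.
reject_null = abs(t_stat) > T_CRIT

print(f"t = {t_stat:.4f}, df = {df}, reject null hypothesis: {reject_null}")
```

Comparing the absolute value of t with the critical value covers both tails at once, which avoids sign mistakes when the statistic is negative.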

Next we check whether Lot Size differs between younger and older homes.

We use a cutoff of 25 years, which splits the 15 homes into two groups of similar size (8 and 7).

Homes younger than 25 years are treated as younger homes.

Homes aged 25 years or more are treated as older homes.

This gives two groups of lot sizes:

Lot size by age group:

Younger homes  Older homes
0.12           0.16
0.21           0.6
0.61           1.08
0.55           0.68
0.15           0.74
0.17           0.08
0.63           0.16
0.35
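The grouping can be reproduced with a short sketch. It assumes that a home aged exactly 25 years counts as older, which matches the two columns above (home 13, age 25, appears among the older homes). The names `homes` and `CUTOFF` are made up for this illustration.

```python
# (age in years, lot size) for each home, in Home ID order.
homes = [(6, 0.61), (21, 0.63), (22, 0.35), (3, 0.21), (21, 0.17),
         (30, 1.08), (133, 0.16), (18, 0.15), (29, 0.6), (16, 0.55),
         (3, 0.12), (47, 0.74), (25, 0.16), (38, 0.68), (68, 0.08)]

CUTOFF = 25  # years; assumption: age 25 itself falls in the older group

younger = [lot for age, lot in homes if age < CUTOFF]
older = [lot for age, lot in homes if age >= CUTOFF]

print("Younger homes:", younger)
print("Older homes:  ", older)
```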

Looking at the data, the lot sizes in the two groups overlap considerably, so no clear difference is apparent.

To check further, we perform an independent two-sample t-test for the difference in means. The two groups consist of different homes, so the samples are independent, and the test tells us whether the difference in mean lot size between the two age groups is significant.

Null hypothesis: There is no difference between the mean lot size of younger homes and the mean lot size of older homes.

Alternative hypothesis: There is a difference between the mean lot size of younger homes and the mean lot size of older homes.

The test statistic, using the pooled variance, is

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}, \qquad s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

From the given data:

              Younger homes  Older homes
Mean          0.3487         0.5
Variance      0.0472         0.1406
Observations  8              7

Hence the pooled variance is Sp^2 = 0.0903.

Substituting into the test statistic gives

t = -0.9721

Here degrees of freedom = n1 + n2 - 2 = 8 + 7 - 2 = 13.

At the 0.05 level of significance with df = 13, the two-tailed critical value from the t table is -2.160 or +2.160.

Decision rule: reject the null hypothesis if |t| > critical value.

Here |-0.9721| = 0.9721 < 2.160, so we fail to reject the null hypothesis.

We conclude that there may be no difference in mean lot size between younger homes and older homes.
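The pooled two-sample t-test can be checked with the sketch below. It assumes equal variances in the two groups (which is what using a pooled variance implies) and hard-codes the two-tailed 5% critical value for 13 degrees of freedom, 2.160, from a t table. Only the standard library is used.

```python
from math import sqrt
from statistics import mean, variance

# Lot sizes of the two age groups, copied from the grouping above.
younger = [0.12, 0.21, 0.61, 0.55, 0.15, 0.17, 0.63, 0.35]
older = [0.16, 0.6, 1.08, 0.68, 0.74, 0.08, 0.16]

n1, n2 = len(younger), len(older)
m1, m2 = mean(younger), mean(older)
v1, v2 = variance(younger), variance(older)  # sample variances (n - 1)

# Pooled variance, assuming both groups share a common variance.
df = n1 + n2 - 2
pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / df

t_stat = (m1 - m2) / sqrt(pooled_var * (1 / n1 + 1 / n2))

# Two-tailed critical value at the 0.05 level for df = 13, from a t table
# (hard-coded assumption, not computed here).
T_CRIT = 2.160
reject_null = abs(t_stat) > T_CRIT

print(f"means: {m1:.4f} vs {m2:.4f}")
print(f"pooled variance = {pooled_var:.4f}, t = {t_stat:.4f}, df = {df}")
print("reject null hypothesis:", reject_null)
```

If the equal-variance assumption were in doubt (the older group's variance is about three times the younger group's), Welch's unequal-variance t-test would be a possible alternative, though it is not what the solution above uses.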

