Data on all residential home sales in Ames, Iowa between 2006 and 2010. The data set contains many explanatory variables on the quantity of physical attributes of residential homes in Iowa sold between 2006 and 2010. Most of the variables describe information a typical home buyer would like to know about a property (square footage, number of bedrooms, size of the lot, etc.).
Now use the Lot.Area variable.
a. Use the summary command to see descriptive statistics of Lot.Area.
b. What are the mean and median of Lot.Area? Comment on them.
c. Get the histogram of Lot.Area. What do you observe?
d. Create new data set for Lot.Area where Lot.Area is than 20000.
e. Get the histogram of the new data for Lot.Area. What do you see? How is it distributed?
(I need this as R script commands.)
> df$LotArea
[1] 8450 9600 11250 9550 14260 14115 10084 10382 6120 7420 11200
11924 12968 10652 10920 6120 11241 10791 13695 7560 14215 7449
9742
[24] 4224 8246 14230 7200 11478 16321 6324 8500 8544 11049 10552
7313 13418 10859 8532 7922 6040 8658 16905 9180 9200 7945
7658
[47] 12822 11096 4456 7742 13869 6240 8472 50271 7134 10175 2645
11645 13682 7200 13072 7200 6442 10300 9375 9591 19900 10665
4608
[70] 15593 13651 7599 10141 10200 5790 1596 8475 8635 10778 10440
13000 4500 10206 8892 8530 16059 11911 3951 8470 8070 7200
8500
[93] 13360 7200 9337 9765 10264 10921 10625 9320 10603 9206 7018
10402 7758 9375 10800 6000 8500 11751 9525 7750 9965 21000
7259
[116] 3230 11616 8536 12376 8461 21453 6060 9464 7892 17043 6780
4928 4388 7590 8973 14200 12224 7388 6853 10335 10400 10355
11070
[139] 9066 15426 10500 11645 8520 10335 9100 2522 6120 9505 7500
6240 10356 13891 14803 13500 11340 9600 7200 12003 12552 19378
11120
[162] 13688 12182 5500 5400 10106 10708 10562 8244 16669 12358
31770 5306 10197 12416 12615 10029 13650 17423 8520 2117 7588 9060
11426
[185] 7438 22950 9947 10410 7018 4923 10570 7472 9017 2522 7180
2280 9416 25419 5520 9591 8546 10125 7000 4438 3500 11851
13673
[208] 12493 14364 8250 5604 10420 8640 13568 10900 10011 8450 9906
15660 3010 8990 8068 11475 10500 13472 1680 9950 1869 8521
3182
[231] 8760 15138 1680 10650 7851 1680 8773 9453 12030 8741 9000
3880 5000 10762 8880 10400 9142 11310 11317 159000 5350 4750
8366
[254] 9350 8400 8738 8791 8814 12435 12702 19296 9588 8471 5500
5232 12090 11207 8400 6900 7917 10728 39104 11764 9600 8314
7264
[277] 9196 19138 14450 10005 11287 7200 5063 9612 8012 4251 9786
8125 9819 8730 15611 5687 11409 16659 9600 7937 13710 7399
11700
[300] 14000 15750 16226 13704 9800 18386 10386 13474 7920 12342
12378 7685 8000 7800 215245 9600 7795 13005 9000 9900 14115 16259
12099
[323] 10380 5820 11275 5000 10846 11600 11888 6402 10624 8176 10655
8198 9042 164660 14157 9135 14145 12400 14191 8400 8544 8849
2592
[346] 6435 12772 17600 2448 20431 7820 5271 9084 8520 8400 11249
9248 4224 6930 12011 7540 9144 7301 1680 18800 10690 9500
9150
[369] 7800 9830 8121 17120 7175 10634 8200 10020 8846 11143 11394
8123 5000 7200 9245 9000 53107 3182 8410 7200 9382 12474 8405
[392] 12209 8339 7446 10134 9571 7200 7590 8967 8125 14963 8767
10200 12090 10364 9991 10480 15576 14154 10800 9571 34650 4403
8960
[415] 11228 8899 7844 22420 8160 8450 7060 16635 21750 9200 9000
3378 12800 8593 6762 11457 1680 5586 1920 10839 1890 10667
4400
[438] 6000 4280 12354 15431 12108 6240 3922 8750 9855 16492 11214
8600 6000 5684 70761 9303 9000 9297 9600 4571 53227 5100 7015
[461] 8004 7200 8281 11988 8430 3072 10628 9480 11428 9291 6820
11952 3675 14977 5330 8480 13125 13693 10637 5925 16033 11846
2500
[484] 4500 7758 9600 10289 12243 10800 1526 2665 9490 15578 7931
5784 7879 12692 9120 7800 7535 1890 9803 9170 15602 2308 7596
[507] 9554 7862 9600 9600 14559 6792 9100 9187 10594 12220 10448
10208 9531 10918 10800 11988 5000 40094 11787 7500 13300 14948
9098
[530] 32668 10200 6155 7200 5000 9056 7000 8924 12735 11553 11423
14601 11000 10140 4058 17104 13837 8737 7244 8235 9375 4043
6000
[553] 11146 8777 10625 6380 14850 11040 21872 3196 11341 10010
13907 21780 13346 6858 11198 10171 12327 7032 13101 7332 13159 9967
10500
[576] 8480 6292 11777 3604 12150 14585 12704 11841 13500 6120 11443
10267 8740 25095 9100 8320 13478 6600 4435 7990 11302 3600
3922
[599] 12984 1950 10927 9000 10041 3182 12803 13600 12464 7800 12168
7943 11050 10395 11885 8402 1491 8800 7861 7227 11694 12244
8248
[622] 10800 7064 2117 10400 10000 12342 9600 11606 9020 9000 4590
11900 9250 6979 10896 6120 6000 8777 3982 12677 7050 13860
10793
[645] 9187 10530 7200 10452 7700 1936 8125 9084 8750 10320 10437
1680 10007 7200 17503 9937 12384 46589 13560 10012 20896 11194
18450
[668] 8125 14175 11600 8633 6629 11250 14442 9200 2289 9600 9022
11844 9945 8012 4500 2887 11248 16770 5062 10207 5105 8089
7577
[691] 4426 21535 26178 5400 6120 13811 6000 6420 8450 4282 14331
9600 12438 7630 8400 5600 115149 6240 9018 7162 4130 8712
4671
[714] 9873 13517 10140 10800 10000 10542 9920 6563 4426 8120 8172
13286 6960 21695 7314 11475 6240 5389 9590 11404 10000 8978
10800
[737] 8544 10463 10800 9313 9600 6768 8450 12886 5395 8963 8795
11700 10593 8405 8800 7750 9236 10240 7930 3230 10769 11616
2280
[760] 12257 9100 6911 8640 9430 9549 14587 10421 12508 9100 53504
7252 8877 7819 10150 14226 4500 11210 13350 8400 10530 7875
7153
[783] 16285 9101 6300 9790 10800 10142 6000 12205 3182 11333 9920
9158 10832 8400 8197 7677 13518 7200 12798 4800 8199 13891
9000
[806] 12274 9750 21384 13400 8100 10140 4438 8712 9750 8248 12137
11425 13265 8816 6371 7226 6000 12394 9900 11216 14803 6130
8529
[829] 28698 2544 11900 3180 9548 10004 7875 9600 8100 1680 9525
11767 12155 10440 9020 8000 12665 16647 9317 15523 45600 9600
4435
[852] 3196 7128 12095 17920 6897 10970 8125 10400 11029 7642 11625
9672 7931 8640 8750 10656 6970 14762 9938 6600 8750 8892
12144
[875] 5720 9000 25286 8834 11782 7000 7024 13758 9636 6204 7150
5119 8393 16466 15865 12160 8064 11184 8414 13284 7018 7056
8765
[898] 7018 12919 6993 7340 8712 7875 14859 6173 9920 13501 11500
8885 12589 11600 9286 6120 6270 3000 2001 9000 17140 13125
11029
[921] 8462 8777 10237 8012 10240 15611 11999 9900 11838 13006 8925
9100 11670 8487 27650 5825 10083 9675 8760 24090 12640 8755
7711
[944] 25000 14375 8820 8163 14536 14006 9360 7200 7800 7200 11075
9400 7136 1300 7420 8450 2572 7207 12227 2308 11923 11316
10237
[967] 9600 7390 5925 10382 10800 2268 7892 11639 11414 2651 5900
4274 9450 8816 12122 12203 3182 11250 10125 10880 5310 10159
12046
[990] 8125 9452 17671 9760 8846 12456 4712 10659 11717 9786 6762
10206 5400 11957 11500 3182 8385 12155 2217 12118 6000 21286
9825
[1013] 10592 7200 11664 8400 11883 5814 10784 3013 7024 7406 9439
3182 15498 7700 9300 9520 9492 1680 7082 15863 14541 8125
6305
[1036] 11500 12898 9240 1533 1477 13125 9130 5381 11839 9600 13680
16056 9245 21750 11100 8993 11175 9500 8562 11367 11361 7052
29959
[1059] 11308 11275 4920 18000 13600 6000 11000 14000 7837 9760 3964
9600 10152 11700 7585 7950 8556 13125 10800 15870 4435 8775
11040
[1082] 7500 8749 8800 13031 9069 1974 10574 2522 3316 8544 2160
8400 9230 5868 9317 6882 3696 6000 11880 8400 9758 7000 8910
[1105] 2016 12256 10357 23257 8063 11362 8000 10480 7100 8923 5400
12085 7750 9764 13825 7560 8263 10084 8926 9405 9125 10434
3684
[1128] 14572 11796 7200 7804 10712 9900 9828 8773 6180 9600 6342
9819 8731 7350 10304 9965 9000 12180 6240 11200 12000 5700
9000
[1151] 8280 17755 14115 5890 13700 10768 9350 5001 11932 9120 2280
14778 8724 12900 16157 9541 10475 10852 13728 35760 9880 9120
4017
[1174] 18030 16560 10678 6951 3950 7681 8335 11170 5587 15623 10800
35133 9738 10615 12461 8935 7500 32463 2645 9600 4500 9364
8029
[1197] 14054 8850 9100 11235 9353 10400 6000 9750 10140 14684 8900
9135 7763 10182 11218 12134 9340 10246 10205 7094 8930 8640
6240
[1220] 1680 7800 8250 10496 10680 15384 10482 14598 8872 8769 7910
18890 7728 9842 12160 8525 13132 2628 12393 13072 9037 8158
9849
[1243] 10625 13891 11435 12090 8125 12328 9600 7200 11160 3136 9858
17542 6931 6240 14303 4060 9587 9750 24682 9600 11250 13515
4060
[1266] 3735 10120 13214 14100 11344 23595 9156 13526 11512 5362
11345 12936 17871 9473 7500 9808 8049 8800 9400 9638 6000 9790
36500
[1289] 5664 11065 14112 1680 6600 10140 8172 8400 8700 3675 63887
7500 10762 7500 10120 8688 3363 13173 6955 8072 12000 7153
17500
[1312] 8814 9572 14774 8190 11075 10226 4230 14781 10215 8400 6627
10186 5330 9986 3636 4270 6600 10440 9084 10000 10780 8877
7200
[1335] 2368 9650 9246 4118 13450 9560 8294 13695 9375 7558 11103
6000 20781 15306 16196 5250 11643 9247 6000 14720 10316 10192
9477
[1358] 12537 2117 16737 9842 16158 12513 8499 3180 7500 9179 2665
4435 10635 5400 9600 9750 11400 10625 10991 6292 10998 1953
9735
[1381] 8212 12925 7200 25339 9060 5436 16692 8520 14892 6000 9100
8944 7838 10800 4045 12665 57200 6120 7200 6171 6000 7415
6762
[1404] 15256 10410 3842 8445 8780 7740 20544 12420 9600 7200 10994
13053 3635 11340 16545 9204 16381 11700 4043 4435 19690 9503
10721
[1427] 10944 10930 7200 12546 21930 4928 10800 10261 17400 8400
9000 12444 7407 11584 11526 4426 11003 8854 8500 8400 26142 10000
11767
[1450] 1533 9000 9262 3675 17217 7500 7917 13175 9042 9717 9937
Create a csv file for the above data. Put all the data in the same column named 'LotArea'. (Hint: eliminate the '[ ]' index terms with find and replace, replace the spaces so that each value ends up in its own row of the column, and save the result as a csv file.)
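If you would rather not do the find and replace by hand, the same clean-up can be sketched in R. This is only a minimal sketch: `parse_console_vector` is a hypothetical helper name, the sample lines are the first rows of the printout above, and the file is written to a temporary folder as an assumption, so change the path to wherever you want the csv.

```r
# Hypothetical helper: turn pasted console output into a one-column data frame.
parse_console_vector <- function(lines) {
  # Drop index markers such as "[24]" that R prints at the start of a row.
  cleaned <- gsub("\\[[0-9]+\\]", " ", lines)
  # Split on runs of whitespace and discard empty tokens.
  tokens <- unlist(strsplit(cleaned, "[[:space:]]+"))
  tokens <- tokens[nzchar(tokens)]
  data.frame(LotArea = as.numeric(tokens))
}

# Small sample copied from the first rows of the printout above.
sample_lines <- c(
  "[1] 8450 9600 11250 9550 14260 14115 10084 10382 6120 7420 11200",
  "11924 12968 10652 10920 6120 11241 10791 13695 7560 14215 7449",
  "9742",
  "[24] 4224 8246 14230 7200 11478 16321 6324 8500 8544 11049 10552"
)
lot <- parse_console_vector(sample_lines)

# Assumed output location; one value per row under the header LotArea.
csv_path <- file.path(tempdir(), "LotArea.csv")
write.csv(lot, csv_path, row.names = FALSE)
```

To process the whole printout, paste it into a text file and pass `readLines("your_file.txt")` (file name assumed) to the helper instead of `sample_lines`.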
#/******************************************************R-Script***********************************************#
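# Load the csv file created above (opens a file picker).
df <- read.csv(file.choose())
# a, b. Descriptive statistics, mean and median of LotArea.
summary(df$LotArea)
mean(df$LotArea)
median(df$LotArea)
# c. Histogram of all lot areas.
hist(df$LotArea)
# d, e. Subsets above and below 20000; drop = FALSE keeps a data frame even when df has a single column.
df_greater_than_20000 <- df[which(df$LotArea > 20000), , drop = FALSE]
hist(df_greater_than_20000$LotArea)
df_less_than_20000 <- df[which(df$LotArea < 20000), , drop = FALSE]
hist(df_less_than_20000$LotArea)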
#*************************************************************Complete**************************************************#
Note: Part 'd' does not say whether the condition is 'greater than' or 'less than' 20000, so I have created a data set for both.
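One thing to watch when subsetting: if the csv has only the single LotArea column, row indexing with `df[rows, ]` collapses the result to a plain vector, and `$LotArea` no longer works on it. A small sketch of the difference, using a toy data frame built from a few values in the printout above (the name `as_vector` is just for illustration):

```r
# Toy one-column data frame using a few values from the printout above.
df <- data.frame(LotArea = c(8450, 9600, 50271, 21000, 19900, 215245))

# Without drop = FALSE a one-column data frame collapses to a plain vector.
as_vector <- df[which(df$LotArea > 20000), ]
is.data.frame(as_vector)   # FALSE

# drop = FALSE keeps the data frame, so $LotArea still works afterwards.
df_greater_than_20000 <- df[which(df$LotArea > 20000), , drop = FALSE]
df_less_than_20000    <- df[which(df$LotArea < 20000), , drop = FALSE]

nrow(df_greater_than_20000)  # 3 rows: 50271, 21000, 215245
nrow(df_less_than_20000)     # 3 rows: 8450, 9600, 19900
```

With a data frame that has several columns (such as the full Ames data), the result stays a data frame either way, so `drop = FALSE` is harmless there.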
Explanation and Output:
> summary(df$LotArea)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   1300    7554    9478   10517   11602  215245
>
> mean(df$LotArea)
[1] 10516.83
>
> median(df$LotArea)
[1] 9478.5
Answer: The median (9478.5) is less than the mean (10516.83), which indicates positively (right) skewed data: a few very large lots pull the mean upward.
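If you want a number to go with the mean-versus-median comparison, one option is a moment-based skewness coefficient. Base R has no built-in skewness function, so the `skewness` helper below is a hypothetical one written for this sketch, and the vector `x` is just a handful of values from the printout above, not the full data.

```r
# Moment-based sample skewness (hypothetical helper, not a base R function).
skewness <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

# A few lot areas from the printout above, including one very large lot.
x <- c(8450, 9600, 11250, 9550, 14260, 50271)

mean(x) > median(x)   # TRUE: the large lot pulls the mean above the median
skewness(x) > 0       # TRUE: a positive value means a longer right tail
```

On the real data you would call `skewness(df$LotArea)`; a value well above 0 supports the same conclusion as the mean being larger than the median.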
>
> hist(df$LotArea)
Answer: The histogram confirms the positive skew: almost all of the frequency is concentrated in the leftmost bins, with a long, thin tail to the right.
> df_greater_than_20000 <- df[which(df$LotArea > 20000), , drop = FALSE]
>
> hist(df_greater_than_20000$LotArea)
>
> df_less_than_20000 <- df[which(df$LotArea < 20000), , drop = FALSE]
>
> hist(df_less_than_20000$LotArea)
Answer: Once the lots above 20000 are removed, the data are found to be approximately symmetrical.
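To read the shape off as numbers instead of a picture, `hist()` can be called with `plot = FALSE` and its bin counts inspected. The sketch below uses only a few values from the printout above and an assumed bin width of 5000, so it shows the mechanics and says nothing about the real distribution.

```r
# A few lot areas from the printout above, two of them above 20000.
x <- c(8450, 9600, 11250, 9550, 14260, 14115, 10084,
       10382, 6120, 7420, 11200, 50271, 21000)

# Keep only the lots below 20000, as in part d.
below <- x[x < 20000]

# Compute the histogram without drawing it; bin width 5000 is an assumption.
h <- hist(below, breaks = seq(0, 20000, by = 5000), plot = FALSE)

h$breaks   # bin edges: 0, 5000, 10000, 15000, 20000
h$counts   # number of lots falling in each bin
```

Counts that rise to a peak in the middle bins and fall off similarly on both sides are what a roughly symmetrical histogram looks like; on the real data use `df_less_than_20000$LotArea` in place of `below`.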
You can look for data on search engines as well.