Sale Price | List Price | Days to Sell | Gulf View |
475 | 495 | 130 | 1 |
350 | 379 | 71 | 1 |
519 | 529 | 85 | 1 |
534.5 | 552.5 | 95 | 1 |
334.9 | 334.9 | 119 | 1 |
505 | 550 | 92 | 1 |
165 | 169.9 | 197 | 1 |
210 | 210 | 56 | 1 |
945 | 975 | 73 | 1 |
314 | 314 | 126 | 1 |
305 | 315 | 88 | 1 |
800 | 885 | 282 | 1 |
975 | 975 | 100 | 1 |
445 | 469 | 56 | 1 |
305 | 329 | 49 | 1 |
330 | 365 | 48 | 1 |
312 | 332 | 88 | 1 |
495 | 520 | 161 | 1 |
405 | 425 | 149 | 1 |
669 | 675 | 142 | 1 |
400 | 409 | 28 | 1 |
649 | 649 | 29 | 1 |
305 | 319 | 140 | 1 |
410 | 425 | 85 | 1 |
340 | 359 | 107 | 1 |
449 | 469 | 72 | 1 |
875 | 895 | 129 | 1 |
430 | 439 | 160 | 1 |
400 | 435 | 206 | 1 |
227 | 235 | 91 | 1 |
618 | 638 | 100 | 1 |
600 | 629 | 97 | 1 |
309 | 329 | 114 | 1 |
555 | 595 | 45 | 1 |
315 | 339 | 150 | 1 |
200 | 215 | 48 | 1 |
375 | 395 | 135 | 1 |
425 | 449 | 53 | 1 |
465 | 499 | 86 | 1 |
428.5 | 439 | 158 | 1 |
217 | 217 | 182 | 0 |
135.5 | 148 | 338 | 0 |
179 | 186.5 | 122 | 0 |
230 | 239 | 150 | 0 |
267.5 | 279 | 169 | 0 |
214 | 215 | 58 | 0 |
259 | 279 | 110 | 0 |
176.5 | 179.9 | 130 | 0 |
144.9 | 149.9 | 149 | 0 |
230 | 235 | 114 | 0 |
192 | 199.8 | 120 | 0 |
195 | 210 | 61 | 0 |
212 | 226 | 146 | 0 |
146.5 | 149.9 | 137 | 0 |
160 | 160 | 281 | 0 |
292.5 | 322 | 63 | 0 |
179 | 187.5 | 48 | 0 |
227 | 247 | 52 | 0 |
Use descriptive statistics to summarize each of the three variables for the 40 Gulf View condos. Describe the distribution of each variable. Repeat for the 18 No Gulf View Condos. (This includes the five-number summary, the mean, standard deviation, and a histogram of each variable). Gulf View condos are denoted by “1” in the Gulf View column, whereas No Gulf View condos are denoted by “0”. Prices are in thousands of dollars.
We will use R for the calculations. Save the table above as a tab-delimited file named 'dat', then import it with the following commands:
library(readr)
dat <- read_delim("Documents/r/dat", "\t", escape_double = FALSE, trim_ws = TRUE)
Note: Remove the spaces from the column names (for example, 'Sale Price' becomes 'SalePrice') so that each column can be referenced with the $ operator without backticks.
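If the file has already been imported with the spaced headers, the names can also be cleaned inside R. The sketch below is only an illustration: it builds a three-row excerpt of the table by hand (rather than reading the file) and strips the spaces with gsub().

```r
# Three rows copied from the table above, keeping the original spaced headers.
# check.names = FALSE stops data.frame() from altering the names itself.
dat <- data.frame(
  "Sale Price"   = c(475, 350, 217),
  "List Price"   = c(495, 379, 217),
  "Days to Sell" = c(130, 71, 182),
  "Gulf View"    = c(1, 1, 0),
  check.names = FALSE
)

# Remove every space, then capitalise "to" so the name matches DaysToSell
names(dat) <- gsub(" ", "", names(dat), fixed = TRUE)
names(dat)[names(dat) == "DaystoSell"] <- "DaysToSell"

names(dat)
```

After this step, expressions such as dat$GulfView and dat$DaysToSell work as written in the rest of the answer.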
The Gulf View condos can be extracted into the data frame 'dat1' with the following commands:
dat1 <- dat[dat$GulfView==1,]
View(dat1)
Now, entering the command
summary(dat1)
returns the five-number summary and the mean of each variable, as shown below.
SalePrice (Thousand $) | ListPrice (Thousand $) | DaysToSell (Days) |
Min. :165.0 | Min. :169.9 | Min. : 28.00 |
1st Qu.:314.8 | 1st Qu.:334.2 | 1st Qu.: 71.75 |
Median :417.5 | Median :437.0 | Median : 96.00 |
Mean :454.2 | Mean :474.0 | Mean :106.00 |
3rd Qu.:522.9 | 3rd Qu.:550.6 | 3rd Qu.:136.25 |
Max. :975.0 | Max. :975.0 | Max. :282.00 |
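The quartiles reported by summary() come from quantile() with its default interpolation rule (type 7), so they can differ slightly from the hinges that some textbooks use in a five-number summary. As a rough illustration, the sketch below compares the two for the 40 Gulf View sale prices, typed in directly from the table.

```r
# Sale prices (thousand $) of the 40 Gulf View condos, in table order
sale_price <- c(
  475, 350, 519, 534.5, 334.9, 505, 165, 210, 945, 314,
  305, 800, 975, 445, 305, 330, 312, 495, 405, 669,
  400, 649, 305, 410, 340, 449, 875, 430, 400, 227,
  618, 600, 309, 555, 315, 200, 375, 425, 465, 428.5
)

# Quartiles as used by summary(): linear interpolation between order statistics
q <- quantile(sale_price, probs = c(0, 0.25, 0.5, 0.75, 1))
q            # 165, 314.75, 417.5, 522.875, 975

# Tukey's five-number summary: min, lower hinge, median, upper hinge, max
f <- fivenum(sale_price)
f            # 165, 314.5, 417.5, 526.75, 975
```

Minimum, median, and maximum agree; only the quartiles differ, and either set is an acceptable answer as long as the method is stated.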
The standard deviations can be found with the commands below; each command is followed by the value it returns:
sd(dat1$SalePrice)
192.5178
sd(dat1$ListPrice)
197.29
sd(dat1$DaysToSell)
52.21602
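Note that sd() returns the sample standard deviation, which divides by n - 1 rather than n. A minimal sketch of the same calculation done by hand, using the Gulf View days-to-sell values typed in from the table:

```r
# Days to sell for the 40 Gulf View condos, in table order
days_to_sell <- c(
  130, 71, 85, 95, 119, 92, 197, 56, 73, 126,
  88, 282, 100, 56, 49, 48, 88, 161, 149, 142,
  28, 29, 140, 85, 107, 72, 129, 160, 206, 91,
  100, 97, 114, 45, 150, 48, 135, 53, 86, 158
)

n  <- length(days_to_sell)
ss <- sum((days_to_sell - mean(days_to_sell))^2)  # sum of squared deviations

manual_sd     <- sqrt(ss / (n - 1))  # sample SD, same formula as sd()
population_sd <- sqrt(ss / n)        # population SD, slightly smaller

manual_sd            # about 52.216, matching sd(dat1$DaysToSell) above
sd(days_to_sell)
population_sd
```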
The histograms can be drawn with the commands below:
hist(dat1$SalePrice)
hist(dat1$ListPrice)
hist(dat1$DaysToSell)
[Histograms of SalePrice, ListPrice, and DaysToSell for the Gulf View condos]
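All three distributions appear positively skewed: in each case the mean exceeds the median (454.2 vs. 417.5, 474.0 vs. 437.0, and 106.0 vs. 96.0), and the maximum lies much farther from the median than the minimum does.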
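The visual impression of skewness can be backed up numerically. The sketch below uses the Gulf View list prices and a small helper, skewness_g1, which is a hypothetical name introduced here (base R has no built-in skewness function). It computes the moment coefficient of skewness, which is positive for a right-skewed sample and zero for a symmetric one.

```r
# List prices (thousand $) of the 40 Gulf View condos, in table order
list_price <- c(
  495, 379, 529, 552.5, 334.9, 550, 169.9, 210, 975, 314,
  315, 885, 975, 469, 329, 365, 332, 520, 425, 675,
  409, 649, 319, 425, 359, 469, 895, 439, 435, 235,
  638, 629, 329, 595, 339, 215, 395, 449, 499, 439
)

# Hypothetical helper: third central moment divided by the
# second central moment raised to the power 3/2
skewness_g1 <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

mean(list_price) > median(list_price)  # TRUE: mean is pulled up by the right tail
skewness_g1(list_price)                # positive for a right-skewed sample
```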
_________________________________________
The No Gulf View condos can be extracted into the data frame 'dat2' with the following commands:
dat2 <- dat[dat$GulfView==0,]
View(dat2)
Now, entering the command
summary(dat2)
returns the five-number summary and the mean of each variable, as shown below.
SalePrice (Thousand $) | ListPrice (Thousand $) | DaysToSell (Days) |
Min. :135.5 | Min. :148.0 | Min. : 48.00 |
1st Qu.:177.1 | 1st Qu.:181.6 | 1st Qu.: 74.75 |
Median :203.5 | Median :212.5 | Median :126.00 |
Mean :203.2 | Mean :212.8 | Mean :135.00 |
3rd Qu.:229.2 | 3rd Qu.:238.0 | 3rd Qu.:149.75 |
Max. :292.5 | Max. :322.0 | Max. :338.00 |
The standard deviations can be found with the commands below; each command is followed by the value it returns:
sd(dat2$SalePrice)
43.89172
sd(dat2$ListPrice)
48.94528
sd(dat2$DaysToSell)
76.29972
The histograms can be drawn with the commands below:
hist(dat2$SalePrice)
hist(dat2$ListPrice)
hist(dat2$DaysToSell)
[Histograms of SalePrice, ListPrice, and DaysToSell for the No Gulf View condos]
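With only 18 observations, it can help to look at the bin counts behind a histogram as well as the picture. As a sketch, the code below computes the default bins for the No Gulf View days-to-sell values without drawing them, tabulates the counts, and then plots the same object with labels (the title and axis label are illustrative choices).

```r
# Days to sell for the 18 No Gulf View condos, in table order
days_to_sell <- c(182, 338, 122, 150, 169, 58, 110, 130, 149,
                  114, 120, 61, 146, 137, 281, 63, 48, 52)

# plot = FALSE returns the histogram object instead of drawing it
h <- hist(days_to_sell, plot = FALSE)

# One row per bin: lower edge, upper edge, number of condos
bins <- data.frame(
  lower = head(h$breaks, -1),
  upper = tail(h$breaks, -1),
  count = h$counts
)
bins

# Draw the same histogram with descriptive labels
plot(h, main = "Days to Sell, No Gulf View", xlab = "Days to sell")
```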
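The sale price and list price distributions are roughly symmetric (the means, 203.2 and 212.8, nearly equal the medians, 203.5 and 212.5), whereas the days-to-sell distribution is positively skewed (mean 135.0 versus median 126.0, with a maximum of 338).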
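As an alternative to creating dat1 and dat2 separately, the same statistics can be computed for both groups in one pass with tapply(). The sketch below does this for the sale price only, with the data typed in from the table; the data frame name condos is an assumption made for this example.

```r
# Sale prices (thousand $): 40 Gulf View condos followed by 18 No Gulf View condos
sale_price <- c(
  475, 350, 519, 534.5, 334.9, 505, 165, 210, 945, 314,
  305, 800, 975, 445, 305, 330, 312, 495, 405, 669,
  400, 649, 305, 410, 340, 449, 875, 430, 400, 227,
  618, 600, 309, 555, 315, 200, 375, 425, 465, 428.5,
  217, 135.5, 179, 230, 267.5, 214, 259, 176.5, 144.9,
  230, 192, 195, 212, 146.5, 160, 292.5, 179, 227
)
condos <- data.frame(
  SalePrice = sale_price,
  GulfView  = rep(c(1, 0), times = c(40, 18))
)

# Apply each statistic within the two GulfView groups ("0" and "1")
group_mean   <- tapply(condos$SalePrice, condos$GulfView, mean)
group_median <- tapply(condos$SalePrice, condos$GulfView, median)
group_sd     <- tapply(condos$SalePrice, condos$GulfView, sd)

# Columns are the groups 0 (No Gulf View) and 1 (Gulf View)
round(rbind(mean = group_mean, median = group_median, sd = group_sd), 1)
```

Laid out this way, the comparison is immediate: Gulf View condos sell for roughly twice as much on average, and their prices are far more spread out.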