Question

The file Stat8_prob3.txt contains data on 100 Tarrant County houses (in 1900), including value (VALUE), size in square feet (SIZE), a physical condition index (CONDITION), and a depreciation factor (DEPRECIATION).

(a) Fit the model to predict VALUE using SIZE, CONDITION, and DEPRECIATION as the predictor variables.

(b) Plot the residuals e_i against the fitted values ŷ_i. What departures from the regression model assumptions can you see?

(c) If any of the assumptions in part (b) have been violated, suggest a possible transformation. (Hint: Apply the Box-Cox transformation.)

(d) Fit the new model and report the new fitted regression line.

(e) Calculate the new residuals and fitted values and use the appropriate plot(s) to comment on whether the assumption of part (b) is now satisfied.
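
For reference, the model in part (a) and the Box-Cox family named in the hint of part (c) are commonly written as below. The symbols β_j, ε_i, and λ are notation introduced here for illustration, not taken from the problem statement.

```latex
\begin{aligned}
\text{VALUE}_i &= \beta_0 + \beta_1\,\text{SIZE}_i + \beta_2\,\text{CONDITION}_i
                 + \beta_3\,\text{DEPRECIATION}_i + \varepsilon_i, \\[4pt]
y_i^{(\lambda)} &=
\begin{cases}
\dfrac{y_i^{\lambda} - 1}{\lambda}, & \lambda \neq 0, \\[6pt]
\ln y_i, & \lambda = 0.
\end{cases}
\end{aligned}
```

Here y_i stands for VALUE_i, which must be positive for the transformation to be defined.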

Here is the data file Stat8_prob3.txt:

"VALUE","SIZE","DEPRECIATION","CONDITION"
23974,1442,.4,0
24087,1426,.4,0
16781,1632,.5,0
29061,910,.5,.18
37982,972,.55,.18
29433,912,.55,.18
33624,1400,.45,.05
27032,1087,.45,.18
28653,1139,.45,.18
33075,1386,.55,.05
17474,756,.5,.05
33852,1044,.5,.07
29046,1032,.5,0
20715,720,.55,0
19461,734,.5,0
21377,720,.5,0
52881,1635,.6,.02
43889,1381,.55,.02
45134,1372,.55,.02
47655,1349,.6,.02
53088,1599,.6,.02
38923,1171,.5,.02
57870,1966,.55,.02
30489,1504,.45,0
29207,1296,.35,0
44919,1356,.55,.12
48090,1553,.55,.1
40521,1142,.55,.1
43403,1268,.55,.1
38112,1008,.55,.1
27710,1120,.5,0
27621,960,.6,0
22258,920,.35,0
29064,1259,.5,0
12001,783,.4,0
37650,1874,.35,.02
27930,1242,.5,0
16066,772,.4,0
20411,908,.45,0
23672,1155,.45,0
24215,1004,.5,0
22020,958,.45,0
52863,1828,.6,.02
41822,1146,.6,.02
45104,1368,.6,.02
28154,1392,.65,.24
20943,1058,.65,.24
17851,1375,.55,.26
16616,648,.4,.06
38752,1313,.5,0
44377,1780,.55,0
43566,1148,.55,.32
38950,1363,.55,.32
44633,1262,.55,.32
12372,840,.35,0
12148,840,.4,0
19852,839,.5,0
20012,852,.55,0
20314,852,.55,0
22814,974,.55,0
24696,1135,.5,0
23443,1170,.7,.02
35904,960,.5,0
21799,1052,.5,0
28212,1296,.55,0
27553,1282,.55,0
15826,916,.35,0
18660,864,.5,0
21536,1404,.4,0
24147,1676,.4,0
17867,1131,.4,0
21583,1397,.4,0
15482,888,.4,0
24857,1448,.45,0
17716,1022,.45,0
224182,2251,.75,.04
182012,1126,.55,.04
201597,2617,.9,.03
49683,966,.6,.05
60647,1469,.65,.05
49024,1322,.7,.02
52092,1509,.65,.02
55645,1724,.65,.04
51919,1559,.65,.02
55174,2133,.55,0
48760,1233,.55,0
45906,1323,.55,0
52013,1733,.55,0
56612,1357,.6,0
69197,1234,.6,.17
84416,1434,.6,.15
60962,1384,.55,.17
47359,995,.55,.05
56302,1372,.65,.14
88285,1774,.7,.06
91862,1903,.7,.08
242690,3581,.8,.07
296251,4343,.8,.04
107132,1861,.75,.08
77797,1542,.65,.3

Expert Solution

## first R code

## now output
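
The code and output for this answer are not shown above. As a rough sketch of the workflow that parts (a) through (e) describe, the following Python (standard library only, swapped in for R) fits the least-squares model, returns fitted values and residuals, and picks a Box-Cox power by a grid search over the profile log-likelihood. All function names here (`solve`, `fit_ols`, `box_cox`, `box_cox_loglik`, `best_lambda`, `load_houses`) and the grid from -2 to 2 are illustrative assumptions, and no numerical results for the data set are claimed.

```python
import csv
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def fit_ols(y, X):
    """Least-squares fit of y on the columns of X plus an intercept.

    Predictors are centered and scaled internally for numerical stability.
    Returns ([intercept, slopes...], fitted values, residuals).
    """
    n, p = len(y), len(X[0])
    y_bar = sum(y) / n
    x_bar = [sum(r[j] for r in X) / n for j in range(p)]
    scale = [math.sqrt(sum((r[j] - x_bar[j]) ** 2 for r in X)) for j in range(p)]
    Z = [[(r[j] - x_bar[j]) / scale[j] for j in range(p)] for r in X]
    yc = [v - y_bar for v in y]
    ztz = [[sum(z[i] * z[j] for z in Z) for j in range(p)] for i in range(p)]
    zty = [sum(z[i] * v for z, v in zip(Z, yc)) for i in range(p)]
    gamma = solve(ztz, zty)                      # slopes on the scaled columns
    slopes = [g / s for g, s in zip(gamma, scale)]
    intercept = y_bar - sum(b * m for b, m in zip(slopes, x_bar))
    fitted = [y_bar + sum(g * zj for g, zj in zip(gamma, z)) for z in Z]
    resid = [v - f for v, f in zip(y, fitted)]
    return [intercept] + slopes, fitted, resid

def box_cox(y, lam):
    """Box-Cox transform of positive values; lam = 0 gives the natural log."""
    if abs(lam) < 1e-12:
        return [math.log(v) for v in y]
    return [(v ** lam - 1.0) / lam for v in y]

def box_cox_loglik(y, X, lam):
    """Profile log-likelihood of lam, up to an additive constant."""
    n = len(y)
    _, _, resid = fit_ols(box_cox(y, lam), X)
    rss = sum(e * e for e in resid)
    return -0.5 * n * math.log(rss / n) + (lam - 1.0) * sum(math.log(v) for v in y)

def best_lambda(y, X, grid=None):
    """Grid search for the lam that maximizes the profile log-likelihood."""
    if grid is None:
        grid = [k / 10.0 for k in range(-20, 21)]  # -2.0, -1.9, ..., 2.0 (assumed range)
    return max(grid, key=lambda lam: box_cox_loglik(y, X, lam))

def load_houses(path):
    """Read the CSV file and return (VALUE list, [SIZE, CONDITION, DEPRECIATION] rows)."""
    with open(path, newline="") as f:
        data = list(csv.DictReader(f))
    y = [float(d["VALUE"]) for d in data]
    X = [[float(d["SIZE"]), float(d["CONDITION"]), float(d["DEPRECIATION"])] for d in data]
    return y, X
```

With the data saved as Stat8_prob3.txt, `y, X = load_houses("Stat8_prob3.txt")` followed by `beta, fitted, resid = fit_ols(y, X)` gives the coefficients for part (a) and the two series to plot in part (b). `best_lambda(y, X)` then suggests a power for part (c), and refitting with `fit_ols(box_cox(y, lam), X)` yields the new coefficients, fitted values, and residuals for parts (d) and (e). Plotting is left to whatever tool is at hand.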

orchestra answered 3 years ago
