Question

The file banking.txt attached to this assignment provides data acquired from banking and census records
for different zip codes in the bank’s current market. Such information can be useful in targeting
advertising for new customers or for choosing locations for branch offices. The data show
- median age of the population (AGE)
- median income (INCOME) in $
- average bank balance (BALANCE) in $
- median years of education (EDUCATION)

e) Use R to fit a regression model to predict balance from age, education and income. Analyze the
model parameters. Which predictors have a significant effect on balance? Use the t-tests on the
parameters for alpha=.05. [2 pts = 1 pt R code + 1 pt answer]
f) If one of the predictors is not significant, remove it from the model and refit the new regression
model. Write the expression of the fitted regression model. [2 pts = 1 pt R code + 1 pt answer]
g) Interpret the value of the parameters for the variables in the model. [1 pt]
h) Report the value of the R² coefficient and describe what it indicates. [1 pt]
i) According to census data, the population for a certain zip code area has median age equal to 34.8
years, median education equal to 12.5 years and median income equal to $42,401.
- Use the final model computed in point (f) to compute the predicted average balance for the zip
code area. [1 pt]
- If the observed average balance for the zip code area is $21,572, what’s the model prediction
error? [1 pt]
j) Conduct a global F-test for overall model adequacy. Write down the test hypotheses and test statistic
and discuss conclusions.

Age Education Income Balance

35.9 14.8 91033 38517

37.7 13.8 86748 40618

36.8 13.8 72245 35206

35.3 13.2 70639 33434

35.3 13.2 64879 28162

34.8 13.7 75591 36708

39.3 14.4 80615 38766

36.6 13.9 76507 34811

35.7 16.1 107935 41032

40.5 15.1 82557 41742

37.9 14.2 58294 29950

43.1 15.8 88041 51107

37.7 12.9 64597 34936

36 13.1 64894 32387

40.4 16.1 61091 32150

33.8 13.6 76771 37996

36.4 13.5 55609 24672

37.7 12.8 74091 37603

36.2 12.9 53713 26785

39.1 12.7 60262 32576

39.4 16.1 111548 56569

36.1 12.8 48600 26144

35.3 12.7 51419 24558

37.5 12.8 51182 23584

34.4 12.8 60753 26773

33.7 13.8 64601 27877

40.4 13.2 62164 28507

38.9 12.7 46607 27096

34.3 12.7 61446 28018

38.7 12.8 62024 31283

33.4 12.6 54986 24671

35 12.7 48182 25280

38.1 12.7 47388 24890

34.9 12.5 55273 26114

36.1 12.9 53892 27570

32.7 12.6 47923 20826

37.1 12.5 46176 23858

23.5 13.6 33088 20834

38 13.6 53890 26542

33.6 12.7 57390 27396

41.7 13 48439 31054

36.6 14.1 56803 29198

34.9 12.4 52392 24650

36.7 12.8 48631 23610

38.4 12.5 52500 29706

34.8 12.5 42401 21572

33.6 12.7 64792 32677

37 14.1 59842 29347

34.4 12.7 65625 29127

37.2 12.5 54044 27753

35.7 12.6 39707 21345

37.8 12.9 45286 28174

35.6 12.8 37784 19125

35.7 12.4 52284 29763

34.3 12.4 42944 22275

39.8 13.4 46036 27005

36.2 12.3 50357 24076

35.1 12.3 45521 23293

35.6 16.1 30418 16854

40.7 12.7 52500 28867

33.5 12.5 41795 21556

37.5 12.5 66667 31758

37.6 12.9 38596 17939

39.1 12.6 44286 22579

33.1 12.2 37287 19343

36.4 12.9 38184 21534

37.3 12.5 47119 22357

38.7 13.6 44520 25276

36.9 12.7 52838 23077

32.7 12.3 34688 20082

36.1 12.4 31770 15912

39.5 12.8 32994 21145

36.5 12.3 33891 18340

32.9 12.4 37813 19196

29.9 12.3 46528 21798

32.1 12.3 30319 13677

36.1 13.3 36492 20572

35.9 12.4 51818 26242

32.7 12.2 35625 17077

37.2 12.6 36789 20020

38.8 12.3 42750 25385

37.5 13 30412 20463

36.4 12.5 37083 21670

42.4 12.6 31563 15961

19.5 16.1 15395 5956

30.5 12.8 21433 11380

33.2 12.3 31250 18959

36.7 12.5 31344 16100

32.4 12.6 29733 14620

36.5 12.4 41607 22340

33.9 12.1 32813 26405

29.6 12.1 29375 13693

37.5 11.1 34896 20586

34 12.6 20578 14095

28.7 12.1 32574 14393

36.1 12.2 30589 16352

30.6 12.3 26565 17410

22.8 12.3 16590 10436

30.3 12.2 9354 9904

22 12 14115 9071

30.8 11.9 17992 10679

35.1 11 7741 6207

Expert Solution

e) The following R code fits the regression model; its output is shown below.

----------R command ---------

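# Read the whitespace-delimited data; the first row holds the column names
data <- read.table("banking.txt", header = TRUE)
# Full model: Balance regressed on Age, Education and Income
model1 <- lm(Balance ~ Age + Education + Income, data = data)
summary(model1)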

---------- Output ---------

Call:
lm(formula = Balance ~ Age + Education + Income, data = data)

Residuals:
    Min      1Q  Median      3Q     Max
-7722.0 -1547.4   -56.1  1167.9  8480.2

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -9.540e+03  4.423e+03  -2.157   0.0335 *
Age          3.325e+02  7.234e+01   4.597 1.28e-05 ***
Education    2.887e+02  3.005e+02   0.960   0.3392
Income       3.871e-01  1.748e-02  22.137  < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 2458 on 98 degrees of freedom
Multiple R-squared: 0.9225, Adjusted R-squared: 0.9201
F-statistic: 388.8 on 3 and 98 DF, p-value: < 2.2e-16

-----------

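From the output, the fitted full model can be written as

$$ \widehat{\text{Balance}} = -9540 + 332.5\,\text{Age} + 288.7\,\text{Education} + 0.3871\,\text{Income} $$

At $\alpha = 0.05$, the t-tests in the output show that $\beta_{0} = -9540$ (p = 0.0335), $\beta_{1} = 332.5$ (Age, p = 1.28e-05) and $\beta_{3} = 0.3871$ (Income, p < 2e-16) are significant, while $\beta_{2} = 288.7$ (Education, p = 0.3392) is not. Age and Income therefore have a significant effect on balance; Education does not.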
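As a rough check on the t-tests, the sketch below recomputes each t statistic as estimate divided by standard error and derives the two-sided p-value on 98 residual degrees of freedom. The numbers are the rounded values copied from the summary output above, so the results agree with R's only up to rounding; the vector names `est` and `se` are illustrative and not part of the original solution.

```r
# Rounded estimates and standard errors copied from the full-model output
est <- c(Intercept = -9540, Age = 332.5, Education = 288.7, Income = 0.3871)
se  <- c(Intercept = 4423,  Age = 72.34, Education = 300.5, Income = 0.01748)
df_resid <- 98    # residual degrees of freedom reported by summary()
alpha    <- 0.05  # significance level required by the question

# t statistic = estimate / standard error; two-sided p-value from the t distribution
t_stat <- est / se
p_val  <- 2 * pt(abs(t_stat), df = df_resid, lower.tail = FALSE)

# TRUE where the null hypothesis beta = 0 is rejected at level alpha
significant <- p_val < alpha
print(round(t_stat, 3))
print(significant)
```

Only Education should come out as not significant, which is the basis for dropping it in the next step.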

-------------------------------------------------------------------

f) Since $\beta_{2}$ (Education) is not significant, we remove Education and refit the model with Age and Income only:

----------R command ---------
# Reduced (final) model without Education
model_final <- lm(Balance ~ Age + Income, data = data)
summary(model_final)

----------Output ---------

Call:
lm(formula = Balance ~ Age + Income, data = data)

Residuals:
    Min      1Q  Median      3Q     Max
-7385.1 -1577.9  -119.2  1200.6  8362.9

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -5.912e+03  2.301e+03  -2.570   0.0117 *
Age          3.227e+02  7.159e+01   4.508  1.8e-05 ***
Income       3.966e-01  1.437e-02  27.600  < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 2457 on 99 degrees of freedom
Multiple R-squared: 0.9218, Adjusted R-squared: 0.9202
F-statistic: 583.2 on 2 and 99 DF, p-value: < 2.2e-16

-----
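Reading the coefficients off the output above (rounded as printed by R), the fitted model asked for in (f) can be written as:

$$ \widehat{\text{Balance}} = -5912 + 322.7\,\text{Age} + 0.3966\,\text{Income} $$

For (g), this suggests the following reading. Holding median income fixed, each additional year of median age is associated with an average balance about 322.7 dollars higher. Holding median age fixed, each additional dollar of median income is associated with an average balance about 0.40 dollars higher. The intercept of -5912 is the fitted value at Age = 0 and Income = 0, which lies far outside the observed data, so it has no practical interpretation on its own.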

Now let's answer the remaining questions.
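For (h), the final-model output reports a Multiple R-squared of 0.9218, meaning that about 92.2% of the variation in average balance across zip codes is explained by Age and Income together. As a small sketch, the adjusted value can be reproduced from the reported R-squared. It assumes n = 102 observations (99 residual degrees of freedom plus 3 estimated coefficients) and k = 2 predictors; the variable names are illustrative.

```r
# Values taken from the final-model summary output
r2 <- 0.9218  # Multiple R-squared
n  <- 102     # observations: 99 residual df + 3 estimated coefficients
k  <- 2       # predictors in the final model (Age, Income)

# Adjusted R-squared penalizes R-squared for the number of predictors
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - k - 1)
print(round(adj_r2, 4))
```

The result should agree with the Adjusted R-squared of 0.9202 printed by R.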

i) For this part we use the R function predict() with the values given in the question (median age 34.8, median education 12.5, median income 42401):

predict(model1, data.frame(Age=34.8, Education=12.5, Income=42401))
predict(model_final, data.frame(Age=34.8, Income=42401))
prediction_error <- 21572 - predict(model_final, data.frame(Age=34.8, Income=42401))
print(prediction_error)
---------------------------

Using the first (full) model, the predicted average balance is approximately 22050.6; using the second (final) model, it is approximately 22135.2.

With an observed average balance of $21,572 for this zip code area, the prediction error of the final model is observed value - predicted value = 21572 - 22135.2, approximately -563.2. In other words, the final model overestimates the average balance by about $563.
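For (j), the global F-test for the final model tests H0: beta_Age = beta_Income = 0 against Ha: at least one of the two coefficients is nonzero. The output reports F = 583.2 on 2 and 99 degrees of freedom with p-value < 2.2e-16, so H0 is rejected at alpha = 0.05 and the model as a whole is useful for predicting balance. The sketch below rebuilds the statistic from the reported R-squared, so it matches R's value only up to rounding; the variable names are illustrative.

```r
# Values taken from the final-model summary output
r2 <- 0.9218  # Multiple R-squared
n  <- 102     # observations: 99 residual df + 3 estimated coefficients
k  <- 2       # predictors in the final model (Age, Income)

# Global F statistic expressed through R-squared
f_stat <- (r2 / k) / ((1 - r2) / (n - k - 1))

# Critical value at alpha = 0.05 and the upper-tail p-value
f_crit  <- qf(0.95, df1 = k, df2 = n - k - 1)
p_value <- pf(f_stat, df1 = k, df2 = n - k - 1, lower.tail = FALSE)

print(round(f_stat, 1))
print(f_stat > f_crit)  # TRUE means H0 is rejected
```

The reconstructed statistic lands close to the reported 583.2 and far above the critical value, which is consistent with rejecting H0.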

milcah answered 6 months ago
