Question

A study investigated the relationship between audit delay (Delay), the length of time from a company's fiscal year-end to the date of the auditor's report, and variables that describe the client and the auditor. The independent variables are as follows.

Industry: A dummy variable coded 1 if the firm was an industrial company or 0 if the firm was a bank, savings and loan, or insurance company.

Public: A dummy variable coded 1 if the company was traded on an organized exchange or over the counter; otherwise coded 0.

Quality: A measure of overall quality of internal controls, as judged by the auditor, on a five-point scale ranging from "virtually none" (1) to "excellent" (5).

Finished: A measure ranging from 1 to 4, as judged by the auditor, where 1 indicates "all work performed subsequent to year-end" and 4 indicates "most work performed prior to year-end."

A sample of 40 companies provided the following data.

Delay Industry Public Quality Finished
62 0 0 3 1
45 0 1 3 3
54 0 0 2 2
71 0 1 1 2
91 0 0 1 1
62 0 0 4 4
61 0 0 3 2
69 0 1 5 2
80 0 0 1 1
52 0 0 5 3
47 0 0 3 2
65 0 1 2 3
60 0 0 1 3
81 1 0 1 2
73 1 0 2 2
89 1 0 2 1
71 1 0 5 4
76 1 0 2 2
68 1 0 1 2
68 1 0 5 2
86 1 0 2 2
76 1 1 3 1
67 1 0 2 3
57 1 0 4 2
55 1 1 3 2
54 1 0 5 2
69 1 0 3 3
82 1 0 5 1
94 1 0 1 1
74 1 1 5 2
75 1 1 4 3
69 1 0 2 2
71 1 0 4 4
79 1 0 5 2
80 1 0 1 4
91 1 0 4 1
92 1 0 1 4
46 1 1 4 3
72 1 0 5 2
85 1 0 5 1

Enter negative values as negative, if necessary.

a. Develop the estimated regression equation using all four independent variables (to 3 decimals, if necessary). Delay = -------- + -------- Industry + -------- Public + -------- Quality + -------- Finished.

b. What is the value of the coefficient of determination (to 3 decimals)? Note: report R² between 0 and 1.

Did the estimated regression equation in part (a) provide a good fit?

c. Which of the following is a scatter diagram for showing Delay as a function of Finished? What does this scatter diagram indicate about the relationship between Delay and Finished?

The scatter diagram of Delay and Finished suggests -------- exists between these two variables. Add Finished-Squared as a fifth independent variable. Use the best-subsets regression procedure to answer the following questions.

Which independent variables provide the best regression model if two independent variables are in the model?

Which independent variables provide the best regression model if three independent variables are in the model?

d. Using the best-subsets regression procedure, how many independent variables are in the model with the highest adjusted R²?

What is the value of R²(adj) (to 1 decimal)? Note: report R²(adj) as a percentage.

------------%

Solutions

Expert Solution

a) The regression model has the form: Delay = β₀ + β₁ Industry + β₂ Public + β₃ Quality + β₄ Finished

b) We fit the model in R. Copy the data from Excel to the clipboard, then run the following commands:

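# Read the copied table from the clipboard (clipboard support is platform-dependent)
data <- read.table("clipboard", header = TRUE)
data
# Fit Delay on all four independent variables
model <- lm(Delay ~ ., data = data)
model
summary(model)  # coefficients, R-squared, and adjusted R-squared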

This gives the following result.

Estimated regression equation: Delay = 80.429 + 11.944 Industry - 4.816 Public - 2.624 Quality - 4.073 Finished

The coefficient of determination is R² = 0.3826 (0.383 to 3 decimals).

Because the model explains only about 38% of the variation in Delay, the estimated regression equation does not provide a good fit.
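Reading from the clipboard only works where R has clipboard access, so it can be convenient to type the 40 rows in directly. The sketch below does that and refits the four-variable model so that the figures quoted above can be checked. The names audit and fit_full are introduced here for illustration and do not appear in the original solution.

```r
# The 40 observations from the question, entered column by column.
audit <- data.frame(
  Delay = c(62, 45, 54, 71, 91, 62, 61, 69, 80, 52, 47, 65, 60,
            81, 73, 89, 71, 76, 68, 68, 86, 76, 67, 57, 55, 54, 69,
            82, 94, 74, 75, 69, 71, 79, 80, 91, 92, 46, 72, 85),
  Industry = rep(c(0, 1), times = c(13, 27)),
  Public = c(0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0,
             0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0,
             0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0),
  Quality = c(3, 3, 2, 1, 1, 4, 3, 5, 1, 5, 3, 2, 1,
              1, 2, 2, 5, 2, 1, 5, 2, 3, 2, 4, 3, 5, 3,
              5, 1, 5, 4, 2, 4, 5, 1, 4, 1, 4, 5, 5),
  Finished = c(1, 3, 2, 2, 1, 4, 2, 2, 1, 3, 2, 3, 3,
               2, 2, 1, 4, 2, 2, 2, 2, 1, 3, 2, 2, 2, 3,
               1, 1, 2, 3, 2, 4, 2, 4, 1, 4, 3, 2, 1)
)

# Fit Delay on all four independent variables.
fit_full <- lm(Delay ~ Industry + Public + Quality + Finished, data = audit)
fit_summary <- summary(fit_full)

# Report the quantities asked for in parts (a) and (b).
print(round(coef(fit_full), 3))
cat("R-squared:", round(fit_summary$r.squared, 3), "\n")
cat("Adjusted R-squared:", round(fit_summary$adj.r.squared, 3), "\n")
```

Naming the four predictors explicitly, rather than using Delay ~ ., keeps the formula unchanged if extra columns are added to the data frame later.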

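c) Scatterplot of Delay (vertical axis) against Finished (horizontal axis):

plot(Delay ~ Finished, data = data)  # x = Finished, y = Delay

The plot shows a negative association overall, but the pattern is not linear: in the sample data the mean Delay falls from about 83 at Finished = 1 to about 60 at Finished = 3, then rises to about 75 at Finished = 4. This suggests a curvilinear relationship between Delay and Finished.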
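The shape of the relationship between Delay and Finished can also be checked numerically by averaging Delay within each level of Finished. The sketch below uses only the Delay and Finished columns from the question; the names audit and means are introduced here for illustration.

```r
# Delay and Finished for the 40 companies, in the order given in the question.
audit <- data.frame(
  Delay = c(62, 45, 54, 71, 91, 62, 61, 69, 80, 52, 47, 65, 60,
            81, 73, 89, 71, 76, 68, 68, 86, 76, 67, 57, 55, 54, 69,
            82, 94, 74, 75, 69, 71, 79, 80, 91, 92, 46, 72, 85),
  Finished = c(1, 3, 2, 2, 1, 4, 2, 2, 1, 3, 2, 3, 3,
               2, 2, 1, 4, 2, 2, 2, 2, 1, 3, 2, 2, 2, 3,
               1, 1, 2, 3, 2, 4, 2, 4, 1, 4, 3, 2, 1)
)

# Mean Delay at each level of Finished (1 to 4).
means <- tapply(audit$Delay, audit$Finished, mean)
print(round(means, 1))
#    1    2    3    4
# 83.3 67.4 59.9 75.2

# Scatterplot with the group means overlaid as connected filled points.
plot(Delay ~ Finished, data = audit, xaxt = "n")
axis(1, at = 1:4)
lines(as.numeric(names(means)), means, type = "b", pch = 19)
```

The means fall and then rise again at Finished = 4, which is the kind of pattern a Finished-Squared term is meant to capture.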

Which independent variables provide the best regression model if two independent variables are in the model?

=> Dropping Public and Finished, the two variables with the fewest significance stars in the full-model summary, gives the following model in R:

# Two-variable model: Industry and Quality
model1 <- lm(Delay ~ Industry + Quality, data = data)
model1
summary(model1)

R² = 0.2689, adjusted R² = 0.2293

This R² is lower than that of the full four-variable model (0.3826).

Which independent variables provide the best regression model if three independent variables are in the model?

=> Dropping only Public, which has no significance star in the full-model summary, gives the following model in R:

# Three-variable model: Industry, Quality, and Finished
model2 <- lm(Delay ~ Industry + Quality + Finished, data = data)
model2
summary(model2)

R² = 0.3597, adjusted R² = 0.3063

This R² is also lower than that of the full four-variable model (0.3826).
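The adjusted R² values reported by summary() follow from R², the sample size n = 40, and the number of independent variables k through R²(adj) = 1 - (1 - R²)(n - 1)/(n - k - 1). The short sketch below reproduces the two adjusted values quoted above; adj_r2 is a helper name introduced here for illustration.

```r
# Adjusted R-squared from R-squared, sample size n, and number of predictors k.
adj_r2 <- function(r2, n, k) {
  1 - (1 - r2) * (n - 1) / (n - k - 1)
}

# Two-variable model (Industry + Quality): R-squared = 0.2689
two_var <- adj_r2(0.2689, n = 40, k = 2)

# Three-variable model (Industry + Quality + Finished): R-squared = 0.3597
three_var <- adj_r2(0.3597, n = 40, k = 3)

print(round(c(two_var, three_var), 3))
```

Because the penalty grows with k, adjusted R² can fall when a weak variable is added even though R² itself never decreases.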

Fitting each independent variable on its own gives the following adjusted R² values.

# Industry only
model <- lm(Delay ~ Industry, data = data)
model
summary(model)

Adjusted R² = 0.137

# Quality only
model <- lm(Delay ~ Quality, data = data)
model
summary(model)

Adjusted R² = 0.04

# Finished only
model <- lm(Delay ~ Finished, data = data)
model
summary(model)

Adjusted R² = 0.07

Among the single-variable models, Industry has the highest adjusted R², at 13.7%.
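Parts (c) and (d) ask for a best-subsets search that includes Finished-Squared as a fifth variable, a step the models above do not cover. One way to carry it out in base R is to fit every non-empty subset of the five predictors and compare R² within each subset size and adjusted R² overall. This is a brute-force sketch, not necessarily the procedure the textbook software uses; the names audit, FinishedSq, fit_subset, results, and best_by_size are introduced here for illustration.

```r
# The 40 observations from the question, entered column by column.
audit <- data.frame(
  Delay = c(62, 45, 54, 71, 91, 62, 61, 69, 80, 52, 47, 65, 60,
            81, 73, 89, 71, 76, 68, 68, 86, 76, 67, 57, 55, 54, 69,
            82, 94, 74, 75, 69, 71, 79, 80, 91, 92, 46, 72, 85),
  Industry = rep(c(0, 1), times = c(13, 27)),
  Public = c(0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0,
             0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0,
             0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0),
  Quality = c(3, 3, 2, 1, 1, 4, 3, 5, 1, 5, 3, 2, 1,
              1, 2, 2, 5, 2, 1, 5, 2, 3, 2, 4, 3, 5, 3,
              5, 1, 5, 4, 2, 4, 5, 1, 4, 1, 4, 5, 5),
  Finished = c(1, 3, 2, 2, 1, 4, 2, 2, 1, 3, 2, 3, 3,
               2, 2, 1, 4, 2, 2, 2, 2, 1, 3, 2, 2, 2, 3,
               1, 1, 2, 3, 2, 4, 2, 4, 1, 4, 3, 2, 1)
)

# Fifth independent variable requested in part (c).
audit$FinishedSq <- audit$Finished^2
predictors <- c("Industry", "Public", "Quality", "Finished", "FinishedSq")

# Fit one subset of predictors and return its size, R-squared, and adjusted R-squared.
fit_subset <- function(vars) {
  fit <- summary(lm(reformulate(vars, response = "Delay"), data = audit))
  data.frame(size = length(vars), vars = paste(vars, collapse = " + "),
             r2 = fit$r.squared, adj_r2 = fit$adj.r.squared)
}

# Enumerate all 31 non-empty subsets of the five predictors.
subsets <- unlist(lapply(seq_along(predictors), function(k) {
  combn(predictors, k, simplify = FALSE)
}), recursive = FALSE)
results <- do.call(rbind, lapply(subsets, fit_subset))

# Best model of each size (highest R-squared among models with the same size).
best_by_size <- do.call(rbind, lapply(split(results, results$size), function(d) {
  d[which.max(d$r2), ]
}))
print(best_by_size)

# Model with the highest adjusted R-squared over all subsets (part d).
print(results[which.max(results$adj_r2), ])
```

The rows of best_by_size for sizes 2 and 3 answer the two questions in part (c), and the final printed row gives the number of variables and the R²(adj) value asked for in part (d).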

