Question


A statistical program is recommended.

A study investigated the relationship between audit delay (Delay), the length of time from a company's fiscal year-end to the date of the auditor's report, and variables that describe the client and the auditor. Some of the independent variables that were included in this study follow.

Industry: A dummy variable coded 1 if the firm was an industrial company or 0 if the firm was a bank, savings and loan, or insurance company.
Public: A dummy variable coded 1 if the company was traded on an organized exchange or over the counter; otherwise coded 0.
Quality: A measure of overall quality of internal controls, as judged by the auditor, on a five-point scale ranging from "virtually none" (1) to "excellent" (5).
Finished: A measure ranging from 1 to 4, as judged by the auditor, where 1 indicates "all work performed subsequent to year-end" and 4 indicates "most work performed prior to year-end."

A sample of 40 companies provided the following data.

Delay Industry Public Quality Finished
62 0 0 3 1
45 0 1 3 3
54 0 0 2 2
71 0 1 1 2
91 0 0 1 1
62 0 0 4 4
61 0 0 3 2
69 0 1 5 2
80 0 0 1 1
52 0 0 5 3
47 0 0 3 2
65 0 1 2 3
60 0 0 1 3
81 1 0 1 2
73 1 0 2 2
89 1 0 2 1
71 1 0 5 4
76 1 0 2 2
68 1 0 1 2
68 1 0 5 2
86 1 0 2 2
76 1 1 3 1
67 1 0 2 3
57 1 0 4 2
55 1 1 3 2
54 1 0 5 2
69 1 0 3 3
82 1 0 5 1
94 1 0 1 1
74 1 1 5 2
75 1 1 4 3
69 1 0 2 2
71 1 0 4 4
79 1 0 5 2
80 1 0 1 4
91 1 0 4 1
92 1 0 1 4
46 1 1 4 3
72 1 0 5 2
85 1 0 5 1

(a)  Develop the estimated regression equation using all of the independent variables. Use x1 for Industry, x2 for Public, x3 for Quality, and x4 for Finished. (Round your numerical values to two decimal places.)

ŷ =

(c)  Develop a scatter diagram showing Delay as a function of Finished.

On the basis of your observations about the relationship between Delay and Finished, use best-subsets regression to develop an alternative estimated regression equation to the one developed in (a) to explain as much of the variability in Delay as possible. Use x1 for Industry, x2 for Public, x3 for Quality, and x4 for Finished. (Round your numerical values to two decimal places.)

ŷ =

Solutions

Expert Solution

(A)

We use the 40 observations listed in the question (Delay, Industry, Public, Quality, Finished).

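In Excel, run the regression tool: Data > Data Analysis > Regression, with Delay as the Y range and the four independent variables as the X range.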

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.618518551
R Square             0.382565198
Adjusted R Square    0.312001221
Standard Error       10.92351796
Observations         40

ANOVA
              df    SS             MS             F              Significance F
Regression    4     2587.661436    646.915359     5.421536774    0.001665508
Residual      35    4176.313564    119.3232447
Total         39    6763.975

                Coefficients    Standard Error    t Stat          P-value        Lower 95%       Upper 95%       Lower 95.0%     Upper 95.0%
Intercept       80.42857175     5.91586135        13.59541189     1.57309E-15    68.41873472     92.43840878     68.41873472     92.43840878
X Variable 1    11.94418923     3.797800763       3.145027865     0.003379559    4.234243788     19.65413466     4.234243788     19.65413466
X Variable 2    -4.816257126    4.229181312       -1.138815475    0.262515036    -13.40195164    3.769437385     -13.40195164    3.769437385
X Variable 3    -2.623635035    1.183593557       -2.216668907    0.033238643    -5.026457698    -0.220812372    -5.026457698    -0.220812372
X Variable 4    -4.072510795    1.851430781       -2.199655982    0.034527054    -7.831115103    -0.313906486    -7.831115103    -0.313906486

Rounding the coefficients to two decimal places, the estimated regression equation is

ŷ = 80.43 + 11.94 x1 - 4.82 x2 - 2.62 x3 - 4.07 x4
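
As a cross-check on the spreadsheet output, the same fit can be sketched in Python. This is a minimal sketch assuming NumPy is available; the variable names (`RAW`, `data`, `coef`, `r2`) are illustrative and not part of the original solution, and ordinary least squares via `np.linalg.lstsq` stands in for Excel's regression tool.

```python
import numpy as np

# Columns: Delay, Industry, Public, Quality, Finished.
# The 40 rows from the question, five rows per text line.
RAW = """
62 0 0 3 1   45 0 1 3 3   54 0 0 2 2   71 0 1 1 2   91 0 0 1 1
62 0 0 4 4   61 0 0 3 2   69 0 1 5 2   80 0 0 1 1   52 0 0 5 3
47 0 0 3 2   65 0 1 2 3   60 0 0 1 3   81 1 0 1 2   73 1 0 2 2
89 1 0 2 1   71 1 0 5 4   76 1 0 2 2   68 1 0 1 2   68 1 0 5 2
86 1 0 2 2   76 1 1 3 1   67 1 0 2 3   57 1 0 4 2   55 1 1 3 2
54 1 0 5 2   69 1 0 3 3   82 1 0 5 1   94 1 0 1 1   74 1 1 5 2
75 1 1 4 3   69 1 0 2 2   71 1 0 4 4   79 1 0 5 2   80 1 0 1 4
91 1 0 4 1   92 1 0 1 4   46 1 1 4 3   72 1 0 5 2   85 1 0 5 1
"""
data = np.array(RAW.split(), dtype=float).reshape(-1, 5)

y = data[:, 0]                                       # Delay
X = np.column_stack([np.ones(len(y)), data[:, 1:]])  # intercept, x1..x4

# Ordinary least squares: minimise ||y - X b||^2.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

# Coefficient of determination from the residuals.
resid = y - X @ coef
r2 = 1 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

print("b0, b1, b2, b3, b4 =", np.round(coef, 2))
print("R Square =", round(r2, 4))
```

The printed coefficients can be compared with the Coefficients column of the summary output, and the printed R Square with the Regression Statistics block.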

R² = 0.3826 (the R Square entry; 0.6185 is Multiple R, its square root), so the four independent variables (x1 for Industry, x2 for Public, x3 for Quality, and x4 for Finished) together explain about 38.26% of the variability in Delay.
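
The regression statistics can be recomputed from the ANOVA sums of squares, which makes the difference between Multiple R and R Square explicit. A small sketch using only the SS values and degrees of freedom reported above (the variable names are illustrative):

```python
# Values copied from the ANOVA table in the summary output.
ssr = 2587.661436   # regression sum of squares
sse = 4176.313564   # residual sum of squares
n, p = 40, 4        # observations, independent variables

sst = ssr + sse                                  # total sum of squares
r2 = ssr / sst                                   # R Square
multiple_r = r2 ** 0.5                           # Multiple R
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)    # Adjusted R Square
f_stat = (ssr / p) / (sse / (n - p - 1))         # F = MSR / MSE

print(round(r2, 4), round(multiple_r, 4), round(adj_r2, 4), round(f_stat, 4))
```

The share of variability explained is R Square (SSR/SST), not Multiple R.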

(C)

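For the scatter diagram, plot Delay (vertical axis) against Finished (horizontal axis) using the following (Finished, Delay) pairs: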

1 62
3 45
2 54
2 71
1 91
4 62
2 61
2 69
1 80
3 52
2 47
3 65
3 60
2 81
2 73
1 89
4 71
2 76
2 68
2 68
2 86
1 76
3 67
2 57
2 55
2 54
3 69
1 82
1 94
2 74
3 75
2 69
4 71
2 79
4 80
1 91
4 92
3 46
2 72
1 85

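Fitting a linear trendline to the scatter diagram gives ŷ = 80.226 - 4.3824 x4.

This means that each one-unit increase in Finished is associated with a decrease of about 4.38 units in predicted Delay.

R² = 0.0993, so Finished alone explains only about 9.93% of the variability in Delay. The scatter diagram suggests why: mean Delay falls from 83.3 at Finished = 1 to 67.4 at Finished = 2 and 59.9 at Finished = 3, then rises to 75.2 at Finished = 4. That pattern is curvilinear rather than linear, so the best-subsets search requested in (c) should include a squared term such as x4² among the candidate variables; the linear trendline above is only a starting point and not the final best-subsets equation.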
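
One way to carry out the best-subsets step is sketched below. It is an illustration under stated assumptions, not the textbook's procedure: the candidate set is taken to be x1, x2, x3, x4 plus a squared term x4² (an assumption prompted by the curved scatter pattern), subsets are ranked by adjusted R², and the names `cands`, `fit`, `results`, and `best` are hypothetical. The search is exhaustive, so it may select x4² without x4; restrict the subsets if a hierarchical model is wanted.

```python
from itertools import combinations
import numpy as np

# Columns: Delay, Industry, Public, Quality, Finished (40 rows from the question).
RAW = """
62 0 0 3 1   45 0 1 3 3   54 0 0 2 2   71 0 1 1 2   91 0 0 1 1
62 0 0 4 4   61 0 0 3 2   69 0 1 5 2   80 0 0 1 1   52 0 0 5 3
47 0 0 3 2   65 0 1 2 3   60 0 0 1 3   81 1 0 1 2   73 1 0 2 2
89 1 0 2 1   71 1 0 5 4   76 1 0 2 2   68 1 0 1 2   68 1 0 5 2
86 1 0 2 2   76 1 1 3 1   67 1 0 2 3   57 1 0 4 2   55 1 1 3 2
54 1 0 5 2   69 1 0 3 3   82 1 0 5 1   94 1 0 1 1   74 1 1 5 2
75 1 1 4 3   69 1 0 2 2   71 1 0 4 4   79 1 0 5 2   80 1 0 1 4
91 1 0 4 1   92 1 0 1 4   46 1 1 4 3   72 1 0 5 2   85 1 0 5 1
"""
data = np.array(RAW.split(), dtype=float).reshape(-1, 5)
y = data[:, 0]
n = len(y)

# Candidate predictors; "x4^2" is an assumed addition, not in the original answer.
cands = {
    "x1": data[:, 1],
    "x2": data[:, 2],
    "x3": data[:, 3],
    "x4": data[:, 4],
    "x4^2": data[:, 4] ** 2,
}

def fit(names):
    """Least-squares fit of Delay on the named candidates plus an intercept."""
    X = np.column_stack([np.ones(n)] + [cands[name] for name in names])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    r2 = 1 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - len(names) - 1)
    return coef, r2, adj_r2

# Mean Delay at each Finished level, to show the curved pattern numerically.
for level in (1, 2, 3, 4):
    print("Finished =", level, "mean Delay =", round(y[data[:, 4] == level].mean(), 1))

# Exhaustive search over all non-empty subsets of the candidates.
results = []
for k in range(1, len(cands) + 1):
    for names in combinations(cands, k):
        coef, r2, adj_r2 = fit(names)
        results.append((adj_r2, r2, names, coef))

best = max(results, key=lambda r: r[0])
print("best subset by adjusted R^2:", best[2])
print("coefficients (intercept first):", np.round(best[3], 2))
print("R^2 =", round(best[1], 4), "adjusted R^2 =", round(best[0], 4))
```

Calling `fit(("x4",))` reproduces the linear trendline, which gives a quick check that the data were entered correctly before reading off the best subset.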
