A statistical program is recommended.
A sample containing years to maturity and yield (%) for 40 corporate bonds is contained in the data file named CorporateBonds.
Company Ticker | Years | Yield |
GE | 1 | 0.767 |
MS | 1 | 1.816 |
WFC | 1.25 | 0.797 |
TOTAL | 1.75 | 1.378 |
TOTAL | 3.25 | 1.748 |
GS | 3.75 | 3.558 |
MS | 4 | 4.413 |
JPM | 4.25 | 2.31 |
C | 4.75 | 3.332 |
RABOBK | 4.75 | 2.805 |
TOTAL | 5 | 2.069 |
MS | 5 | 4.739 |
AXP | 5 | 2.181 |
MTNA | 5 | 4.366 |
BAC | 5 | 3.699 |
VOD | 5 | 1.855 |
SHBASS | 5 | 2.861 |
AIG | 5 | 3.452 |
HCN | 7 | 4.184 |
MS | 9.25 | 5.798 |
GS | 9.25 | 5.365 |
GE | 9.5 | 3.778 |
GS | 9.75 | 5.367 |
C | 9.75 | 4.414 |
BAC | 9.75 | 4.949 |
RABOBK | 9.75 | 4.203 |
WFC | 10 | 3.682 |
TOTAL | 10 | 3.27 |
MTNA | 10 | 6.046 |
LNC | 10 | 4.163 |
FCX | 10 | 4.03 |
NEM | 10 | 3.866 |
PAA | 10.25 | 3.856 |
HSBC | 12 | 4.079 |
GS | 25.5 | 6.913 |
C | 25.75 | 8.204 |
GE | 26 | 5.13 |
GE | 26.75 | 5.138 |
T | 28.5 | 4.93 |
BAC | 29.75 | 5.903 |
(a)
Develop a scatter diagram of the data using x = years to maturity as the independent variable.
[Four candidate scatter diagrams, labeled a through d, appeared here as images.]
Does a simple linear regression model appear to be appropriate?
Given the downward trend of the data on the left side of the plot, a linear regression model would predict lower values for the data on the right side of the plot. So, a curvilinear regression model appears to be more appropriate.
Since the data on the left and right sides of the plot both trend upward at about the same rate, a linear model is appropriate.
Since the data on the left and right sides of the plot both trend downward at about the same rate, a linear model is appropriate.
Given the upward trend of the data on the left side of the plot, a linear regression model would predict higher values for the data on the right side of the plot. So, a curvilinear regression model appears to be more appropriate.
(b)
Develop an estimated regression equation with x = years to maturity and x^2 as the independent variables. (Round your numerical values to two decimal places.)
ŷ =
(c)
As an alternative to fitting a second-order model, fit a model using the natural logarithm of years to maturity as the independent variable; that is, ŷ = b0 + b1 ln(x). (Round your numerical values to two decimal places.)
ŷ =
Does the estimated regression using the natural logarithm of x provide a better fit than the estimated regression developed in part (b)? Explain.
The regression equation developed in part (b) provides a better fit since its R^2 value is higher and it predicts that yield will begin to decrease after a certain point with respect to years to maturity.
The regression equation developed in part (c) provides a better fit because it has fewer influential observations than the equation developed in part (b).
The regression equation developed in part (c) provides a better fit since its R^2 value is higher and it predicts that yield will always increase with respect to years to maturity.
The regression equation developed in part (b) provides a better fit since it uses more independent variables than the equation developed in part (c).
Solution
(a)
Does a simple linear regression model appear to be appropriate?
Given the upward trend of the data on the left side of the plot, a linear regression model would predict higher values for the data on the right side of the plot. So, a curvilinear regression model appears to be more appropriate.
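The reasoning above can be checked numerically without a plot. The following is a minimal sketch, assuming the 40 rows listed in the table match the CorporateBonds file. The cutoff of 12 years and the helper name fit_line are illustrative choices, not part of the original solution. It fits a straight line to the shorter maturities only and extrapolates it to the longer ones.

```python
# Years to maturity and yield (%) for the 40 bonds, copied from the table above.
years = [1, 1, 1.25, 1.75, 3.25, 3.75, 4, 4.25, 4.75, 4.75,
         5, 5, 5, 5, 5, 5, 5, 5, 7, 9.25,
         9.25, 9.5, 9.75, 9.75, 9.75, 9.75, 10, 10, 10, 10,
         10, 10, 10.25, 12, 25.5, 25.75, 26, 26.75, 28.5, 29.75]
yields = [0.767, 1.816, 0.797, 1.378, 1.748, 3.558, 4.413, 2.31, 3.332, 2.805,
          2.069, 4.739, 2.181, 4.366, 3.699, 1.855, 2.861, 3.452, 4.184, 5.798,
          5.365, 3.778, 5.367, 4.414, 4.949, 4.203, 3.682, 3.27, 6.046, 4.163,
          4.03, 3.866, 3.856, 4.079, 6.913, 8.204, 5.13, 5.138, 4.93, 5.903]

# Hypothetical split point: the data have a gap between 12 and 25.5 years.
CUTOFF = 12


def fit_line(xs, ys):
    """Ordinary least squares for y = b0 + b1*x; returns (b0, b1)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1


left = [(x, y) for x, y in zip(years, yields) if x <= CUTOFF]
right = [(x, y) for x, y in zip(years, yields) if x > CUTOFF]

# Line fitted to the left side of the plot only.
b0_left, b1_left = fit_line([x for x, _ in left], [y for _, y in left])

# Compare its extrapolation with what was actually observed on the right side.
right_mean_x = sum(x for x, _ in right) / len(right)
right_mean_y = sum(y for _, y in right) / len(right)
extrapolated = b0_left + b1_left * right_mean_x

print(f"slope on the left side: {b1_left:.3f}")
print(f"extrapolated yield at {right_mean_x:.2f} years: {extrapolated:.2f}")
print(f"observed mean yield on the right side: {right_mean_y:.2f}")
```

If the extrapolated value is well above the observed mean for the long maturities, the yield curve is flattening, which is what points to a curvilinear model.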
(b)
Develop an estimated regression equation with x = years to maturity and x^2 as the independent variables.
We will solve this using Excel. The steps are:
Enter the data into Excel.
Add a column containing Years^2 next to the Years column.
Click on the Data tab.
Click on Data Analysis.
Select Regression.
Set Input Y Range to the range of the dependent variable (Yield).
Set Input X Range to the range of the independent variables (the Years and Years^2 columns).
Check Labels if the selected ranges include the header row.
Click OK.
This is the regression output from Excel:
SUMMARY OUTPUT
Regression Statistics
Multiple R | 0.82 |
R Square | 0.67 |
Adjusted R Square | 0.65 |
Standard Error | 0.96 |
Observations | 40 |
ANOVA
Source | df | SS | MS | F | Significance F |
Regression | 2 | 68.30 | 34.15 | 37.19 | 0.00 |
Residual | 37 | 33.97 | 0.92 |
Total | 39 | 102.28 |
Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
Intercept | 1.02 | 0.44 | 2.34 | 0.03 | 0.13 | 1.90 |
Years | 0.46 | 0.08 | 5.66 | 0.00 | 0.30 | 0.63 |
Years^2 | -0.01 | 0.00 | -3.96 | 0.00 | -0.02 | -0.01 |
Estimated regression equation: ŷ = 1.02 + 0.46 Years - 0.01 Years^2
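For readers without Excel, the second-order fit can be reproduced with a few lines of plain Python. This is a minimal sketch, assuming the 40 rows in the table match the CorporateBonds file. The helpers solve and ols are hypothetical names written for this illustration: they build the normal equations and solve them by Gaussian elimination, which is adequate for a problem this small but is not how Excel computes its output internally.

```python
# Years to maturity and yield (%) for the 40 bonds, copied from the table above.
years = [1, 1, 1.25, 1.75, 3.25, 3.75, 4, 4.25, 4.75, 4.75,
         5, 5, 5, 5, 5, 5, 5, 5, 7, 9.25,
         9.25, 9.5, 9.75, 9.75, 9.75, 9.75, 10, 10, 10, 10,
         10, 10, 10.25, 12, 25.5, 25.75, 26, 26.75, 28.5, 29.75]
yields = [0.767, 1.816, 0.797, 1.378, 1.748, 3.558, 4.413, 2.31, 3.332, 2.805,
          2.069, 4.739, 2.181, 4.366, 3.699, 1.855, 2.861, 3.452, 4.184, 5.798,
          5.365, 3.778, 5.367, 4.414, 4.949, 4.203, 3.682, 3.27, 6.046, 4.163,
          4.03, 3.866, 3.856, 4.079, 6.913, 8.204, 5.13, 5.138, 4.93, 5.903]


def solve(a, b):
    """Solve a small linear system a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(columns, ys):
    """Least squares with an intercept; returns (coefficients, R^2)."""
    rows = [[1.0] + list(values) for values in zip(*columns)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(ys) / len(ys)
    sse = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    sst = sum((y - mean_y) ** 2 for y in ys)
    return beta, 1 - sse / sst


# Independent variables: Years and Years^2.
beta, r2 = ols([years, [x ** 2 for x in years]], yields)
b0, b1, b2 = beta
print(f"y-hat = {b0:.2f} + {b1:.2f}*Years + ({b2:.2f})*Years^2, R^2 = {r2:.4f}")
```

The printed coefficients can be compared with the Intercept, Years, and Years^2 rows of the Excel output.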
(c)
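As an alternative to fitting a second-order model, fit a model using the natural logarithm of years to maturity as the independent variable; that is, ŷ = b0 + b1 ln(x). Repeat the regression steps from part (b), using as the Input X Range a single column containing ln(Years), for example computed with Excel's LN function.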
SUMMARY OUTPUT
Regression Statistics
Multiple R | 0.82 |
R Square | 0.67 |
Adjusted R Square | 0.66 |
Standard Error | 0.94 |
Observations | 40 |
ANOVA
Source | df | SS | MS | F | Significance F |
Regression | 1 | 68.48 | 68.48 | 76.98 | 0.00 |
Residual | 38 | 33.80 | 0.89 |
Total | 39 | 102.28 |
Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
Intercept | 0.83 | 0.38 | 2.18 | 0.04 | 0.06 | 1.60 |
log(Years) | 1.56 | 0.18 | 8.77 | 0.00 | 1.20 | 1.92 |
Estimated regression equation: ŷ = 0.83 + 1.56 ln(Years), where log(Years) in the output denotes the natural logarithm.
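The logarithmic fit is a simple linear regression on ln(Years), so it can be checked with the closed-form slope and intercept formulas. This is a minimal sketch, assuming the 40 rows in the table match the CorporateBonds file. The function name fit_line is illustrative.

```python
import math

# Years to maturity and yield (%) for the 40 bonds, copied from the table above.
years = [1, 1, 1.25, 1.75, 3.25, 3.75, 4, 4.25, 4.75, 4.75,
         5, 5, 5, 5, 5, 5, 5, 5, 7, 9.25,
         9.25, 9.5, 9.75, 9.75, 9.75, 9.75, 10, 10, 10, 10,
         10, 10, 10.25, 12, 25.5, 25.75, 26, 26.75, 28.5, 29.75]
yields = [0.767, 1.816, 0.797, 1.378, 1.748, 3.558, 4.413, 2.31, 3.332, 2.805,
          2.069, 4.739, 2.181, 4.366, 3.699, 1.855, 2.861, 3.452, 4.184, 5.798,
          5.365, 3.778, 5.367, 4.414, 4.949, 4.203, 3.682, 3.27, 6.046, 4.163,
          4.03, 3.866, 3.856, 4.079, 6.913, 8.204, 5.13, 5.138, 4.93, 5.903]


def fit_line(xs, ys):
    """Ordinary least squares for y = b0 + b1*x; returns (b0, b1, R^2)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1, (sxy ** 2) / (sxx * syy)


# math.log is the natural logarithm, matching ln(x) in the problem statement.
ln_years = [math.log(x) for x in years]
b0, b1, r2 = fit_line(ln_years, yields)
print(f"y-hat = {b0:.2f} + {b1:.2f}*ln(Years), R^2 = {r2:.4f}")
```

Using a base-10 logarithm instead would leave R^2 unchanged but scale the slope by ln(10), so the base matters when reporting the equation.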
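Does the estimated regression using the natural logarithm of x provide a better fit than the estimated regression developed in part (b)? Explain.
The regression equation developed in part (c) provides a better fit since its R^2 value is higher and it predicts that yield will always increase with respect to years to maturity. Both R Square values round to 0.67, but from the ANOVA tables R^2 = 68.48/102.28 ≈ 0.6695 for part (c) versus 68.30/102.28 ≈ 0.6678 for part (b). The part (c) model also has the higher adjusted R^2 (0.66 vs. 0.65) and the smaller standard error (0.94 vs. 0.96). Because the Years^2 coefficient in part (b) is negative, that model predicts falling yields beyond roughly 0.46/(2 × 0.01) = 23 years. The output shown contains no influence diagnostics, so it does not support an answer based on influential observations.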
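As a quick check on this comparison, the unrounded R^2 values can be recomputed from the sums of squares in the two ANOVA tables, and the two fitted equations can be evaluated at a few maturities. This sketch uses only the rounded figures reported in the Excel output, so its results are approximate. The function names quadratic and log_model and the list of maturities are illustrative.

```python
import math

# Sums of squares as reported (rounded) in the two ANOVA tables.
SST = 102.28
SSR_QUADRATIC = 68.30   # part (b)
SSR_LOG = 68.48         # part (c)

r2_quadratic = SSR_QUADRATIC / SST
r2_log = SSR_LOG / SST
print(f"R^2 part (b): {r2_quadratic:.4f}")
print(f"R^2 part (c): {r2_log:.4f}")


def quadratic(x):
    """Part (b) equation with the rounded coefficients from the output."""
    return 1.02 + 0.46 * x - 0.01 * x ** 2


def log_model(x):
    """Part (c) equation with the rounded coefficients from the output."""
    return 0.83 + 1.56 * math.log(x)


# Maturity at which the fitted parabola peaks: x = -b1 / (2 * b2).
turning_point = 0.46 / (2 * 0.01)
print(f"quadratic model peaks near {turning_point:.0f} years")

# Beyond the peak the quadratic model declines; the log model keeps rising.
for x in (10, 20, 30, 40):
    print(f"{x:>2} years: quadratic {quadratic(x):.2f}, log {log_model(x):.2f}")
```

The turning point computed from two-decimal coefficients is sensitive to rounding in the Years^2 term, so it should be read as a rough location rather than an exact maturity.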