Question

In: Statistics and Probability

The Carseat is a data set containing sales of child car seats at 400 different stores....

The Carseat is a data set containing sales of child car seats at 400 different stores. Your task is to develop a regression model to predict sales and use this data to estimate your model.

Description of variables:
Sales: Unit sales (in thousands) at each location
CompPrice: Price charged by competitor at each location
Income: Community income level (in thousands of dollars)
Advertising: Local advertising budget for company at each location (in thousands of dollars)
Population: Population size in region (in thousands)
Price: Price company charges for car seats at each site
ShelveLoc: A factor with levels Bad, Good and Medium indicating the quality of the shelving location for the car seats at each site
Age: Average age of the local population
Education: Education level at each location
Urban: A factor with levels No and Yes to indicate whether the store is in an urban or rural location
US: A factor with levels No and Yes to indicate whether the store is in the US or not

a. Which variables are categorical variable? For each categorical variable, define the corresponding dummy variables. [1pt]


b. Develop a correlation matrix between all variables. Copy your correlation matrix here. [1pt]


c. Develop a regression model to predict sales and use all the data to estimate your model. Write the estimated equation here. What factors are significant predictors of sales? [1pt]


d. Use the first 300 rows of the data as training set to estimate your model and use the remaining 100 rows as test set to evaluate the out-of-sample prediction performance of your model. Use rMSE as the performance measure. What is the rMSE for your in-sample and out-of-sample prediction? [1pt]


Solutions

Expert Solution

The detailed Rcode and explanation given below


Related Solutions

The Carseat is a data set containing sales of child car seats at 400 different stores....
The Carseat is a data set containing sales of child car seats at 400 different stores. Your task is to develop a regression model to predict sales and use this data to estimate your model. Description of variables: Sales: Unit sales (in thousands) at each location CompPrice: Price charged by competitor at each location Income: Community income level (in thousands of dollars) Advertising: Local advertising budget for company at each location (in thousands of dollars) Population: Population size in region...
managirial accounting Julia Child Company sells baby car seats to retailers for an average of $90...
managirial accounting Julia Child Company sells baby car seats to retailers for an average of $90 each. The variable cost of each car seat is $51 and monthly fixed selling costs total $8,000. Other monthly fixed costs of the company total $7,500 Required: 7. What is the breakeven point in number of car seats? 8. What is the margin of safety, assuming sales total $38,000? 9. What is the breakeven level in car seats, assuming variable costs increase by 20%?...
You are given the following table of output pertaining to T-Shirt sales data from different stores...
You are given the following table of output pertaining to T-Shirt sales data from different stores in Pennsylvania. The variable shirts is the number of T-Shirts sold per month in 100s, and price is the price in dollars of the T-shirt. Table 1: OLS regression Price and T-Shirts sold Dependent variable: shirts price -0.40∗∗∗ (0.10) Constant 7.00∗∗∗ (0.06) Observations 1120 R2 0.20 Adjusted R2 0.20 Residual Std. Error 7.26 (df = 998) F Statistic 12.15∗∗∗ (df = 1; 998) Note:...
A data set that X that is normally distributed with µ = 400 and a σ...
A data set that X that is normally distributed with µ = 400 and a σ = 20. (Show Work) a. What is P(370 < X < 410) = _________ b. Determine the value of X such that 70% of the values are greater than that value. _____
A data set that X that is normally distributed with µ = 400 and a σ...
A data set that X that is normally distributed with µ = 400 and a σ = 20.       a.  What is P(370 < X < 410) = _________     b. Determine the value of X such that 70% of the values are greater than that value. _____
A data set that X that is normally distributed with µ = 400 and a σ...
A data set that X that is normally distributed with µ = 400 and a σ = 20.     a.  What is P(370 < X < 410) = _________     b. Determine the value of X such that 60% of the values are greater than that value. _____ PLEASE SHOW ALL YOUR WORK
The data set 2017NBADraft.txt is a comma delimited file containing data on the players selected in...
The data set 2017NBADraft.txt is a comma delimited file containing data on the players selected in the 2017 NBA draft. Assume this data set is a representative sample of all players in the NBA. Correctly import the data set into R. c) In R, perform a hypothesis test to determine if the mean weights are different for the first round selections and second round selections. To do this, create a new variable so that the first 30 players all have...
A Sales Manager at WATWIUC Stores collected data on annual sales (in thousand Ghana Cedis) and...
A Sales Manager at WATWIUC Stores collected data on annual sales (in thousand Ghana Cedis) and years of experience of 30 sales persons. He wants to explore the relationship between years of experience and annual sales. a) Which variable is the dependent variable in this study? b) Which variable is the independent variable in this study? c) The coefficient of correlation between the dependent variable and the independent variable was found to be 0.9, meaning that the dependent and the...
use any data and answers those questions There are several stores that or businesses that set...
use any data and answers those questions There are several stores that or businesses that set these types of goals. Would this tell us an overall average? For example lets just say the sales goal is 100,000 a month, this month 98,000 is sold that is clearly less than the goal. However is that what the test tells us? Or does it look at a long term average over several months? Why is it valuable to know if the results...
Below is a data set containing information on applicants applying for automobile insurance with an insurance...
Below is a data set containing information on applicants applying for automobile insurance with an insurance company. Here is a description of each variable: Conduct a test to see if the population average annual Claims Paid is higher for people who are Not Married thant it is for people who are Married. Use alpha = 0.05. State hypotheses. Explain your results. Assume that the population standard deviations are not equal. App. No. Marital Status Education Annual Salary Credit Score Age...
ADVERTISEMENT
ADVERTISEMENT
ADVERTISEMENT