
** Use R for the following analysis.

Use the BoneAcid.xlsx data to investigate what drives the variation in bone acid content among 42 male skeletons from 2 cemeteries. The independent variables are interment time, age at death, burial depth, lime addition, and soil contamination.

Variables/Columns

Burial Site (1 or 2)

Internment Time (Years)

Burial Depth (feet)

LimeAdded (at interment) (1=Yes, 0=No)

Death_Age (Age of Person at the time of death)

Acid Level (g/100g of bone)

Contamination (In soil) (1=Yes, 0=No)

1. Undertake appropriate basic data analytics to motivate the regression model. Use dummy variables for each of Burial Site, LimeAdded, and Contamination (create the dummy variables if required).

2. Do you suspect any multicollinearity problem could affect the regression coefficients?

3. Run a regression model of the Acid Level on all independent variables provided and interpret all regression coefficients.

4. Briefly describe what you need to do before conducting any hypothesis tests when you find evidence of heteroscedasticity in an OLS regression model. Then test for heteroscedasticity in the model from part 3.

5. Test the hypothesis that

i. Beta_InternmentTime < -0.00675

ii. Jointly, Beta_BurialSite = Beta_BurialDepth = Beta_LimeAdded = 0

6. What is the best model specification for explaining acid content in bones?
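
The model in part 3 and the hypotheses in part 5 can be written out explicitly. This is one reading of the question, not the only one: it assumes Burial Site is recoded as a dummy Site2 (1 = site 2, a name introduced here), that the claim in 5(i) goes in the alternative hypothesis, and that the F statistic shown is the classical one, which is valid only under homoscedasticity. SSR_r and SSR_u denote the residual sums of squares of the restricted and unrestricted models, with n = 42 observations and 7 estimated coefficients.

```latex
\begin{aligned}
\text{Acid}_i &= \beta_0 + \beta_1\,\text{Site2}_i + \beta_2\,\text{InternmentTime}_i
  + \beta_3\,\text{BurialDepth}_i + \beta_4\,\text{LimeAdded}_i \\
  &\quad + \beta_5\,\text{DeathAge}_i + \beta_6\,\text{Contamination}_i + \varepsilon_i \\[4pt]
\text{5(i):}\quad & H_0:\ \beta_2 \ge -0.00675 \quad\text{vs.}\quad H_1:\ \beta_2 < -0.00675,
  \qquad t = \frac{\hat\beta_2 - (-0.00675)}{\operatorname{se}(\hat\beta_2)} \sim t_{35} \\[4pt]
\text{5(ii):}\quad & H_0:\ \beta_1 = \beta_3 = \beta_4 = 0 \quad\text{vs.}\quad H_1:\ \text{at least one is nonzero},
  \qquad F = \frac{(SSR_r - SSR_u)/3}{SSR_u/(42 - 7)} \sim F_{3,\,35}
\end{aligned}
```

For 5(i), H_0 is rejected for sufficiently negative values of t (a lower-tail test).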

Burial Site	InternmentTime	Burial Depth	LimeAdded	Death_Age	Contamination	Acid Level
1	88.5	7	1	34	1	3.88
1	88.5	7	1	38	1	4
1	85.2	7	1	27	1	3.69
1	71.8	7.6	1	26	0	3.88
1	70.6	7.5	1	42	0	3.53
1	68	7	1	28	0	3.93
1	71.6	8	1	35	0	3.88
1	70.2	6	1	44	0	3.64
1	55.5	6	0	29	0	3.97
1	36.5	6.5	0	29	0	3.85
1	36.3	6.5	0	48	0	3.96
1	46.5	6.5	0	35	0	3.69
1	35.9	6.5	0	40	0	3.76
1	45.5	6.5	0	34	0	3.75
1	43	6.5	0	38	0	3.75
1	44.9	6.5	0	27	0	3.92
1	59.5	8	0	26	0	3.76
1	58.3	8	0	23	0	3.93
1	56.5	8	0	35	0	3.7
1	56.3	8	0	23	0	3.82
1	43	6.5	0	40	0	3.78
1	42.5	9	0	31	0	4
1	29	7.5	0	31	0	3.92
1	35.3	8.5	0	39	0	3.79
2	93.6	4	1	39	0	3.49
2	90	4	1	43	0	3.57
2	88	5.5	1	26	0	3.43
2	84.4	5	1	47	0	3.55
2	84	4.75	1	39	0	3.5
2	79.7	4.75	1	27	0	3.27
2	67.4	4.5	1	39	0	3.66
2	64.7	5	1	27	0	3.9
2	64.7	5.5	1	35	1	3.91
2	38.3	7	0	21	0	3.73
2	59.6	9.25	0	46	0	3.72
2	32	9	0	24	0	3.85
2	32.2	9	0	27	0	3.85
2	26.5	7	0	34	0	4.06
2	34.7	8.5	0	30	0	4.04
2	27.6	6	0	22	0	4
2	35.7	9	0	19	0	3.93
2	49.6	9	0	50	0	3.85
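
Parts 1 to 3 could be approached along the following lines. This is a minimal sketch, not a definitive solution: the data are transcribed from the table above rather than read from BoneAcid.xlsx, the short column names (site, time, depth, lime, age, contam, acid) and the dummy site2 are names chosen here, and the variance inflation factors are computed by hand so that only base R is needed. One thing visible directly in the table is that every grave with lime added has an InternmentTime of at least 64.7 years, while every grave without lime has at most 59.6 years, so LimeAdded and InternmentTime are strongly related and multicollinearity between them is worth checking.

```r
# Data transcribed from the table above. With the spreadsheet at hand you could
# instead use readxl::read_excel("BoneAcid.xlsx") (assumes readxl is installed).
bone <- data.frame(
  site = rep(c(1, 2), times = c(24, 18)),
  time = c(88.5, 88.5, 85.2, 71.8, 70.6, 68, 71.6, 70.2, 55.5, 36.5, 36.3, 46.5,
           35.9, 45.5, 43, 44.9, 59.5, 58.3, 56.5, 56.3, 43, 42.5, 29, 35.3,
           93.6, 90, 88, 84.4, 84, 79.7, 67.4, 64.7, 64.7, 38.3, 59.6, 32,
           32.2, 26.5, 34.7, 27.6, 35.7, 49.6),
  depth = c(7, 7, 7, 7.6, 7.5, 7, 8, 6, 6, 6.5, 6.5, 6.5, 6.5, 6.5, 6.5, 6.5,
            8, 8, 8, 8, 6.5, 9, 7.5, 8.5, 4, 4, 5.5, 5, 4.75, 4.75, 4.5, 5,
            5.5, 7, 9.25, 9, 9, 7, 8.5, 6, 9, 9),
  lime = rep(c(1, 0, 1, 0), times = c(8, 16, 9, 9)),
  age = c(34, 38, 27, 26, 42, 28, 35, 44, 29, 29, 48, 35, 40, 34, 38, 27,
          26, 23, 35, 23, 40, 31, 31, 39, 39, 43, 26, 47, 39, 27, 39, 27,
          35, 21, 46, 24, 27, 34, 30, 22, 19, 50),
  contam = as.integer(seq_len(42) %in% c(1, 2, 3, 33)),
  acid = c(3.88, 4, 3.69, 3.88, 3.53, 3.93, 3.88, 3.64, 3.97, 3.85, 3.96, 3.69,
           3.76, 3.75, 3.75, 3.92, 3.76, 3.93, 3.7, 3.82, 3.78, 4, 3.92, 3.79,
           3.49, 3.57, 3.43, 3.55, 3.5, 3.27, 3.66, 3.9, 3.91, 3.73, 3.72, 3.85,
           3.85, 4.06, 4.04, 4, 3.93, 3.85)
)

# Burial Site is coded 1/2; recode it as a 0/1 dummy (1 = site 2).
# LimeAdded and Contamination are already 0/1 dummies.
bone$site2 <- as.integer(bone$site == 2)

# Part 1: basic descriptives and group means of the response.
print(summary(bone))
print(aggregate(acid ~ site2 + lime, data = bone, FUN = mean))

# Part 2: pairwise correlations and hand-computed variance inflation factors.
# VIF_j = 1 / (1 - R^2_j), where R^2_j comes from regressing predictor j
# on all the other predictors.
predictors <- c("site2", "time", "depth", "lime", "age", "contam")
print(round(cor(bone[, predictors]), 2))
vif <- sapply(predictors, function(v) {
  aux <- lm(reformulate(setdiff(predictors, v), response = v), data = bone)
  1 / (1 - summary(aux)$r.squared)
})
print(round(vif, 2))

# Part 3: OLS of acid level on all predictors.
# Each slope is the expected change in acid (g/100g of bone) for a one-unit
# increase in that variable, holding the others fixed; each dummy coefficient
# is the expected difference from its reference group (site 1, no lime,
# no contamination).
fit <- lm(acid ~ site2 + time + depth + lime + age + contam, data = bone)
print(summary(fit))
```

A common rule of thumb treats a VIF above roughly 5 to 10 as a sign that the corresponding coefficient is poorly determined; whether that applies here should be read off the printed values rather than assumed.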

Expert Solution

orchestra answered 3 years ago
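
For part 4, the standard point is that heteroscedasticity leaves the OLS coefficients unbiased but makes the usual standard errors, t tests and F tests unreliable, so the standard errors should be replaced by heteroscedasticity-robust ones (or the model re-estimated, for example by weighted least squares) before any testing. One way to carry out parts 4 to 6 is sketched below. It is an illustration under stated assumptions rather than a definitive solution: the data are again transcribed from the table, the column names and the site2 dummy are names chosen here, the Breusch-Pagan test is the studentized (Koenker) version computed by hand, the robust covariance is the HC1 variant, and backward selection by AIC is only one of several reasonable ways to compare specifications.

```r
# Data transcribed from the question's table (column names are chosen here).
bone <- data.frame(
  site2 = rep(c(0, 1), times = c(24, 18)),  # 1 = burial site 2
  time = c(88.5, 88.5, 85.2, 71.8, 70.6, 68, 71.6, 70.2, 55.5, 36.5, 36.3, 46.5,
           35.9, 45.5, 43, 44.9, 59.5, 58.3, 56.5, 56.3, 43, 42.5, 29, 35.3,
           93.6, 90, 88, 84.4, 84, 79.7, 67.4, 64.7, 64.7, 38.3, 59.6, 32,
           32.2, 26.5, 34.7, 27.6, 35.7, 49.6),
  depth = c(7, 7, 7, 7.6, 7.5, 7, 8, 6, 6, 6.5, 6.5, 6.5, 6.5, 6.5, 6.5, 6.5,
            8, 8, 8, 8, 6.5, 9, 7.5, 8.5, 4, 4, 5.5, 5, 4.75, 4.75, 4.5, 5,
            5.5, 7, 9.25, 9, 9, 7, 8.5, 6, 9, 9),
  lime = rep(c(1, 0, 1, 0), times = c(8, 16, 9, 9)),
  age = c(34, 38, 27, 26, 42, 28, 35, 44, 29, 29, 48, 35, 40, 34, 38, 27,
          26, 23, 35, 23, 40, 31, 31, 39, 39, 43, 26, 47, 39, 27, 39, 27,
          35, 21, 46, 24, 27, 34, 30, 22, 19, 50),
  contam = as.integer(seq_len(42) %in% c(1, 2, 3, 33)),
  acid = c(3.88, 4, 3.69, 3.88, 3.53, 3.93, 3.88, 3.64, 3.97, 3.85, 3.96, 3.69,
           3.76, 3.75, 3.75, 3.92, 3.76, 3.93, 3.7, 3.82, 3.78, 4, 3.92, 3.79,
           3.49, 3.57, 3.43, 3.55, 3.5, 3.27, 3.66, 3.9, 3.91, 3.73, 3.72, 3.85,
           3.85, 4.06, 4.04, 4, 3.93, 3.85)
)
fit <- lm(acid ~ site2 + time + depth + lime + age + contam, data = bone)
n <- nobs(fit)
k <- length(coef(fit))

# Part 4: studentized Breusch-Pagan test. Regress the squared residuals on the
# regressors; under H0 (homoscedasticity) n * R^2 is approximately chi^2(k - 1).
aux <- lm(resid(fit)^2 ~ site2 + time + depth + lime + age + contam, data = bone)
bp_stat <- n * summary(aux)$r.squared
bp_p <- pchisq(bp_stat, df = k - 1, lower.tail = FALSE)

# HC1 robust covariance: (X'X)^-1 X' diag(e^2) X (X'X)^-1 * n / (n - k).
X <- model.matrix(fit)
e <- resid(fit)
bread <- solve(crossprod(X))
meat <- crossprod(X * e)  # multiplies row i of X by e[i], then forms X'diag(e^2)X
V_hc1 <- bread %*% meat %*% bread * n / (n - k)
se_hc1 <- sqrt(diag(V_hc1))

# Part 5(i): H0: beta_time >= -0.00675 vs H1: beta_time < -0.00675 (lower tail).
t_time <- unname((coef(fit)["time"] - (-0.00675)) / se_hc1["time"])
p_time <- pt(t_time, df = n - k)

# Part 5(ii): joint H0: beta_site2 = beta_depth = beta_lime = 0.
# Robust Wald statistic divided by the number of restrictions, compared to F.
idx <- c("site2", "depth", "lime")
b <- coef(fit)[idx]
F_robust <- drop(t(b) %*% solve(V_hc1[idx, idx]) %*% b) / length(idx)
p_joint <- pf(F_robust, length(idx), n - k, lower.tail = FALSE)
# Classical F test for comparison (valid only under homoscedasticity).
restricted <- lm(acid ~ time + age + contam, data = bone)
print(anova(restricted, fit))

# Part 6: backward selection by AIC as one way to compare specifications.
best <- step(fit, direction = "backward", trace = 0)

print(c(bp_stat = bp_stat, bp_p = bp_p, t_time = t_time, p_time = p_time,
        F_robust = F_robust, p_joint = p_joint))
print(formula(best))
```

If the Breusch-Pagan p-value is small, the robust results should be preferred over the classical ones; if it is not, the two sets of tests should lead to similar conclusions. In practice the same quantities are available from lmtest::bptest and sandwich::vcovHC, assuming those packages are installed.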
