Question

4) We're going to test the same hypothesis four ways. Assume the people in the dataset in armspanSpring2020.csv are a random sample of all adults. For each test, report the test statistic and the p-value. With a 5% significance level, give the conclusion of each test.
a) Test the hypothesis that the mean difference between armspan and height is not equal to 0, using the data in armspanSpring2020.csv. Do this by creating a new variable named diff = (armspan - height). Perform a one-sample t-test.
b) Test the same hypothesis, but use a two-sample t-test with paired=TRUE.
c) Test the same hypothesis, but use a two-sample t-test with paired=FALSE and var.equal=FALSE.
d) Test the same hypothesis, but use a two-sample t-test with paired=FALSE and var.equal=TRUE.
e) Which test(s) do you think are valid for this situation and why?
hint: We almost never use the var.equal=TRUE test. Why? Because it is only valid if the population standard deviations of both populations are equal. You might be in a situation where you know this to be true. If so, fine, use it. But usually we don't, in which case the var.equal=FALSE test (a) will provide more accurate p-values if the standard deviations are not equal and (b) will provide pretty accurate p-values if they are. So you can't lose, really, with the var.equal=FALSE test, but you can lose with var.equal=TRUE.

g) Data cleaning. Identify by row number which observations seem in need of cleaning and why you think so. Provide a table. (Hint: consider the "which()" and "identify()" functions.) Provide a graph to justify your identifications.

height	armspan	is.female
67	NA	1
70	40	0
64	67	1
71	70	0
72	49	0
62	61	1
72	74	0
71	68	0
63	60	1
69	69	0
67	68	1
63	63	1
60	60	1
66	66	0
61	61	1
69	68	0
65	65	1
72	72	0
70	70	0
73	77	0
65	61	1
68	72	1
62	55	NA
71	74	0
72	70	0
66	22	1
65	67	1
64	62	0
65	62	1
73	69	0
67	77	0
60	62	1
70	59	0
68	66	1
65	65	1
72	69	0
62	52	1
69	66	0
68	67	0
65	66	1
65	64	0
66	65	1
62	52	1
64	62	1
66	65	1
69	69	0
64	65	1
70	74	0
65	69	0
70	80	0
63	NA	1
67	70	1
64	64	1
64	62	1
6	5.7	0
67	67	1
72	71	0
73	75	0
68	68	0
67	63	1
66	67	1
67	36	0
68	72	0
73	70	0
70	70	0
70	72	0
60	58	0
70	68	0
62	63	0
68	68	1
67	67	NA
68	71	0
65	48	1
70	76	0
69	70	0
69	66	0
58	55	NA
64	64	0

Please help with the R code. It is my first time using RStudio and I'm having a hard time. Thanks!

Expert Solution

We are given a data set of heights and armspans.

The sample size is greater than 30,

so the t distribution is well approximated by the normal (z) distribution.

a)

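Let us define

diff = armspan - height

We perform a one-sample t-test on diff.

Therefore, we test

$H_0: \mu_d = 0$

against

$H_1: \mu_d \neq 0$

The test statistic is given by

$t = \dfrac{\bar{d}}{\sqrt{s_d^2 / n}}$

where $\bar{d}$ is the mean of diff,

$s_d^2$ is the variance of diff,

and $n$ is the size of diff.

From this t, the p-value is calculated from a table or software, using $n - 1$ degrees of freedom.

Reject $H_0$ if the p-value < 0.05.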
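As a minimal sketch of part a) in R: the data frame below is only an excerpt (the first ten rows of the data posted in the question), and the name `arms` is an illustrative choice. For the real analysis, the commented `read.csv()` line assumes the file has the column names shown in the question.

```r
# Excerpt: the first ten rows of the posted data (armspan is missing in row 1).
# For the full analysis, replace this with (assumed column names as posted):
#   arms <- read.csv("armspanSpring2020.csv")
arms <- data.frame(
  height  = c(67, 70, 64, 71, 72, 62, 72, 71, 63, 69),
  armspan = c(NA, 40, 67, 70, 49, 61, 74, 68, 60, 69)
)

# New variable, as the question asks
arms$diff <- arms$armspan - arms$height

# By hand: t = mean(d) / sqrt(var(d) / n), using only non-missing differences
d <- arms$diff[!is.na(arms$diff)]
n <- length(d)
t_manual <- mean(d) / sqrt(var(d) / n)
p_manual <- 2 * pt(-abs(t_manual), df = n - 1)  # two-sided p-value

# Same test with the built-in function (missing values are dropped)
res_a <- t.test(arms$diff, mu = 0)
res_a$statistic  # test statistic
res_a$p.value    # p-value
```

The hand calculation and `t.test()` should agree; reject the null hypothesis at the 5% level if `res_a$p.value < 0.05`.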

b)

In the paired t-test, the procedure is the same as in part a), because the test is carried out on the within-person differences.
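A short sketch of part b), again on a ten-row excerpt of the posted data (the name `arms` is illustrative). It also checks that the paired test reproduces the one-sample test on the differences.

```r
# Excerpt: the first ten rows of the posted data; use read.csv() for the full file
arms <- data.frame(
  height  = c(67, 70, 64, 71, 72, 62, 72, 71, 63, 69),
  armspan = c(NA, 40, 67, 70, 49, 61, 74, 68, 60, 69)
)

# Paired two-sample t-test; pairs with a missing value are dropped
res_b <- t.test(arms$armspan, arms$height, paired = TRUE)

# One-sample t-test on the differences, for comparison
res_a <- t.test(arms$armspan - arms$height, mu = 0)

res_b$statistic  # same value as res_a$statistic
res_b$p.value    # same value as res_a$p.value
```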

c)

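In the two-sample t-test, the procedure is as follows.

Let $\bar{x}_1$ be the mean of heights,

$\bar{x}_2$ be the mean of armspans,

$s_1^2$ be the variance of heights,

$s_2^2$ be the variance of armspans.

Here, $n_1 = n_2 = n$, i.e., the sample size is the same for both samples (apart from the two missing armspan values).

To test $H_0: \mu_1 = \mu_2$ against $H_1: \mu_1 \neq \mu_2$,

the test statistic is given by

$t = \dfrac{\bar{x}_2 - \bar{x}_1}{\sqrt{s_1^2 / n_1 + s_2^2 / n_2}}$

The p-value is determined by the same procedure as above, with the degrees of freedom given by the Welch approximation.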
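A sketch of part c) on the same ten-row excerpt of the posted data (the name `arms` is illustrative). The hand calculation uses separate sample sizes because one armspan value is missing in the excerpt.

```r
# Excerpt: the first ten rows of the posted data; use read.csv() for the full file
arms <- data.frame(
  height  = c(67, 70, 64, 71, 72, 62, 72, 71, 63, 69),
  armspan = c(NA, 40, 67, 70, 49, 61, 74, 68, 60, 69)
)

# Welch two-sample t-test: unpaired, variances not assumed equal
res_c <- t.test(arms$armspan, arms$height, paired = FALSE, var.equal = FALSE)

# By hand: difference of means over the unpooled standard error
x <- arms$armspan[!is.na(arms$armspan)]
y <- arms$height[!is.na(arms$height)]
t_welch <- (mean(x) - mean(y)) / sqrt(var(x) / length(x) + var(y) / length(y))

res_c$statistic  # test statistic
res_c$parameter  # Welch degrees of freedom
res_c$p.value    # p-value
```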

d)

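When the variances are the same, the test statistic becomes

$t = \dfrac{\bar{x}_2 - \bar{x}_1}{s_p \sqrt{1/n_1 + 1/n_2}}$

where $s_p^2 = \dfrac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}$ is the pooled variance, with $\bar{x}_1$, $s_1^2$, $n_1$ for heights and $\bar{x}_2$, $s_2^2$, $n_2$ for armspans.

The rest of the procedure is the same as above, with $n_1 + n_2 - 2$ degrees of freedom.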
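A sketch of part d) on the same ten-row excerpt of the posted data (the name `arms` is illustrative), with the pooled-variance statistic also computed by hand.

```r
# Excerpt: the first ten rows of the posted data; use read.csv() for the full file
arms <- data.frame(
  height  = c(67, 70, 64, 71, 72, 62, 72, 71, 63, 69),
  armspan = c(NA, 40, 67, 70, 49, 61, 74, 68, 60, 69)
)

# Pooled two-sample t-test: unpaired, variances assumed equal
res_d <- t.test(arms$armspan, arms$height, paired = FALSE, var.equal = TRUE)

# By hand: pooled variance, then the test statistic
x <- arms$armspan[!is.na(arms$armspan)]
y <- arms$height[!is.na(arms$height)]
nx <- length(x)
ny <- length(y)
sp2 <- ((nx - 1) * var(x) + (ny - 1) * var(y)) / (nx + ny - 2)
t_pooled <- (mean(x) - mean(y)) / sqrt(sp2 * (1 / nx + 1 / ny))

res_d$statistic  # test statistic
res_d$parameter  # degrees of freedom: nx + ny - 2
res_d$p.value    # p-value
```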

e)

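In my view, if the height and the armspan in each row belong to the same person, then the paired t-test (parts a and b, which are equivalent) is the best method to test the hypothesis.

This is because, for a given person, height and armspan are related to each other; that is, the correlation between height and armspan is strongly positive, so the two columns are not independent samples as the unpaired tests in parts c) and d) assume.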

Note:

In R, you can compute the test statistic by hand using arithmetic operations.

If you want to compute it directly, use the t.test() function, which is part of base R (the stats package), so no add-on package is required. Packages such as ggplot2 are only needed if you want them for plotting.
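For part g) of the question, one possible approach with `which()` is sketched below on a ten-row excerpt of the posted data, so the row numbers refer to the excerpt. The name `arms` and the cutoff of 15 inches are illustrative assumptions, not values given in the assignment.

```r
# Excerpt: the first ten rows of the posted data; use read.csv() for the full file
arms <- data.frame(
  height  = c(67, 70, 64, 71, 72, 62, 72, 71, 63, 69),
  armspan = c(NA, 40, 67, 70, 49, 61, 74, 68, 60, 69)
)
arms$diff <- arms$armspan - arms$height

# Hypothetical rule: flag rows with a missing armspan or a very large gap
cutoff <- 15
flagged <- which(is.na(arms$armspan) | abs(arms$diff) > cutoff)

# Table of the flagged observations, by row number
arms[flagged, ]

# Graph: armspan against height, with the line armspan = height for reference
plot(arms$height, arms$armspan, xlab = "height", ylab = "armspan")
abline(0, 1, lty = 2)
points(arms$height[flagged], arms$armspan[flagged], col = "red", pch = 19)
```

On the full data set, the same pattern can be used with additional conditions, for example on implausibly small heights; `identify()` offers an interactive alternative for picking points off the plot.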

orchestra answered 1 year ago
