TWO MEANS – INDEPENDENT SAMPLES
Choose a variable from the advising.sav data set to compare group means. While the choice of which variable to test is up to you, you must remember that it must be a metric variable. The grouping variable, which is used to define the two groups to be compared, must be categorical. You can look in the “Measure” column of the “Variable View” in the data file for help in determining which is which. The managerial question is whether or not there is a significant difference between the groups for the metric variable you have chosen.
Once you have the results, report your findings using the five-step hypothesis testing procedure outlined in class (see below). For Step 4, simply copy and paste the SPSS output into the report: click the desired portion of the output to highlight it, then right-click the highlighted portion and copy it to your flash drive. (You may want to paste the results into a Word document immediately, since you will not be able to open SPSS output on a personal laptop that does not have SPSS installed.) Then state the answer to the managerial question that was initially posed. For example, is there a significant difference between the two groups defined by the grouping variable (which you must identify in your report) for the metric variable tested? Also, interpret the confidence interval provided for the test. Does it indicate a significant difference or not?
PAIRED SAMPLE T-TEST
Choose a pair of metric variables and run a paired sample t-test on the pair. Again, these must be metric variables. The managerial question will be “Is there a significant difference between the two variables?” for the pair. Report your findings using the same procedure described above, including an interpretation of the confidence interval.
REPORT (SAMPLE)
Your report will consist of two hypothesis tests (one for the independent sample test and one for the paired sample test). It will look something like this (for the independent sample test):
1: H0: μ1= μ2
Ha: μ1 ≠ μ2
2: Two group independent sample t-test (note that SPSS does everything as a t-test regardless of sample size).
3: α=.05 → tcrit = ±whatever the appropriate value is
4:
Group Statistics
| | status | N | Mean | Std. Deviation | Std. Error Mean |
|---|---|---|---|---|---|
| dotest | 0 | 185 | 1494.071 | 2249.4948 | 165.3861 |
| | 1 | 50 | 803.280 | 1080.0304 | 152.7394 |

Independent Samples Test
(F and the first Sig. column belong to Levene's Test for Equality of Variances; the remaining columns belong to the t-test for Equality of Means. Lower and Upper are the bounds of the 95% Confidence Interval of the Difference.)
| dotest | F | Sig. | t | df | Sig. (2-tailed) | Mean Difference | Std. Error Difference | Lower | Upper |
|---|---|---|---|---|---|---|---|---|---|
| Equal variances assumed | 13.465 | .000 | 2.104 | 233 | .036 | 690.7914 | 328.2585 | 44.0572 | 1337.5255 |
| Equal variances not assumed | | | 3.068 | 169.287 | .003 | 690.7914 | 225.1264 | 246.3747 | 1135.2080 |
5: Make a decision regarding the null hypothesis and interpret the confidence interval.
6: Answer the managerial question.
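The t statistics, standard errors, and degrees of freedom in the Step 4 sample output can be cross-checked from the Group Statistics table alone. The following is a minimal sketch in Python, not part of the SPSS workflow; the helper name `two_sample_t` is hypothetical, and it applies the standard pooled and Welch formulas to the summary statistics.

```python
import math

def two_sample_t(n1, mean1, sd1, n2, mean2, sd2):
    """Pooled and Welch t results from summary statistics (illustrative helper)."""
    diff = mean1 - mean2
    var1, var2 = sd1 ** 2, sd2 ** 2

    # Equal variances assumed: pool the two sample variances.
    df_pooled = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df_pooled
    se_pooled = math.sqrt(pooled_var * (1 / n1 + 1 / n2))

    # Equal variances not assumed: Welch standard error and the
    # Welch-Satterthwaite approximation for the degrees of freedom.
    a, b = var1 / n1, var2 / n2
    se_welch = math.sqrt(a + b)
    df_welch = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

    return {
        "pooled": {"t": diff / se_pooled, "df": df_pooled, "se": se_pooled},
        "welch": {"t": diff / se_welch, "df": df_welch, "se": se_welch},
    }

# Values copied from the Group Statistics table in the sample report.
sample = two_sample_t(185, 1494.071, 2249.4948, 50, 803.280, 1080.0304)
for name, row in sample.items():
    print(name, {key: round(value, 3) for key, value in row.items()})
```

The printed values should agree with the t, df, and Std. Error Difference columns of the two rows in the Independent Samples Test table, up to rounding.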
TWO RESULTS AFTER RUNNING
INDEPENDENT
Group Statistics
| | Gender | N | Mean | Std. Deviation | Std. Error Mean |
|---|---|---|---|---|---|
| OverallSatisfaction | Female | 131 | 4.97 | 1.771 | .155 |
| | Male | 145 | 4.99 | 1.488 | .124 |

Independent Samples Test
(F and the first Sig. column belong to Levene's Test for Equality of Variances; the remaining columns belong to the t-test for Equality of Means. Lower and Upper are the bounds of the 95% Confidence Interval of the Difference.)
| OverallSatisfaction | F | Sig. | t | df | Sig. (2-tailed) | Mean Difference | Std. Error Difference | Lower | Upper |
|---|---|---|---|---|---|---|---|---|---|
| Equal variances assumed | 5.905 | .016 | -.120 | 274 | .904 | -.024 | .196 | -.410 | .363 |
| Equal variances not assumed | | | -.119 | 255.054 | .905 | -.024 | .198 | -.414 | .366 |
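One way to turn the independent samples output into Step 5 is to write the decision rule out explicitly. The sketch below assumes α = .05 (as in the sample report) and the common textbook convention of reading the "Equal variances not assumed" row when Levene's test is significant; both are assumptions rather than something the assignment states, and the function name `pick_row` is hypothetical.

```python
ALPHA = 0.05  # assumed significance level, matching Step 3 of the sample report

# Values copied from the Independent Samples Test output for OverallSatisfaction.
levene_sig = 0.016
rows = {
    "Equal variances assumed": {"p": 0.904, "ci": (-0.410, 0.363)},
    "Equal variances not assumed": {"p": 0.905, "ci": (-0.414, 0.366)},
}

def pick_row(levene_p, alpha=ALPHA):
    """Choose which output row to read, based on Levene's test (assumed convention)."""
    if levene_p < alpha:
        return "Equal variances not assumed"
    return "Equal variances assumed"

row_name = pick_row(levene_sig)
row = rows[row_name]

reject_null = row["p"] < ALPHA                       # p-value approach
ci_contains_zero = row["ci"][0] <= 0 <= row["ci"][1]  # confidence interval approach

print("Row used:", row_name)
print("Reject H0:", reject_null)
print("95% CI contains 0:", ci_contains_zero)
```

The two approaches should agree: a 95% confidence interval that contains 0 corresponds to failing to reject H0 at α = .05.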
PAIRED
Paired Samples Statistics
| | | Mean | N | Std. Deviation | Std. Error Mean |
|---|---|---|---|---|---|
| Pair 1 | DesiredConvenience | 6.20 | 273 | 1.175 | .071 |
| | ActualConvenience | 4.55 | 273 | 1.636 | .099 |

Paired Samples Correlations
| | | N | Correlation | Sig. |
|---|---|---|---|---|
| Pair 1 | DesiredConvenience & ActualConvenience | 273 | .213 | .000 |

Paired Samples Test
(Mean, Std. Deviation, Std. Error Mean, Lower, and Upper are the Paired Differences columns; Lower and Upper are the bounds of the 95% Confidence Interval of the Difference.)
| | | Mean | Std. Deviation | Std. Error Mean | Lower | Upper | t | df | Sig. (2-tailed) |
|---|---|---|---|---|---|---|---|---|---|
| Pair 1 | DesiredConvenience - ActualConvenience | 1.648 | 1.799 | .109 | 1.434 | 1.863 | 15.140 | 272 | .000 |
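The paired t statistic in this output can be reproduced from the Paired Differences columns: the standard error is the standard deviation of the differences divided by √n, and t is the mean difference divided by that standard error. A small sketch, using only the rounded values printed by SPSS (so the result matches the reported t only up to rounding):

```python
import math

# Values copied from the Paired Samples Test output.
n = 273
mean_diff = 1.648   # mean of DesiredConvenience - ActualConvenience
sd_diff = 1.799     # standard deviation of the paired differences
ci_low, ci_high = 1.434, 1.863

se = sd_diff / math.sqrt(n)   # standard error of the mean difference
t_stat = mean_diff / se       # paired t statistic
df = n - 1                    # degrees of freedom

# A 95% CI that excludes 0 points to a significant difference at alpha = .05.
ci_contains_zero = ci_low <= 0 <= ci_high

print(f"SE = {se:.3f}, t = {t_stat:.2f}, df = {df}")
print("95% CI contains 0:", ci_contains_zero)
```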
PLEASE ANSWER THE TWO REPORTS SEPARATELY: ONE FOR THE INDEPENDENT SAMPLES TEST AND ONE FOR THE PAIRED SAMPLES TEST.
PLEASE FOLLOW THE SAMPLE REPORT STRUCTURE WITH ALL 6 STEPS.
In: Statistics and Probability
A survey found that women's heights are normally distributed with mean 63.8 in and standard deviation 2.4 in. A branch of the military requires women's heights to be between 58 in and 80 in.
a. Find the percentage of women meeting the height requirement. Are many women being denied the opportunity to join this branch of the military because they are too short or too tall?
b. If this branch of the military changes the height requirements so that all women are eligible except the shortest 1% and the tallest 2%, what are the new height requirements?
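Both parts of the height question reduce to normal-distribution lookups: part (a) is a difference of two cumulative probabilities, and part (b) is a pair of inverse lookups at the 1st and 98th percentiles. A minimal sketch using Python's standard library (the variable names are illustrative):

```python
from statistics import NormalDist

heights = NormalDist(mu=63.8, sigma=2.4)  # women's heights in inches

# (a) Proportion of women between 58 in and 80 in.
share_eligible = heights.cdf(80) - heights.cdf(58)

# (b) Cutoffs that exclude the shortest 1% and the tallest 2%.
new_min = heights.inv_cdf(0.01)
new_max = heights.inv_cdf(1 - 0.02)

print(f"(a) eligible: {share_eligible:.2%}")
print(f"(b) new limits: {new_min:.1f} in to {new_max:.1f} in")
```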
In: Statistics and Probability
Monty Hall Problem
Explain the statistical probabilities associated with the game show.
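The probabilities can be checked by enumerating every case exactly. The sketch below assumes the usual rules, which the question does not spell out: three doors, one car placed uniformly at random, the host knows where the car is, always opens a goat door other than the contestant's pick, and chooses at random when two goat doors are available. The function name is hypothetical.

```python
from fractions import Fraction
from itertools import product

def monty_hall_probabilities():
    """Exact win probabilities for staying and switching (standard rules assumed)."""
    doors = range(3)
    stay = Fraction(0)
    switch = Fraction(0)
    for car, pick in product(doors, doors):
        case_weight = Fraction(1, 9)  # car and pick are independent and uniform
        # The host may open any door that is neither the pick nor the car.
        host_options = [d for d in doors if d != pick and d != car]
        for opened in host_options:
            weight = case_weight / len(host_options)
            # The door the contestant would switch to.
            other = next(d for d in doors if d not in (pick, opened))
            if pick == car:
                stay += weight
            if other == car:
                switch += weight
    return stay, switch

p_stay, p_switch = monty_hall_probabilities()
print("P(win | stay)   =", p_stay)
print("P(win | switch) =", p_switch)
```

Under these assumptions, switching wins exactly when the first pick was a goat, which is why the two probabilities are not equal.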
In: Statistics and Probability
A researcher wishes to estimate the proportion of adults who have high-speed Internet access. What size sample should be obtained if she wishes the estimate to be within 0.04 with 95% confidence if
(a) she uses a previous estimate of 0.28?
(b) she does not use any prior estimates?
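The usual sample-size formula for a proportion is n = z² · p(1 − p) / E², rounded up, with p = 0.5 as the conservative choice when no prior estimate is available. A short sketch of that calculation (the function name is hypothetical):

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(margin, confidence, p_hat=0.5):
    """Smallest n so the estimate is within `margin` at the given confidence level."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided critical value
    return math.ceil(z ** 2 * p_hat * (1 - p_hat) / margin ** 2)

n_with_prior = sample_size_for_proportion(0.04, 0.95, p_hat=0.28)  # part (a)
n_no_prior = sample_size_for_proportion(0.04, 0.95)                # part (b)

print("(a)", n_with_prior)
print("(b)", n_no_prior)
```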
In: Statistics and Probability
The tread life (x) of tires follow normal distribution with µ = 60,000 and σ= 6000 miles. The manufacturer guarantees the tread life for the first 52,000 miles. (i) What proportion of tires last at least 55,000 miles? (ii) What proportion of the tires will need to be replaced under warranty? (iii) If you buy 36 tires, what is the probability that the average life of your 36 tires will exceed 61,000? (iv) The manufacturer is willing to replace only 3% of its tires under a warranty program involving tread life. Find the tread life covered under the warranty.
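All four parts of the tread-life question are normal-distribution calculations; part (iii) uses the sampling distribution of the mean, whose standard deviation is σ/√n. A minimal sketch with illustrative variable names:

```python
from statistics import NormalDist

tread = NormalDist(mu=60_000, sigma=6_000)  # tread life of one tire, in miles

# (i) Proportion of tires lasting at least 55,000 miles.
p_at_least_55k = 1 - tread.cdf(55_000)

# (ii) Proportion failing before the 52,000-mile guarantee.
p_warranty = tread.cdf(52_000)

# (iii) Mean life of 36 tires: same mean, standard deviation sigma / sqrt(n).
n_tires = 36
mean_of_36 = NormalDist(mu=60_000, sigma=6_000 / n_tires ** 0.5)
p_mean_over_61k = 1 - mean_of_36.cdf(61_000)

# (iv) Mileage below which only 3% of tires fall.
warranty_miles = tread.inv_cdf(0.03)

print(f"(i) {p_at_least_55k:.4f}  (ii) {p_warranty:.4f}")
print(f"(iii) {p_mean_over_61k:.4f}  (iv) {warranty_miles:.0f} miles")
```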
In: Statistics and Probability
In an effort to reduce energy costs, a major university has installed more efficient lights as well as automatic sensors that turn the lights off when no movement is present in a room. Historically, the cost of lighting an average classroom for 1 week has been $265. To determine whether the changes have significantly reduced costs, the university takes a sample of 50 classrooms. They find that the average cost for 1 week is $247 with a standard deviation of $60. When testing the hypothesis (at the 5% level of significance) that the average energy use has decreased from the past, what is the test statistic? (Please round your answer to 2 decimal places.)
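Because the population standard deviation is unknown, the test statistic here is the one-sample t statistic, t = (x̄ − μ0) / (s / √n). A quick sketch of the arithmetic:

```python
import math

mu_0 = 265     # historical weekly cost per classroom ($)
x_bar = 247    # sample mean weekly cost ($)
s = 60         # sample standard deviation ($)
n = 50         # number of classrooms sampled

standard_error = s / math.sqrt(n)
t_stat = (x_bar - mu_0) / standard_error  # negative because costs went down
df = n - 1

print(f"t = {t_stat:.2f} with {df} degrees of freedom")
```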
In: Statistics and Probability
You bet your friend you can make 3 free throws. If you do it you get 10 dollars, if you don’t you lose 10 dollars. The probability that you make any given free throw is 60%. What is the expected value of this game?
What is the probability that you make 6 out of 10 free throws?
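A sketch of both calculations follows. It assumes the bet means making all 3 of 3 attempts and that the shots are independent; the question leaves both points open, so treat those as assumptions.

```python
from math import comb

p_make = 0.60

# Expected value of the bet: win $10 if all 3 shots go in, lose $10 otherwise.
p_win = p_make ** 3
expected_value = 10 * p_win + (-10) * (1 - p_win)

# Binomial probability of exactly 6 makes in 10 independent attempts.
p_six_of_ten = comb(10, 6) * p_make ** 6 * (1 - p_make) ** 4

print(f"P(win bet) = {p_win:.3f}, expected value = ${expected_value:.2f}")
print(f"P(6 of 10) = {p_six_of_ten:.4f}")
```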
In: Statistics and Probability
One way to understand the strength of the correlation coefficient is to take the absolute value and then see how close it is to 1. The reason that makes it easier for some people to understand is that they get confused by the negative sign. The closer it is to -1 or +1, the stronger the correlation, as it doesn't consider the direction of the correlation. However, many students get confused by the negative aspect and assume that a positive correlation coefficient is stronger. Therefore, if you refer to the absolute value, that will eliminate any confusion regarding the negative or positive sign. On the topic of a representative sample, it's debatable how many data points you actually need. At what point do you think it's necessary to look at every data point, compared to a representative sample? When would you feel comfortable that your sample truly represents the population?
In: Statistics and Probability
A) Mean sales per week exceeds 41.5 per salesperson.
H0: μ = 41.5
Ha: μ > 41.5
x̄ = 51.02, n = 100, μ0 = 41.5, √100 = 10
t = (51.02 − 41.5) / (6.8 / √100)
Excel command = T.DIST.RT(t-1); p-value close to 0
C.I. = x̄ ± t · s/√n, with x̄, t = 1.96, s = 6.8, n = 100:
51.5 ± (1.96)(6.8 / √100) = 51.5 ± 1.3328 = (50.1672, 52.8328), so the 95% C.I. is (50.17, 52.83)
I found the information above. What is the F-value?
In: Statistics and Probability
PROJECT 3 INSTRUCTIONS
Based on Brase & Brase: sections 6.1-6.3
Visit the NASDAQ historical prices weblink. First, set the date range to be for exactly 1 year ending on the Monday that this course started. For example, if the current term started on April 1, 2018, then use April 1, 2017 – March 31, 2018. (Do NOT use these dates. Use the dates that match up with the current term. MY COURSE STARTED ON JANUARY 14, 2018.) Do this by clicking on the blue dates after “Time Period”. Next, click the “Apply” button. Next, click the link on the right side of the page that says “Download Data” to save the file to your computer.
This project will only use the Close values. Assume that the closing prices of the stock form a normally distributed data set. This means that you need to use Excel to find the mean and standard deviation. Then, use those numbers and the methods you learned in sections 6.1-6.3 of the course textbook for normal distributions to answer the questions. Do NOT count the number of data points.
Complete this portion of the assignment within a single Excel file. Show your work or explain how you obtained each of your answers. Answers with no work and no explanation will receive no credit.
1. a) Submit a copy of your dataset along with a file that contains your answers to all of the following questions. b) What are the mean and standard deviation (SD) of the Close column in your data set? c) If a person bought 1 share of Google stock within the last year, what is the probability that the stock on that day closed at less than the mean for that year? Hint: You do not need to calculate the mean to answer this one. The probability would be the same for any normal distribution. (5 points)
2. If a person bought 1 share of Google stock within the last year, what is the probability that the stock on that day closed at more than $950? (5 points)
3. If a person bought 1 share of Google stock within the last year, what is the probability that the stock on that day closed within $50 of the mean for that year (between $50 below and $50 above the mean)? (5 points)
4. If a person bought 1 share of Google stock within the last year, what is the probability that the stock on that day closed at less than $800 per share? Would this be considered unusual? Use the definition of unusual from the course textbook that is measured as a number of standard deviations. (5 points)
5. At what prices would Google have to close in order for it to be considered statistically unusual? You will have a low and a high value. Use the definition of unusual from the course textbook that is measured as a number of standard deviations. (5 points)
6. What are Quartile 1, Quartile 2, and Quartile 3 in this data set? Use Excel to find these values. This is the only question that you must answer without using anything about the normal distribution. (5 points)
7. Is the normality assumption that was made at the beginning valid? Why or why not? Hint: Does this distribution have the properties of a normal distribution as described in the course textbook? Real data sets are never perfect; however, it should be close. One option would be to construct a histogram like you did in Project 1 to see if it has the right shape. Something in the range of 10 to 12 classes is a good number. (5 points)
There are also 5 points for miscellaneous items like correct date range, correct mean, correct SD, etc.
In: Statistics and Probability
Please solve in R
The data below were collected in petri dishes, with each dish having the given concentration of Cadmium Chloride (X) in solution, and the growth of algae cells in the dish was recorded after two weeks’ time. Input the data into R, creating a data frame. Provide the commands you used and print your data frame to the screen to show what it looks like. Then construct a scatter plot of the data, with a title and well-labeled axes (include units of measurement somewhere on the plot). Then, add a smooth fitted quadratic curve given by the equation
126.760 + (-4.47126)*x + (0.0369577)*x^2
Please make sure that all points and the entire curve fit on the plot, and provide the commands you used to construct your final graph, as well as the graph itself.
Y = Algae growth (cells x 10^4/ml)
X = Cadmium Chloride (microgm/l)
X Y
0 120.9
0 118.0
0 134.0
5 121.2
5 118.6
5 120.4
10 82.6
10 62.8
10 81.6
20 49.3
20 41.6
20 41.3
40 12.7
40 14.7
40 4.7
80 4.9
80 4.0
80 4.4
In: Statistics and Probability
Throughout the course, you have studied and used tools and techniques that have underlying statistical theory and assumptions. Regression is no different. Haphazard application of regression analysis, as with any type of statistical technique, can lead to results that are inaccurate and that, even worse, can get you or your employer into trouble (whether that trouble involves product faults, legal issues, or simply wasted time and money). Thus, you must always be cognizant of the conditions of the problem as they relate to the assumptions and theory associated with your application of regression techniques.
Regression analysis is a statistical procedure, and it requires that certain assumptions be satisfied if you are to correctly interpret the results.
1. Which assumptions, if violated, can cause the greatest bias in the results of the regression analysis? Why?
In: Statistics and Probability
In an exclusive suburb of Chicago, 55% of the families are members of the golf course, 40% are members of the tennis club, and 15% are members of both the golf course and tennis club.
In: Statistics and Probability
****MUST BE FAMILIAR WITH R STUDIO PROGRAMMING*****
Random samples of resting heart rates are taken from two groups. Population 1 exercises regularly, and population 2 does not. The data from these two samples (in beats per minute) are given below:
Exercise group (sample from population 1): 62.4, 64.1, 66.8, 60.7, 68.2, 69.2, 64.9, 70.9, 67.7, 68, 58.5, 58.9, 64.7
No exercise group (sample from population 2): 79.3, 73.8, 75.3, 74.7, 76.9, 74.9, 73.2, 75.7, 75.2, 76.7, 78.7
Estimate the difference in mean resting heart rates between the two groups using a 97% confidence interval.
Using a complete test of hypothesis at an α = 0.03 level of significance, is there evidence to conclude that those who exercise regularly have lower resting heart rates?
Use the approximate value for the degrees of freedom (the smaller of n1 − 1 and n2 − 1).
(a) Create two vectors to store the data. Find, store and display the mean, standard deviation and number of observations for each sample.
(b) Draw a properly labelled boxplot for each sample. Comment on the validity of the assumption of normality for these data.
(c) Construct a confidence interval for the difference in mean resting heart rates between the two groups. Assume normality, and use the level of confidence given to you in WW. Does the confidence interval support a difference between the two means? Explain.
(d) Test the claim that those who exercise regularly have lower resting heart rates than those who do not. Use the level of significance given to you in WW. You must include: i) null and alternative hypotheses; ii) test statistic; iii) P-value; iv) decision in term of the null hypothesis; v) decision in context.
(e) Create a properly labelled plot that includes the sampling distribution of the statistic under the null hypothesis, the value of the statistic as a vertical line, and the P-value.
In: Statistics and Probability
Segment profitability Calculation
(Please show both the calculation process and the final answer)
Q1 What is the margin (dollar value) of each segment (Experientials, Indulgents, and Frugals) per customer per year for Red Lobster, according to the table below?
(Hint: food margin ($) for each customer + alcohol margin ($) for each customer)
Q2 Which segment is the most profitable and should be targeted, according to the results in Q1?
Q3 Calculate each segment’s total margin change ($) if Red Lobster gains 2000 new unique Experiential customers but loses 1000 Indulgents and 1000 Frugals.
Q4 Calculate the restaurant-level total margin change ($) if Red Lobster gains 2000 new unique Experiential customers but loses 1000 Indulgents and 1000 Frugals.
| | Experientials | Indulgents | Frugals |
|---|---|---|---|
| % of unique customers | 23% | 24% | 28% |
| Meals/year/customer | 6.3 | 5.6 | 3.8 |
| Total spend/meal/customer ($) | 24.88 | 18.78 | 14.86 |
| % spend on food | 88% | 96% | 99% |
| % spend on alcohol | 12% | 4% | 1% |
| % Margin on food | 67% | 67% | 67% |
| % Margin on alcohol | 81% | 81% | 81% |
| Margin for each segment per customer per year ($) | ??? | ??? | ??? |
| Change in the number of customers | 2000 | -1000 | -1000 |
| Margin Change in each Segment ($) | ??? | ??? | ??? |
| Total Restaurant Level Margin Change ($) | ??? | | |
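Following the hint in Q1, the margin per customer per year can be computed as meals per year × spend per meal × (food share × food margin + alcohol share × alcohol margin). The sketch below applies that reading of the table; the dictionary layout and variable names are illustrative, not part of the original problem.

```python
FOOD_MARGIN = 0.67
ALCOHOL_MARGIN = 0.81

# Per-segment inputs copied from the table.
segments = {
    "Experientials": {"meals": 6.3, "spend": 24.88, "food": 0.88, "alcohol": 0.12, "change": 2000},
    "Indulgents":    {"meals": 5.6, "spend": 18.78, "food": 0.96, "alcohol": 0.04, "change": -1000},
    "Frugals":       {"meals": 3.8, "spend": 14.86, "food": 0.99, "alcohol": 0.01, "change": -1000},
}

margin_per_customer = {}
margin_change = {}
for name, s in segments.items():
    yearly_spend = s["meals"] * s["spend"]
    # Q1: food margin plus alcohol margin, per customer per year.
    blended_margin = s["food"] * FOOD_MARGIN + s["alcohol"] * ALCOHOL_MARGIN
    margin_per_customer[name] = yearly_spend * blended_margin
    # Q3: margin change from gaining or losing customers in this segment.
    margin_change[name] = margin_per_customer[name] * s["change"]

most_profitable = max(margin_per_customer, key=margin_per_customer.get)  # Q2
total_change = sum(margin_change.values())                               # Q4

for name in segments:
    print(f"{name}: {margin_per_customer[name]:.2f} per customer, change {margin_change[name]:,.2f}")
print("Most profitable segment:", most_profitable)
print(f"Total restaurant-level margin change: {total_change:,.2f}")
```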
In: Statistics and Probability