Question

  1. 2.194 Comparing Global Internet Connections. In Exercise 2.120 we discuss a study in which the Nielsen Company measured connection speeds on home computers in nine different countries in order to determine whether connection speed affects the amount of time consumers spend online. Table 2.29 (by exercise 2.194 in the textbook) shows the percent of Internet users with a “fast” connection (defined as 2Mb or faster) and the average amount of time spent online, defined as total hours connected to the Web from a home computer during the month of February 2011. The data are also available in the dataset GlobalInternet. We are interested in looking at whether the percent of Internet users with a fast internet connection predicts the number of hours spent online.
    1. What is the explanatory variable? What is the response variable? Is each categorical or quantitative?




    2. What would a positive association mean between these two variables? Briefly explain why a positive relationship might make sense in this context.








    3. What would a negative association mean between these two variables? Briefly explain why a negative relationship might make sense in this context.






    4. Make a scatterplot with regression line, using connection speed as the explanatory variable and time online as the response variable. Include a screen shot of your Statkey output with your homework submission. What is the correlation?

    5. Are there any outliers? If so, indicate the country associated with each outlier and describe the characteristics that make it an outlier for the scatterplot.
    6. Eliminate the outliers you identified, and generate another scatterplot with regression line. Is the correlation strongly affected by the outliers?



    7. Find the regression equation for these variables with the outliers removed. Include a screen shot of your Statkey scatterplot with regression line with your homework submission, and write the regression equation here.
    8. Interpret the slope of the regression line in context. Remember that an interpretation must include a number and units, in a sentence that explains the meaning.
    9. Use the regression line to predict the number of hours spent online in February 2011 in a country in which 65% of Internet users have a fast connection. Show your calculations.

Solutions

Expert Solution

a)

An explanatory variable is the independent variable.

A response variable is the dependent variable.

Here the response variable is hours online and the explanatory variable is the percent of users with a fast connection.

A categorical variable takes category or label values and places each individual into one of several groups.

A quantitative variable takes numerical values.

Both variables are quantitative.

b)

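A positive association means that as one variable increases, the other tends to increase as well (and as one decreases, the other tends to decrease).

Here, countries with a higher percent of fast connections would tend to have more hours online. This might make sense because a faster connection makes time online more enjoyable, so people may choose to spend more of it.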

c)

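A negative association means that as one variable increases, the other tends to decrease (and as one decreases, the other tends to increase).

Here, countries with a higher percent of fast connections would tend to have fewer hours online. This might make sense because a faster connection lets people finish what they need to do online in less time.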

d)

We create the scatterplot in Minitab with the following steps:

Enter data into Minitab > Graph > Scatterplot > Simple > OK > Y variable: time online > X variable: connection speed > OK

[Scatterplot of time online versus connection speed; image not reproduced]

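From the scatterplot we can observe that most of the countries follow an upward (positive) trend, but two extreme points run against it.

With all countries included, the correlation is -0.64 (computed in part f), so the overall relationship is negative.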
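The correlation and regression line that the software reports can also be checked by hand. The following is a minimal Python sketch, not the Minitab or StatKey procedure itself. The function names `pearson_r` and `least_squares_line` are made up for this illustration, and the numbers are placeholder values, not the Table 2.29 data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Placeholder values for illustration only; NOT the Table 2.29 data.
percent_fast = [60, 65, 70, 75]   # explanatory variable (x)
hours_online = [22, 25, 25, 28]   # response variable (y)

r = pearson_r(percent_fast, hours_online)
slope, intercept = least_squares_line(percent_fast, hours_online)
print(f"r = {r:.2f}")
print(f"predicted hours = {intercept:.2f} + {slope:.2f} * percent_fast")
```

To use it on the exercise, replace the two placeholder lists with the columns from the GlobalInternet dataset.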

e)

There are two outliers: the two extreme points of the scatterplot.

These are the first and last countries in the table.

They are outliers because they lie very far from the other data points.

So Switzerland and Brazil are the outliers.

f)

If we eliminate the outliers from the scatterplot, there is a positive relationship between the two variables.

We have to check whether the correlation is affected by the outliers.

Let us find the correlation using Minitab:

Enter data into Minitab worksheet > Stat > Basic Statistics > Correlation > Variables: select both variables > leave "Display p-values" unchecked > OK

Correlations: Percent Fast Connection, Hours Online

Correlation coefficient of Percent Fast Connection and Hours Online = -0.64

This is the correlation with the outliers included.

Correlation without the outliers = 0.95

The two correlation coefficients differ substantially: the sign changes from negative to positive, and the strength increases.

Yes, the correlation coefficient is strongly affected by the outliers.
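The effect described above, where a few extreme points reverse the sign of the correlation, can be reproduced with a small Python sketch. The data below are made-up placeholder values chosen to show the pattern, not the Table 2.29 data. The country labels "A" to "F" and the helper `pearson_r` are likewise hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Placeholder values for illustration only; NOT the Table 2.29 data.
# Each tuple is (country label, percent fast connection, hours online).
rows = [
    ("A", 20, 32),   # extreme point: very low x, high y
    ("B", 60, 22),
    ("C", 65, 25),
    ("D", 70, 25),
    ("E", 75, 28),
    ("F", 95, 18),   # extreme point: very high x, low y
]
outliers = {"A", "F"}  # points judged to be outliers from the scatterplot

# Correlation using every row.
all_x = [x for _, x, _ in rows]
all_y = [y for _, _, y in rows]
r_all = pearson_r(all_x, all_y)

# Correlation after dropping the rows flagged as outliers.
kept_x = [x for name, x, _ in rows if name not in outliers]
kept_y = [y for name, _, y in rows if name not in outliers]
r_trimmed = pearson_r(kept_x, kept_y)

print(f"with outliers:    r = {r_all:.2f}")
print(f"without outliers: r = {r_trimmed:.2f}")
```

With so few data points, one or two unusual points carry a large share of the sums in the correlation formula, which is why removing them can change both the sign and the strength.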

Please post the remaining parts separately.

