Question

Homework 5, MAT 327/782, Fall 2018
For the R computations, submit the R commands you used and their output, either by taking a screenshot or by copying them into a text file. Submit your graph as a pdf or image file.
All graphs should have labeled axes and a title.
1. The R dataset nhtemp contains the mean annual temperature (in F) in New Haven, Connecticut from 1912 to 1971.
(a) In R, compute the mean, sample standard deviation, sample skewness, and sample kurtosis of nhtemp.
(b) What can you say about the spread and shape of the nhtemp data using the information from part (a)? Hint: Remember Chebychev’s rule and rules of thumb.
(c) Plot a histogram and boxplot of nhtemp. Are these plots what you expected from part (b)? Why or why not?
(d) nhtemp is a time series. Plot it as an index plot. Does this plot give any new information? Do the histogram and/or boxplot show anything about the data that is not seen or hard to see in the index plot?
2. The R dataset co2 contains 468 measurements of the amount of carbon dioxide (CO2) in the atmosphere. The measurements were taken monthly from 1959 to 1997, and are in parts per million (ppm).
(a) In R, compute the mean, sample standard deviation, sample skewness, and sample kurtosis of co2.
(b) What can you say about the spread and shape of the co2 data using the information from part (a)? Hint: Remember Chebychev’s rule and rules of thumb.
(c) Plot a histogram and boxplot of co2. Are these plots what you expected from part (b)? Why or why not?
(d) co2 is a time series. Plot it as an index plot. Does this plot give any new information? Do the histogram and/or boxplot show anything about the data that is not seen or hard to see in the index plot?
3. (a) The R dataset islands contains the areas (in 1000s of square miles) of landmasses more than 10,000 square miles. Plot the boxplot. What are the outliers?
(b) Classify the outliers in the islands dataset as potential or suspected.
4. For MAT 782 only. Prove that skewness is location and scale independent. That is, show that for data $x_1, x_2, \ldots, x_n$ and for any non-zero constants $c, a \in \mathbb{R}$, if $y_i = c x_i + a$ for all $1 \le i \le n$, then
$$\frac{\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^3}{s^3} = \frac{\frac{1}{n}\sum_{i=1}^{n}(y_i - \bar{y})^3}{s_y^3}$$
where $\bar{y}$ is the mean and $s_y$ is the standard deviation of $y_1, y_2, \ldots, y_n$.

Solutions

Expert Solution

1) The R-Code to read the data:

year = c(1912:1971)
dt = data.frame(cbind(year,nhtemp))

1.a) The mean, sample standard deviation, sample skewness, and sample kurtosis of nhtemp are obtained as:

mean(nhtemp)
51.16

sd(nhtemp)
1.265608

library(e1071)
skewness(nhtemp)
-0.07178758

kurtosis(nhtemp)
0.3832752
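
As a cross-check that does not need the e1071 package, the two shape statistics can be sketched in base R. This assumes the definitions e1071 uses by default (its "type 3"): the third or fourth central moment divided by a power of the sample standard deviation, with 3 subtracted for kurtosis so that a normal distribution scores 0. The function names sample_skewness and sample_kurtosis are illustrative and not part of any package.

```r
# Sample skewness: (1/n) * sum((x - mean)^3) / s^3, with s = sd(x) (n - 1 divisor)
sample_skewness <- function(x) {
  x <- as.numeric(x)
  m3 <- mean((x - mean(x))^3)
  m3 / sd(x)^3
}

# Sample excess kurtosis: (1/n) * sum((x - mean)^4) / s^4 - 3
sample_kurtosis <- function(x) {
  x <- as.numeric(x)
  m4 <- mean((x - mean(x))^4)
  m4 / sd(x)^4 - 3
}

sample_skewness(nhtemp)
sample_kurtosis(nhtemp)
```

If the assumption about the default type holds, these two calls should agree with the skewness() and kurtosis() output above.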

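1.b) Spread: the standard deviation is about 1.27 F around a mean of 51.16 F. By Chebyshev's rule, at least 75% of the values lie within 2 standard deviations of the mean (48.63 F to 53.69 F) and at least 89% lie within 3 standard deviations (47.36 F to 54.96 F). Shape: the skewness is -0.07178758, which is very close to 0, so the distribution of the mean annual temperature in New Haven, Connecticut is approximately symmetric with at most a very slight left skew. The kurtosis reported by e1071 is excess kurtosis; its value of 0.3832752 is slightly positive, so the distribution is mildly leptokurtic (slightly heavier tails than a normal distribution).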
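
For the spread part of the question, Chebyshev's rule can be compared directly with the data. The following is a minimal sketch; the variable names are illustrative, and it only uses the nhtemp dataset that ships with R.

```r
m <- mean(nhtemp)
s <- sd(nhtemp)

# For k = 2 and k = 3, compare the Chebyshev lower bound 1 - 1/k^2
# with the proportion of observations actually inside mean +/- k * sd
cheb <- sapply(2:3, function(k) {
  lower <- m - k * s
  upper <- m + k * s
  c(k = k,
    lower = lower,
    upper = upper,
    chebyshev_min = 1 - 1 / k^2,
    observed = mean(nhtemp >= lower & nhtemp <= upper))
})

print(round(t(cheb), 3))
```

The observed proportions can never fall below the Chebyshev bounds; how far they exceed them gives a rough sense of how concentrated the data are around the mean.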

1.c) The histogram and boxplot of nhtemp are given by:

hist(nhtemp, main = paste("Histogram of the Mean Annual Temperature (in F)
in New Haven, Connecticut"), xlab = "Mean Annual Temperature")

Thus, the histogram shows a roughly symmetric distribution with at most a slight left skew and slightly heavy tails (leptokurtic), as expected from part (b).

boxplot(nhtemp, main = paste("Boxplot of the Mean Annual Temperature (in F)
in New Haven, Connecticut"), ylab = "Mean Annual Temperature")

From the boxplot we observe 4 outliers in total: 2 lie below the lower fence (first quartile minus 1.5 times the interquartile range) and the other 2 lie above the upper fence (third quartile plus 1.5 times the interquartile range). The box is shifted slightly upwards, which is consistent with a slight left skew, and having outliers on both sides is consistent with slightly heavy tails (leptokurtic).
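
The outliers drawn by boxplot() can also be listed numerically. This is a sketch using base R's boxplot.stats(), which applies the same 1.5 times hinge-spread rule as the plot; the variable names are illustrative.

```r
stats <- boxplot.stats(nhtemp)        # default coef = 1.5, same rule as boxplot()
hinges <- stats$stats[c(2, 4)]        # lower and upper hinge (approximately Q1 and Q3)
spread <- diff(hinges)                # hinge spread, used as the IQR

lower_fence <- hinges[1] - 1.5 * spread
upper_fence <- hinges[2] + 1.5 * spread

# Years and temperatures of the observations outside the fences
outlier_idx <- which(nhtemp < lower_fence | nhtemp > upper_fence)
data.frame(year = time(nhtemp)[outlier_idx],
           temp = as.numeric(nhtemp)[outlier_idx])
```

Listing the years alongside the values also shows where in the series the extreme temperatures occur, which the boxplot alone cannot show.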

1.d) The index plot of nhtemp is given by:

plot(year, nhtemp, type = "l", main = paste("Index Plot of the Mean Annual Temperature (in F)
in New Haven, Connecticut"), xlab = "Year", ylab = "Mean Annual Temperature")

The index plot shows the trend or pattern of the mean annual temperature over time, whereas the boxplot and histogram show only the distribution of the mean annual temperature.

2.a) The mean, sample standard deviation, sample skewness, and sample kurtosis of co2 are obtained as:

mean(co2)
337.0535

sd(co2)
14.96622

library(e1071)
skewness(co2)
0.2419156

kurtosis(co2)
-1.223013

2.b) Spread: the standard deviation is about 14.97 ppm around a mean of 337.05 ppm, so by Chebyshev's rule at least 75% of the measurements lie between 307.12 ppm and 366.99 ppm (within 2 standard deviations of the mean). Shape: the skewness of the amount of carbon dioxide (CO2) in the atmosphere is 0.2419156, which indicates that the distribution is mildly skewed to the right. The excess kurtosis is -1.223013, which indicates that the distribution is platykurtic (flatter, with lighter tails than a normal distribution).

2.c) The histogram and boxplot of co2 are given by:

hist(co2, main = paste("Histogram of the Amount of Carbon Dioxide
(CO2) in the atmosphere"), xlab = "Amount of Carbon Dioxide")

Thus, the histogram shows that the distribution is slightly skewed to the right and is platykurtic.

boxplot(co2, main = paste("Boxplot of the Amount of Carbon Dioxide
(CO2) in the atmosphere"), ylab = "Amount of Carbon Dioxide (ppm)")

From the boxplot we observe that there are no outliers. The box sits slightly toward the lower end of the range, which is consistent with a mild right skew, and the absence of outliers is consistent with light tails (platykurtic).

2.d) The index plot of co2 is given by:

plot(co2, type = "l", main = paste("Index Plot of the Amount of Carbon Dioxide
(CO2) in the atmosphere"), xlab = "Years", ylab = "Amount of Carbon Dioxide (CO2)")

The index plot shows the trend or pattern of the amount of carbon dioxide (CO2) in the atmosphere over time, whereas the boxplot and histogram show only the distribution of the amount of carbon dioxide (CO2).
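
One way to make the time-ordered pattern concrete is to summarise co2 year by year. The sketch below uses aggregate() on the monthly series to get one mean and one within-year range per year; the variable names are illustrative.

```r
# co2 is a monthly ts object, so nfrequency = 1 collapses it to one value per year
annual_mean <- aggregate(co2, nfrequency = 1, FUN = mean)

# Within-year range (max minus min) as a rough measure of the seasonal swing
seasonal_range <- aggregate(co2, nfrequency = 1,
                            FUN = function(v) max(v) - min(v))

range(annual_mean)                    # how far the yearly level moves over the record
summary(as.numeric(seasonal_range))   # typical size of the within-year swing
```

Comparing the change in the annual means with the size of the within-year swing separates the long-run level from the seasonal cycle, both of which the histogram and boxplot mix together.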

3.a) The boxplot of islands is given by:

boxplot(islands, main = paste("Boxplot of the Areas in thousands of square miles of the landmasses"), ylab = "Area (1000s of square miles)")

From the boxplot we observe that there are many outliers, all on the high side. The box is compressed near the bottom of the plot, indicating a strong right skew with a long, heavy right tail.

The outliers (areas in thousands of square miles) are:

Africa: 11506, Antarctica: 5500, Asia: 16988, Australia: 2968, Europe: 3745, Greenland: 840, North America: 9390, South America: 6795.
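
For part 3.b, the outliers can be split by how far they lie beyond the box. The sketch below assumes the common convention of an inner fence at 1.5 times the IQR and an outer fence at 3 times the IQR from the hinges; which band the course calls "potential" and which "suspected" is not stated in the problem, so the labels here are descriptive only and should be mapped to the course definitions.

```r
# Hinges as used by boxplot(); unname() drops any island names carried along
hinges <- unname(fivenum(islands))[c(2, 4)]
spread <- diff(hinges)

inner_fence <- c(hinges[1] - 1.5 * spread, hinges[2] + 1.5 * spread)
outer_fence <- c(hinges[1] - 3 * spread,   hinges[2] + 3 * spread)

# Observations outside the inner fences are the boxplot outliers
flagged <- sort(islands[islands < inner_fence[1] | islands > inner_fence[2]],
                decreasing = TRUE)

# Assumed convention: label each outlier by the fence it exceeds
band <- ifelse(flagged < outer_fence[1] | flagged > outer_fence[2],
               "beyond outer fence (3 x IQR)",
               "between inner and outer fence")

data.frame(area = flagged, band = band)
```

The resulting table lists each flagged landmass with its area and the band it falls in, which is enough to classify every outlier once the course terminology is fixed.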

