Question



  1. Use the Galton dataset from the mosaicData package in RStudio

library(mosaic)

     a. Create a scatter plot to show the relationship between height and father’s height (x=father, y=height)

     b. What relationship do you see? (Use comments to write in your R Markdown file)

     c. Separate your plot into facets by sex

     d. Add a regression line using the “lm” method to both of your facets

     e. Generate a box plot of height by sex.

  2. Use the RailTrail data from the mosaicData package

library(mosaic)

     a. Generate a scatter plot to show the relationship between the number of crossings per day (volume) and the high temperature that day (hightemp)

     b. Separate your plot into facets by weekday

Solutions

Expert Solution

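library(mosaic)    # attaches mosaicData, which contains Galton
library(ggplot2)
head(Galton)

# Part (a): scatter plot of height against father's height
p <- ggplot(Galton, aes(x = father, y = height)) + geom_point(shape = 1)
p

# Part (b): the scatter plot shows a positive, roughly linear association
# between father's height and child's height, although the points are
# widely scattered around the trend.

# Part (c): one facet per sex
p + facet_grid(sex ~ .)

# Part (d): add a least-squares regression line to each facet
p + facet_grid(sex ~ .) + geom_smooth(method = "lm")

# Part (e): box plot of height by sex
ggplot(Galton, aes(x = sex, y = height)) + geom_boxplot(notch = TRUE)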
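
The relationship described in part b can also be checked numerically. The following is a minimal sketch, not part of the original answer: it computes the Pearson correlation between father's height and height, then fits a separate least-squares line for each sex, which is what geom_smooth(method="lm") draws within each facet. The names r_father and fits are chosen here for illustration only.

```r
library(mosaicData)  # provides the Galton data

# Pearson correlation between father's height and child's height;
# a positive value supports the reading of the scatter plot in part b.
r_father <- cor(Galton$father, Galton$height)
r_father

# Fit height ~ father separately for each level of sex and keep the
# intercept and slope of each fit (one entry per facet).
fits <- lapply(
  split(Galton, Galton$sex),
  function(d) coef(lm(height ~ father, data = d))
)
fits
```

A slope with the same sign in both groups indicates that the trend seen in the combined plot holds within each facet, not only across them.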


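library(mosaic)    # attaches mosaicData, which contains RailTrail
library(ggplot2)
head(RailTrail)

# Part (a): scatter plot of daily crossings (volume) against high temperature
p <- ggplot(RailTrail, aes(x = hightemp, y = volume)) + geom_point(shape = 1)
p

# Part (b): one facet per value of weekday
p + facet_grid(weekday ~ .)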
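
If the facet strips are hard to read, one possible variation is sketched below. It is not part of the original answer. It assumes weekday is stored as a TRUE/FALSE indicator, so the default strips would show only the bare value; label_both prefixes each strip with the variable name. The object name p_rail and the axis label text are illustrative choices.

```r
library(mosaicData)  # provides the RailTrail data
library(ggplot2)

# Same scatter plot as part a, with facets placed side by side and
# strips labelled "weekday: <value>" instead of the bare value.
p_rail <- ggplot(RailTrail, aes(x = hightemp, y = volume)) +
  geom_point(shape = 1) +
  facet_wrap(~ weekday, labeller = label_both) +
  labs(x = "Daily high temperature", y = "Trail crossings per day")
p_rail
```

facet_wrap(~ weekday) places the panels next to each other on a shared y-axis, which makes the two groups easier to compare than the stacked layout from facet_grid(weekday ~ .).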



