The data can be found in the potuse dataset (faraway package).
The National Youth Survey collected a sample of 11-17 year-olds, 117 boys and 120 girls, and asked questions about marijuana usage. This data is actually longitudinal: the same boys and girls are followed for five years. However, for the purposes of this question, imagine that the data is cross-sectional, that is, a different sample of boys and girls is drawn each year. Build a model for the different levels of marijuana usage, describing the trend over time and the difference between the sexes.
Use R code and interpret the results.
library(faraway)
library(tidyverse)
data1 = potuse
# Draw interaction plots for the predictors named in `formula`
# (e.g. "y ~ a + b"; intended for two predictors): each predictor appears
# once on the x-axis and once as the shape/linetype grouping.
gg_interaction_plot <- function(data, formula) {
  formula <- as.formula(formula)
  y_var <- as.character(formula[2])
  # Split the right-hand side "a + b" into predictor names
  x_vars <- as.character(formula[3]) %>%
    str_split(" \\+ ") %>%
    unlist()
  data <- mutate_at(data, x_vars, as.factor)
  shp_vars <- rev(x_vars)
  map2(x_vars, shp_vars,
       ~ ggplot(data, aes_(y = as.name(y_var), x = as.name(..1),
                           shape = as.name(..2))) +
         geom_point(position = position_jitter(width = .1)) +
         # `fun` replaces the deprecated `fun.y` argument (ggplot2 >= 3.3.0)
         stat_summary(fun = "mean", geom = "line",
                      aes_(group = as.name(..2), linetype = as.name(..2))) +
         scale_shape_manual(values = 15:25) +
         theme(legend.position = "top", legend.direction = "horizontal")) %>%
    cowplot::plot_grid(plotlist = ., ncol = 2)
}
# With no family argument, glm() fits a Gaussian (identity-link) model to the counts
glm(count ~ ., data = data1)
Coefficients:
(Intercept)          sex      year.76      year.77      year.78      year.79      year.80
    4.99794      0.01646     -0.66049     -0.55247     -0.42284     -0.35494     -0.27778
Degrees of Freedom: 485 Total (i.e. Null); 479 Residual
Null Deviance: 6591
Residual Deviance: 6227 AIC: 2635
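
The question asks for a model of the usage levels themselves, with a trend over time and a sex difference, treating each year as a separate cross-section. One possible way to set that up is sketched below. It is not the fit shown above: it swaps in a proportional odds model via MASS::polr. The helper names potuse_long and fit_usage_trend are hypothetical, and the codings (sex: 1 = male, 2 = female; usage: 1 = never, 2 = no more than once a month, 3 = more than once a month) are assumed from the faraway documentation, so check them against ?potuse before relying on the labels.

```r
library(MASS)  # provides polr(); a recommended package bundled with R

# Collapse a potuse-style table (columns: sex, year.76 ... year.80, count)
# into one row per sex x year x usage level, summing the counts.
potuse_long <- function(d) {
  year_cols <- grep("^year\\.", names(d), value = TRUE)
  pieces <- lapply(year_cols, function(col) {
    agg <- aggregate(d$count,
                     by = list(sex = d$sex, usage = d[[col]]),
                     FUN = sum)
    names(agg)[3] <- "count"
    agg$year <- as.numeric(sub("^year\\.", "", col))
    agg
  })
  long <- do.call(rbind, pieces)
  # Assumed codings: sex 1 = male, 2 = female; usage ordered 1 < 2 < 3
  long$sex <- factor(long$sex, levels = c(1, 2), labels = c("male", "female"))
  long$usage <- factor(long$usage, levels = 1:3, ordered = TRUE)
  long[long$count > 0, ]
}

# Proportional odds model: linear trend in year plus a sex effect,
# with each aggregated row weighted by its count.
fit_usage_trend <- function(long) {
  long$year_c <- long$year - 76  # years since 1976
  polr(usage ~ sex + year_c, data = long, weights = count, Hess = TRUE)
}

# Run on the real data only if faraway is installed
if (requireNamespace("faraway", quietly = TRUE)) {
  data(potuse, package = "faraway")
  long <- potuse_long(potuse)
  # Observed proportion in each usage level, by sex and year
  tab <- prop.table(xtabs(count ~ sex + year + usage, data = long), c(1, 2))
  print(ftable(round(tab, 3)))
  print(summary(fit_usage_trend(long)))
}
```

When reading the summary, note that polr parameterizes the model as logit P(usage <= j) = zeta_j - eta, so a positive year_c coefficient indicates a shift toward higher usage levels over time, and the sexfemale coefficient compares girls with boys on the same scale. If the table of observed proportions suggests the trend is not linear, or differs by sex, factor(year) or a sex:year_c interaction would be natural variations to compare by AIC.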