R code with comments to plot the proportion of babies born with the name Angelica (all statements starting with # are comments and can be removed)
#install the package (only needed the first time)
install.packages('babynames')
#load the package
library(babynames)
#print the variable names in the dataset babynames
names(babynames)
#create a subset of the data for the name Angelica
dat<-subset(babynames,name=="Angelica")
#plot the data
#create a window for 2 plots
par(mfrow=c(2,1))
plot(dat$year,dat$prop,xlab='Year',ylab="Proportion",main="Proportion vs Year")
The plot shows 2 proportions for the name Angelica in some years. Let us print some of those years.
dat[dat$year>1970,]
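The output shows that in 1973, 1974 and 1975 the name Angelica was given to both male and female babies.
Since we want a single proportion per year for the name Angelica, we will sum these up. (As the package documentation describes it, prop is computed within each sex, so the sum is a combined measure rather than the exact share of all births.)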
R code
#sum the proportions within each year
dat1<-aggregate(prop ~ year, dat, sum)
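To see what the aggregate step does, here is a minimal sketch on a tiny data frame. The name toy and all of its values are made up for illustration and are not taken from babynames. It first lists the years that appear on more than one row, then sums prop within each year.

```r
# Made-up rows, only to illustrate; these are not real babynames values
toy <- data.frame(
  year = c(1973, 1973, 1974),
  sex  = c("F", "M", "F"),
  prop = c(0.0004, 0.0001, 0.0005)
)

# Years that occur on more than one row (here, one row per sex)
dup_years <- unique(toy$year[duplicated(toy$year)])
print(dup_years)

# One row per year, with prop summed over the rows that share that year
toy_sum <- aggregate(prop ~ year, toy, sum)
print(toy_sum)
```

The same duplicated() check could be run on dat$year to list every affected year instead of scanning the printed rows by eye.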
#plot
plot(dat1$year,dat1$prop,xlab='Year',ylab="Proportion",main="Proportion vs Year")
The resulting plot shows that the name Angelica peaked around 1996.
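The peak year can also be located in code rather than read off the plot. The sketch below uses which.max() on a small data frame. The name toy1 and its values are made up for illustration and are not real babynames output.

```r
# Made-up yearly values, only to illustrate; not real babynames output
toy1 <- data.frame(
  year = c(2001, 2002, 2003, 2004),
  prop = c(0.10, 0.30, 0.20, 0.05)
)

# which.max() returns the index of the first maximum of prop
peak <- toy1[which.max(toy1$prop), ]
print(peak$year)
```

Assuming dat1 has been built as above, the same call with dat1 in place of toy1 should return the row for the peak year, which can be checked against the plot.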