**1) Why does this code fail? How can you fix it?**
```{r}
table4a %>% gather(1999,2000,key="year",value="cases")
```
Answer:
**2) Tidy the simple tibble below. Do you need to spread or
gather it? What are the variables?**
```{r}
preg <-tribble(~pregnant, ~male, ~female,
"yes", NA,10,
"no" ,20,12)
preg
```
**3) What do the extra and fill arguments do in separate()?
Experiment with the various options for the following two toy
datasets?**
```{r}
tibble(x=c("a,b,c","d,e,f,g","h,i,j"))%>%
separate(x,c("one","two","three"))
tibble(x=c("a,b,c","d,e","f,g,i"))%>%separate(x,c("one","two","three"))
```
**4) Both unite and separate have a remove argument. What does it do? Why would you set it to FALSE?**
Solve all please with explanations
Q1) Why does this code fail? How can you fix it?
```{r}
table4a %>% gather(1999,2000,key="year",value="cases")
```
Answer:---
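Because `gather` can't find the columns. `1999` and `2000` are not
syntactic names in R, so when they are written bare, `gather` reads them
as column positions (the 1999th and 2000th columns), which don't exist.
Column names that start with a number have to be quoted with backticks.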
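A minimal sketch of the fix, assuming `tidyr` is installed (it ships the `table4a` example data); the name `fixed` is only for illustration:

```{r}
library(tidyr)

# Backticks mark 1999 and 2000 as column names rather than column positions.
fixed <- table4a %>%
  gather(`1999`, `2000`, key = "year", value = "cases")
fixed
```

Each country now has one row per year, with the counts in `cases`.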
Q2) Tidy the simple tibble below. Do you need to spread
or gather it? What are the variables?
```{r}
preg <-tribble(~pregnant, ~male, ~female,
"yes", NA,10,
"no" ,20,12)
preg
```
Answer:--------
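The analysis here centers on whether someone is pregnant or not (males
cannot be pregnant, hence the `NA`), so I would `gather` the `male` and
`female` columns into a single gender column rather than spread the
`pregnant` column. The variables are `pregnant` (yes/no), `gender`
(male/female) and the count, stored in `values`.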
```{r}
# gather the two gender columns into key-value pairs
preg %>%
  gather(gender, values, -pregnant)
# the other way around: spread pregnant back out, one row per gender
preg %>%
  gather(gender, values, -pregnant) %>%
  spread(pregnant, values)
```
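As a possible refinement, the pregnant-male cell is missing by construction, so the row it produces can be dropped while gathering. This is a sketch only; the column name `count` and the object name `preg_tidy` are my own choices, not part of the exercise:

```{r}
library(tidyr)
library(tibble)

preg <- tribble(
  ~pregnant, ~male, ~female,
  "yes",     NA,    10,
  "no",      20,    12
)

# na.rm = TRUE drops the row whose value is NA (pregnant males).
preg_tidy <- preg %>%
  gather(male, female, key = "gender", value = "count", na.rm = TRUE)
preg_tidy
```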
Q3) What do the extra and fill arguments do in
separate()? Experiment with the various options for the following
two toy datasets.
```{r}
tibble(x=c("a,b,c","d,e,f,g","h,i,j"))%>%
separate(x,c("one","two","three"))
tibble(x=c("a,b,c","d,e","f,g,i"))%>%separate(x,c("one","two","three"))
```
Answer:-----
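Both arguments control what `separate()` does when the number of pieces
in a row doesn't match the number of columns we ask for. The first
dataset has a row with four pieces ("d,e,f,g") and the second has a row
with only two ("d,e"), but we specify 3 columns.
`fill` handles rows with too few pieces and has three values:
`warn`, `right` and `left`. Here
I specify a fourth column so that the three-piece rows come up short.
`warn` (the default) fills the missing values with `NA` on the right
and emits a warning. `right` does the same thing but without
a warning, and `left` puts the `NA` in the first column instead,
shifting the pieces to the right.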
```{r}
tibble(x = c("a,b,c", "d,e,f,g", "h,i,j")) %>%
separate(x, c("one", "two", "three", "four"))
tibble(x = c("a,b,c", "d,e,f,g", "h,i,j")) %>%
separate(x, c("one", "two", "three", "four"), fill = "right")
tibble(x = c("a,b,c", "d,e,f,g", "h,i,j")) %>%
separate(x, c("one", "two", "three", "four"), fill = "left")
```
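The same options can be tried on the second toy dataset, where "d,e" is one piece short of the three requested columns. A small sketch (the names `short`, `right` and `left` are only for illustration):

```{r}
library(tidyr)
library(tibble)

short <- tibble(x = c("a,b,c", "d,e", "f,g,i"))

# Pad on the right: "d,e" becomes one = "d", two = "e", three = NA.
right <- short %>% separate(x, c("one", "two", "three"), fill = "right")
right

# Pad on the left: "d,e" becomes one = NA, two = "d", three = "e".
left <- short %>% separate(x, c("one", "two", "three"), fill = "left")
left
```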
I've deleted the fourth column to see how this works.
`extra`, on the other hand, handles rows with too many pieces by either
dropping or merging them. `warn` (the default)
drops the extra piece and emits a warning message.
`drop` does the same thing but without a warning,
and `merge` splits only as many times as there are columns, so the
leftover piece stays in the last column ("f,g"). There is no apparent
option to `merge` into the first column rather than the last.
```{r}
tibble(x = c("a,b,c", "d,e,f,g", "h,i,j")) %>%
separate(x, c("one", "two", "three"), extra = "warn")
tibble(x = c("a,b,c", "d,e,f,g", "h,i,j")) %>%
separate(x, c("one", "two", "three"), extra = "drop")
tibble(x = c("a,b,c", "d,e,f,g", "h,i,j")) %>%
separate(x, c("one", "two", "three"), extra = "merge")
```
Q4) Both unite and separate have a remove argument. What
does it do? Why would you set it to FALSE?
Answer:-------
Because `unite` and `separate` receive columns and create new ones,
`remove` controls whether the original columns that you
unite/separate on are dropped from the result. The default is `TRUE`.
You might want to set it to `FALSE` and keep them if you're checking
whether the transformation was done correctly.
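A short sketch of that check, assuming `tidyr` and `tibble` are installed; the names `df`, `kept`, `rejoined` and `y` are hypothetical:

```{r}
library(tidyr)
library(tibble)

df <- tibble(x = c("a,b,c", "h,i,j"))

# remove = FALSE keeps x alongside the columns derived from it.
kept <- df %>%
  separate(x, c("one", "two", "three"), remove = FALSE)
kept

# unite() likewise keeps its input columns when remove = FALSE,
# so y can be compared against the original x.
rejoined <- kept %>%
  unite(y, one, two, three, sep = ",", remove = FALSE)
rejoined
```

With the originals still in place, comparing `rejoined$y` with `rejoined$x` shows whether the round trip preserved the data.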