R programming language.
The seafood data record bacterial growth on oysters and mussels.
The bacterial counts are measured on a log scale at three
different times.
Convert the data so that one variable holds the bacterial counts
and another variable holds the time points.
Seafood | time0 | time1 | time2 |
Oysters | 1.83 | 3.68 | 5.23 |
Oysters | 1.11 | 3.25 | 4.98 |
Oysters | 2.01 | 3.98 | 5.02 |
Oysters | 7.19 | 7.42 | 9.32 |
Oysters | 6.89 | 7.34 | 8.92 |
Oysters | 7.01 | 8.11 | 9.99 |
Oysters | 6.47 | 9.44 | 9.78 |
Oysters | 5.98 | 8.77 | 9.88 |
Oysters | 5.45 | 8.12 | 9.35 |
Mussels | 0.36 | 1.70 | 4.57 |
Mussels | 0.76 | 1.65 | 4.52 |
Mussels | 0.56 | 1.55 | 4.23 |
Mussels | 5.01 | 6.38 | 7.95 |
Mussels | 5.00 | 6.87 | 7.99 |
Mussels | 5.03 | 6.42 | 8.01 |
Mussels | 5.05 | 9.78 | 11.04 |
Mussels | 5.89 | 9.99 | 11.35 |
Mussels | 5.97 | 9.86 | 11.22 |
Hey,
To answer this one I used the tidyr package in my program.
tidyr has a function named gather(), which reshapes data frames
from wide format to long format.
A brief description:
Suppose the data are unstacked, meaning a common attribute of
interest is spread out across several columns. To reformat the data
so that this attribute is gathered into a single variable, gather()
takes multiple columns and collapses them into key-value pairs,
duplicating all other columns as needed.
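To make the key-value idea concrete, here is a minimal sketch on a made-up toy data frame. The names wide, long, id, x, y, variable and value are illustrative assumptions, not part of the seafood data.

```r
library(tidyr)

# Hypothetical toy data: one identifier column and two measurement columns
wide <- data.frame(id = c("a", "b"), x = c(1, 2), y = c(3, 4))

# Collapse x and y into key-value pairs: the old column name goes into
# "variable", the cell content into "value", and id is repeated as needed
long <- gather(wide, key = "variable", value = "value", x, y)
print(long)
```

The two measurement columns become four rows: both x values first, then both y values, each paired with its id.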
HERE COMES THE WORKING CODE FOR YOU (with comments to make it
easier to understand):
library(tidyr)  # load the tidyr package

# Define the original (wide-format) data frame
data <- data.frame(
  Seafood = rep(c("Oysters", "Mussels"), each = 9),
  time0 = c(1.83, 1.11, 2.01, 7.19, 6.89, 7.01, 6.47, 5.98, 5.45,
            0.36, 0.76, 0.56, 5.01, 5.00, 5.03, 5.05, 5.89, 5.97),
  time1 = c(3.68, 3.25, 3.98, 7.42, 7.34, 8.11, 9.44, 8.77, 8.12,
            1.70, 1.65, 1.55, 6.38, 6.87, 6.42, 9.78, 9.99, 9.86),
  time2 = c(5.23, 4.98, 5.02, 9.32, 8.92, 9.99, 9.78, 9.88, 9.35,
            4.57, 4.52, 4.23, 7.95, 7.99, 8.01, 11.04, 11.35, 11.22)
)

# The first column should be character (older R versions create a factor)
data$Seafood <- as.character(data$Seafood)
# The time columns (time0, time1, time2) are numeric; these calls make that explicit
data$time0 <- as.numeric(data$time0)
data$time1 <- as.numeric(data$time1)
data$time2 <- as.numeric(data$time2)

cat("The original data:\n")
print(data)  # print the original data frame

# Use gather() to collapse time0:time2 into a key column (TimeStamp) and a
# value column (BacteriaCount), storing the reshaped data in mod_DF
mod_DF <- gather(data, TimeStamp, BacteriaCount, time0:time2)

cat("The reshaped data with bacteria counts and time stamps:\n")
print(mod_DF)  # print the reshaped data frame
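As a side note, the tidyr documentation marks gather() as superseded and recommends pivot_longer() for new code. The sketch below is an alternative, not the approach used above. It uses only the first oyster row and the first mussel row of the table to stay short, and the extra Time column is an illustrative assumption.

```r
library(tidyr)

# First oyster row and first mussel row from the table in the question
data <- data.frame(
  Seafood = c("Oysters", "Mussels"),
  time0 = c(1.83, 0.36),
  time1 = c(3.68, 1.70),
  time2 = c(5.23, 4.57)
)

# Same reshape as gather(data, TimeStamp, BacteriaCount, time0:time2),
# except that rows come out grouped by original row instead of by column
long <- pivot_longer(data, cols = time0:time2,
                     names_to = "TimeStamp", values_to = "BacteriaCount")

# Optional: turn the labels "time0", "time1", "time2" into 0, 1, 2
# (the column name Time is an assumption for illustration)
long$Time <- as.integer(sub("time", "", long$TimeStamp))
print(long)
```

pivot_longer() returns a tibble, so the printed output looks slightly different from a plain data frame, but the three columns hold the same information.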
OUTPUT WINDOW:
The resulting data frame is long (54 rows), so the output was
pasted as snapshots in 3 parts.
[Snapshots 1-3: printed output of mod_DF]
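If you would rather check the result without scrolling through all 54 rows, a few summary calls are enough. This is a small sketch that rebuilds the data from the table in the question and then inspects the shape of mod_DF.

```r
library(tidyr)

# Rebuild the wide data from the table in the question
data <- data.frame(
  Seafood = rep(c("Oysters", "Mussels"), each = 9),
  time0 = c(1.83, 1.11, 2.01, 7.19, 6.89, 7.01, 6.47, 5.98, 5.45,
            0.36, 0.76, 0.56, 5.01, 5.00, 5.03, 5.05, 5.89, 5.97),
  time1 = c(3.68, 3.25, 3.98, 7.42, 7.34, 8.11, 9.44, 8.77, 8.12,
            1.70, 1.65, 1.55, 6.38, 6.87, 6.42, 9.78, 9.99, 9.86),
  time2 = c(5.23, 4.98, 5.02, 9.32, 8.92, 9.99, 9.78, 9.88, 9.35,
            4.57, 4.52, 4.23, 7.95, 7.99, 8.01, 11.04, 11.35, 11.22)
)
mod_DF <- gather(data, TimeStamp, BacteriaCount, time0:time2)

print(dim(mod_DF))              # number of rows and columns
print(table(mod_DF$TimeStamp))  # how many rows each time point has
print(head(mod_DF, 3))          # first three rows of the long data
```

With 18 rows in the original table and 3 time columns, this should report 54 rows and 3 columns, with 18 rows for each of time0, time1 and time2.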
Hope this is helpful. Have a great day! :)