Question

Built-in Data in R:

This question uses the "cystfibr" data found in the "ISwR" package. To access the data, first install the "ISwR" package, then load the library. Type data() to check which built-in data sets are in the package "ISwR"; this should list all the available data sets. We use the cystfibr data for this part. Type cystfibr to see the data, and then answer the following questions using the data:

(a) Type ?cystfibr. This opens a help file describing the 'cystfibr' data. What is the cystfibr data about?
(b) How many males and how many females are in the study?
(c) Construct a bar diagram of the males and females. Change the color to red. [hint: barplot(table(sex))]
(d) Calculate the average height of the participants.
(e) Calculate the variance of the weight of the participants.
(f) Calculate the standard deviation of the weight of the participants.
(g) Construct a histogram of the weight of the participants.
(h) Construct a scatter plot of height and weight. [hint: plot(height, weight)]

Solutions

Expert Solution

R-commands and outputs:

# Install the package (only needed once)
# install.packages("ISwR")
# Load package ISwR
library(ISwR)

(a)
?cystfibr
# Description: The cystfibr data frame has 25 rows and 10 columns. It contains lung function data for cystic fibrosis patients (7–23 years old).

(b)
d <- cystfibr
head(d)
# sex is a numeric code: 0 = male, 1 = female.

sum(d$sex) # This gives number of females
[1] 11
# Subtracting from the 25 participants gives the number of males: 25 - 11 = 14
# Number of males=14
# Number of females=11
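
The subtraction step can be avoided by tabulating the sex codes directly. The following is a minimal sketch, assuming the 25 sex codes printed in part (c) of this solution; `sex_f` and `counts` are names introduced here for illustration only.

```r
# Sex codes as printed in part (c): 0 = male, 1 = female
sex <- c(0, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0, 1, 0,
         1, 1, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0)

# Attach readable labels to the numeric codes (illustrative name: sex_f)
sex_f <- factor(sex, levels = c(0, 1), labels = c("Male", "Female"))

# Count both groups in one step
counts <- table(sex_f)
print(counts)
```

Labelling the factor first also means any later plot of `counts` shows "Male" and "Female" instead of 0 and 1.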

(c)
# Bar diagram of male and female.
sex=d$sex
sex
[1] 0 1 0 1 0 0 1 1 0 1 0 1 0 1 1 1 0 1 0 1 0 0 0 0 0

barplot(table(sex))
# names.arg labels the individual bars (xlab accepts only a single axis title)
barplot(table(sex), names.arg=c("Male","Female"), col="red")

(d)
# Average height of all the participants
height=d$height
mean(height)
[1] 152.8

(e)
# the variance of the weight of the participants
weight=d$weight
var(weight)
[1] 320.3429

(f)
# the Standard Deviation of the weight of the participants
sd(weight)
[1] 17.89813
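
The two results in (e) and (f) are linked: the standard deviation is the square root of the sample variance, and both var() and sd() use the n - 1 denominator. The sketch below checks this, first on a small hypothetical vector `x` (not taken from cystfibr) and then on the variance value reported in (e).

```r
# Hypothetical data, used only to illustrate the relationship
x <- c(2, 4, 4, 4, 5, 5, 7, 9)
n <- length(x)

# Sample variance by hand: squared deviations divided by n - 1
var_by_hand <- sum((x - mean(x))^2) / (n - 1)

# The built-in functions agree with the hand computation
print(all.equal(var(x), var_by_hand))
print(all.equal(sd(x), sqrt(var(x))))

# Square root of the variance reported in (e)
sd_from_var <- sqrt(320.3429)
print(sd_from_var)
```

The last value should agree with the output of sd(weight) in (f) up to rounding.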

(g)
# a histogram of weight of the participants.
hist(weight)
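
By default hist() chooses the bin boundaries itself. If specific bins are wanted, they can be passed through the breaks argument. This is a sketch on a hypothetical weight vector `w` (not the cystfibr values); with plot = FALSE the function returns the bin counts without drawing anything.

```r
# Hypothetical weights in kg, for illustration only
w <- c(13, 17, 22, 28, 31, 35, 36, 42, 45, 51, 60, 72)

# 10 kg wide bins from 10 to 80; intervals are right-closed by default
h <- hist(w, breaks = seq(10, 80, by = 10), plot = FALSE)

print(h$breaks)
print(h$counts)
```

Dropping plot = FALSE and adding main, xlab and col arguments draws the same histogram with a title, an axis label and a fill color.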

(h)
# a scatter plot of height and weight
plot(height, weight, main="Scatter plot of height vs weight")
