During class we used R for our practical data mining applications.
(a) We often used objects of class data.frame as data structures. Explain the key features of data.frame objects. What are some of their strengths and weaknesses?
(b) Which elements (i.e., rows and columns) of the mtcars data.frame does the following command return?
mtcars[mtcars$mpg < mean(mtcars$mpg), c("mpg", "hp")]
(c) What is an R package?
(e) What command would you use to fit a logistic regression model?
(f) Write the first six lines of a hypothetical external data file named data.csv that will be read without error by the following R command:
mydata <- read.table("data.csv", skip=3, header=FALSE, sep=",",
colClasses=c("character", rep("numeric", 3)))
Lines of data.csv:
1 :
2 :
3 :
4 :
5 :
6 :
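One way to check a candidate answer is to write six lines to a temporary file and read them back with the same arguments. The sketch below is only an illustration: the file contents are made up, and the temporary path stands in for data.csv. Since skip=3 discards the first three lines whatever they contain, only lines 4 to 6 need to hold one character field followed by three numeric fields, separated by commas.

```r
# Hypothetical file contents; every value here is made up for illustration.
lines <- c(
  "Example data file",         # line 1: skipped
  "created for illustration",  # line 2: skipped
  "id,x,y,z",                  # line 3: skipped (header=FALSE, so not used as names)
  "a,1.0,2.0,3.0",             # line 4: first data row
  "b,4.5,5.5,6.5",             # line 5
  "c,7,8,9"                    # line 6
)

# Write to a temporary location instead of the working directory.
path <- file.path(tempdir(), "data.csv")
writeLines(lines, path)

# Same arguments as in the question, with the temporary path substituted.
mydata <- read.table(path, skip = 3, header = FALSE, sep = ",",
                     colClasses = c("character", rep("numeric", 3)))
str(mydata)
```

With header=FALSE, read.table assigns the default column names V1 to V4.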
Ans (a):- A data frame is a table or a two-dimensional array-like structure in which each column contains values of one variable and each row contains one set of values from each column.
Key features of a data frame are:-
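Some commonly cited properties can be checked directly in R: columns are named, columns may have different types, and all columns have the same number of entries. The following is a minimal sketch; the data frame df and its values are hypothetical, while mtcars is the built-in data set from the question.

```r
# Hypothetical example data; names and values are made up for illustration.
df <- data.frame(
  name  = c("a", "b", "c"),
  score = c(1.5, 2.0, 3.5),
  pass  = c(FALSE, TRUE, TRUE),
  stringsAsFactors = FALSE
)

str(df)             # one line per column, showing each column's type
dim(df)             # number of rows and number of columns
sapply(df, class)   # columns can have different classes

# Logical row selection combined with column selection by name,
# using the built-in mtcars data set.
below_avg <- mtcars[mtcars$mpg < mean(mtcars$mpg), c("mpg", "hp")]
head(below_avg)
```

The last command keeps only the rows whose mpg is below the mean mpg, and only the mpg and hp columns of those rows.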
Ans (c):- R packages are collections of functions and data sets developed by the community. They increase the power of R by improving existing base R functionality or by adding new functionality. For example, if you usually work with data frames, you have probably heard of packages such as dplyr.
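As a rough sketch of how packages are used in practice: a package is attached with library(), or a single function is called with the package:: prefix. The example uses the stats package, which ships with R, so it runs without installing anything; the check for dplyr is only an illustration and makes no assumption that it is installed.

```r
# Packages shipped with R (here: stats) can be attached with library().
library(stats)

# A function can also be called with an explicit package prefix,
# without attaching the package first.
m <- stats::median(c(3, 1, 2))

# Check whether a contributed package is installed before using it.
# install.packages("dplyr") would download it from CRAN (needs network access).
has_dplyr <- requireNamespace("dplyr", quietly = TRUE)
```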
Ans (e):- Use glm() with the argument family = binomial. This way, you tell glm() to fit a logistic regression model instead of one of the many other models that glm() can fit. Calling summary() on the fitted model returns the estimates, standard errors, z-values, and p-values for each of the coefficients.
Just as ordinary least squares regression is the method used to estimate the coefficients of the best-fit line in linear regression, logistic regression uses maximum likelihood estimation (MLE) to obtain the model coefficients that relate the predictors to the target.
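A minimal sketch of such a fit, using the built-in mtcars data set. The choice of am (transmission type, coded 0/1) as the binary outcome and wt (weight) as the predictor is an assumption made purely for illustration, not part of the original question.

```r
# Binary outcome: am (0 = automatic, 1 = manual); predictor: wt (weight).
# family = binomial requests logistic regression (logit link by default).
fit <- glm(am ~ wt, data = mtcars, family = binomial)

# Coefficient table: estimate, standard error, z value, p-value.
coefs <- summary(fit)$coefficients
print(coefs)

# Predicted probability of am = 1 for a hypothetical car with wt = 3.
p <- predict(fit, newdata = data.frame(wt = 3), type = "response")
```

Setting type = "response" returns the prediction on the probability scale rather than on the log-odds scale.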