Question

How do you load a simple data series into R? For instance, the data values given below:

Below are the batting averages of 20 batting champions of the National League:

0.403 0.378 0.320 0.341 0.362 0.334 0.379 0.424 0.326 0.330 0.345 0.354 0.350 0.330 0.376 0.363 0.353 0.351 0.335 0.371

1. Construct a relative frequency histogram for the data.

2. What can you say about the shape of the histogram (modes, symmetry, outliers)?

Solutions

Expert Solution

How to load the series data in R?

Before we start, we first need data. The data can be saved in Excel, SPSS, or some other file type. Once you have saved your data in a file, you can add to, change, or edit the data whenever you want.

Before you load the data, check that the file satisfies the following conventions:

  1. If you work with spreadsheets, the first row is usually reserved for the header, while the first column is used to identify the sampling unit;
  2. Avoid names, values, or fields with blank spaces; otherwise each word will be interpreted as a separate variable, resulting in errors related to the number of elements per line in your data set;
  3. If you want to concatenate words, insert a . between the two words instead of a space;
  4. Short names are preferred over longer names;
  5. Try to avoid names that contain symbols such as ?, $, %, ^, &, *, (, ), -, #, the comma, <, >, /, |, \, [, ], {, and };
  6. Delete any comments that you have made in your Excel file to avoid extra columns or NAs being added to your file; and
  7. Make sure that any missing values in your data set are indicated with NA.
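As a minimal sketch of how a file that follows these conventions might be read, the example below writes a small CSV to a temporary file and loads it with `read.csv`. The file, the column names `player` and `batting.avg`, the labels `p1` to `p4`, and the NA entry are all made up for illustration; only the three numeric values come from the question. Excel or SPSS files would typically be exported to CSV first or read with an add-on package, which is not shown here.

```r
# Hypothetical file: created here only so the sketch is self-contained.
path <- tempfile(fileext = ".csv")
writeLines(c(
  "player,batting.avg",   # header row; words joined with "." instead of a space
  "p1,0.403",
  "p2,0.378",
  "p3,NA",                # made-up missing value, marked with NA
  "p4,0.341"
), path)

# header = TRUE treats the first row as column names;
# na.strings = "NA" tells R which text marks a missing value.
dat <- read.csv(path, header = TRUE, na.strings = "NA")

str(dat)                          # inspect column names and types
mean(dat$batting.avg, na.rm = TRUE)  # NA entries must be skipped explicitly
```

If a column name did contain spaces or symbols, `read.csv` would by default rewrite it into a syntactically valid name, which is one reason to keep names short and simple in the file itself.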

1.  Construct a relative frequency histogram for the data.

The given series is: 0.403 0.378 0.320 0.341 0.362 0.334 0.379 0.424 0.326 0.330 0.345 0.354 0.350 0.330 0.376 0.363 0.353 0.351 0.335 0.371

The sorted data is: 0.320 0.326 0.330 0.330 0.334 0.335 0.341 0.345 0.350 0.351 0.353 0.354 0.362 0.363 0.371 0.376 0.378 0.379 0.403 0.424
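For a series this short, a file is not strictly needed. One possible way to load it, sketched here, is to paste the values into `scan(text = ...)`, which splits on whitespace; typing them into `c(...)` would work equally well. The variable names `avg` and `avg_sorted` are illustrative choices.

```r
# Paste the whitespace-separated values straight from the question.
avg <- scan(text = "0.403 0.378 0.320 0.341 0.362 0.334 0.379 0.424 0.326 0.330
                    0.345 0.354 0.350 0.330 0.376 0.363 0.353 0.351 0.335 0.371",
            quiet = TRUE)

length(avg)              # number of observations read
avg_sorted <- sort(avg)  # ascending order, as in the sorted listing
avg_sorted
range(avg)               # smallest and largest value, useful for choosing classes
```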

If we start at 0.300 and end at 0.425, we can construct 5 classes of width 0.025.

These classes include all the observations, and neither the first nor the last class is empty.

Class                     Frequency    Relative frequency
0.300 <= x < 0.325            1              0.05
0.325 <= x < 0.350            7              0.35
0.350 <= x < 0.375            7              0.35
0.375 <= x < 0.400            3              0.15
0.400 <= x < 0.425            2              0.10
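The class counts can be checked in R. The sketch below uses `cut` with `right = FALSE` so that each class is closed on the left, matching the `<=` and `<` in the class definitions. The break points are built from whole numbers divided by 1000, an assumption made here to avoid floating-point drift at the boundary value 0.350; the names `breaks`, `classes`, and `freq_table` are illustrative.

```r
avg <- c(0.403, 0.378, 0.320, 0.341, 0.362, 0.334, 0.379, 0.424, 0.326, 0.330,
         0.345, 0.354, 0.350, 0.330, 0.376, 0.363, 0.353, 0.351, 0.335, 0.371)

# Class limits 0.300, 0.325, ..., 0.425, built from integers so that
# 350 / 1000 is the same double as the literal 0.350.
breaks <- seq(300, 425, by = 25) / 1000

# right = FALSE gives classes of the form lower <= x < upper.
classes <- cut(avg, breaks = breaks, right = FALSE)

freq <- table(classes)          # frequency per class
rel_freq <- freq / length(avg)  # relative frequency per class

freq_table <- data.frame(
  class = names(freq),
  frequency = as.vector(freq),
  relative_frequency = as.vector(rel_freq)
)
print(freq_table)
```

The relative frequencies should sum to 1, which is a quick sanity check on any hand-built table.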

The relative frequency histogram of the data has the batting average classes on the x-axis and the relative frequency on the y-axis.
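One way to draw such a histogram in base R is sketched below. Note that `hist(..., freq = FALSE)` plots densities rather than relative frequencies, so this sketch instead computes the histogram without plotting, rescales the counts to proportions, and then plots the result. The title and axis labels are illustrative choices.

```r
avg <- c(0.403, 0.378, 0.320, 0.341, 0.362, 0.334, 0.379, 0.424, 0.326, 0.330,
         0.345, 0.354, 0.350, 0.330, 0.376, 0.363, 0.353, 0.351, 0.335, 0.371)

breaks <- seq(300, 425, by = 25) / 1000  # class limits 0.300 to 0.425

# Compute the bins only; right = FALSE makes classes closed on the left.
h <- hist(avg, breaks = breaks, right = FALSE, plot = FALSE)

# Replace the counts with proportions so the bar heights are relative frequencies.
h$counts <- h$counts / sum(h$counts)

plot(h, freq = TRUE,
     main = "Batting averages of 20 National League batting champions",
     xlab = "Batting average",
     ylab = "Relative frequency")
```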

2.  What can you say about the shape of the histogram (modes, symmetry, outliers)?

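A. The histogram is unimodal, with its peak in the two classes spanning 0.325 to 0.375. It is roughly symmetric around 0.35, with a slightly longer tail on the right. There do not appear to be any outliers.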
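These visual impressions can be cross-checked numerically. The sketch below compares the mean with the median as a rough symmetry check and applies the 1.5 × IQR fence rule for outliers. That rule is a common convention and an assumption here; the answer above judges outliers by eye and does not name a rule.

```r
avg <- c(0.403, 0.378, 0.320, 0.341, 0.362, 0.334, 0.379, 0.424, 0.326, 0.330,
         0.345, 0.354, 0.350, 0.330, 0.376, 0.363, 0.353, 0.351, 0.335, 0.371)

# Symmetry: a mean noticeably above the median suggests a longer right tail.
centre <- c(mean = mean(avg), median = median(avg))
print(centre)

# Outliers under the 1.5 * IQR convention (quantile() default method).
q <- unname(quantile(avg, c(0.25, 0.75)))
iqr <- q[2] - q[1]
lower_fence <- q[1] - 1.5 * iqr
upper_fence <- q[2] + 1.5 * iqr

outliers <- avg[avg < lower_fence | avg > upper_fence]
print(c(lower = lower_fence, upper = upper_fence))
print(outliers)  # empty if no value falls outside the fences
```

The largest value, 0.424, sits close to the upper fence, so a different quartile method or fence multiplier could change the verdict for that observation.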

