Question

Using R and a for loop.

Assume a data set of 1000 observations and 10 predictors. How would one use a for loop to cycle through different proportions of training and test set sizes?

For example,

20% of the data goes to training and 80% to test in the first iteration, with each iteration adding another 10% to the training set. So the first set = (20% train, 80% test), the second set = (30% train, 70% test), the third set = (40% train, 60% test), and so on.

How would something like this be done in R with a for loop so that there are 7 samples?

Solutions
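
One possible approach, offered as a minimal sketch rather than a definitive answer. The question does not supply the data, so the data frame `dat`, its column names (`x1` to `x10`, `y`), the seed, and the `splits` list are all hypothetical stand-ins. The loop steps the training percentage from 20 to 80 in increments of 10, which gives the 7 splits asked for.

```r
# Simulated stand-in for the data set in the question (an assumption):
# 1000 observations, 10 predictors, plus a hypothetical response column `y`.
set.seed(42)
n <- 1000
p <- 10
dat <- as.data.frame(matrix(rnorm(n * p), nrow = n, ncol = p))
names(dat) <- paste0("x", 1:p)
dat$y <- rnorm(n)

# Training percentages: 20, 30, ..., 80 (7 splits in total).
# Whole-number percentages avoid floating-point surprises in the sequence.
train_pcts <- seq(20, 80, by = 10)

# Pre-allocate a list to hold the train/test pair from each iteration
splits <- vector("list", length(train_pcts))

for (i in seq_along(train_pcts)) {
  n_train <- n * train_pcts[i] / 100
  train_idx <- sample(n, n_train)  # row indices drawn without replacement

  splits[[i]] <- list(
    train_pct = train_pcts[i],
    train     = dat[train_idx, ],   # sampled rows
    test      = dat[-train_idx, ]   # all remaining rows
  )

  cat(sprintf("Split %d: %d%% train (%d rows), %d rows test\n",
              i, train_pcts[i], nrow(splits[[i]]$train),
              nrow(splits[[i]]$test)))
}
```

Each iteration here draws a fresh random sample, so the training sets are not nested. If "adding another 10% to the training set" is meant literally, one alternative is to shuffle the row indices once before the loop (for example with `sample(n)`) and take the first `n_train` of them in each iteration. A model could then be fitted on `splits[[i]]$train` and evaluated on `splits[[i]]$test` inside the same loop.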
