Question

Determine whether R and RapidMiner yield a similar number of records in the combined data set.

Solutions

Expert Solution

Hi,

R and RapidMiner are both data mining tools. RapidMiner requires advanced skills, and certain aspects of it are inconvenient, but it lets us keep conventional methods by integrating fully with standard ones. It does this through its flexible integration with another popular open-source data mining tool, R.

R is available as a powerful operator within the large library of existing RapidMiner Studio operators. Both R and RapidMiner work with data sets that contain many records and attributes. In the R operator, the arguments of the script correspond to the input ports, where example sets are converted to data frames. The values returned by the script are delivered at the output ports of the operator, where data frames are converted back to example sets. In RapidMiner, the Append operator builds a merged example set from two or more compatible example sets by adding all examples into a combined set. All input sets must have the same attribute signature. The merged example set is delivered as the output.
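The merge described above can be sketched in a few lines. This is only an illustration in plain Python, not RapidMiner's or R's actual API: the function name `append_example_sets` and the sample data are hypothetical, and an example set is modelled as a list of dictionaries.

```python
def append_example_sets(*example_sets):
    """Concatenate example sets that share the same attribute signature.

    Hypothetical helper: each example set is a list of dicts, and the
    "attribute signature" is taken to be the set of attribute names.
    """
    signature = None
    merged = []
    for example_set in example_sets:
        for example in example_set:
            attrs = tuple(sorted(example))
            if signature is None:
                signature = attrs  # first example fixes the signature
            elif attrs != signature:
                raise ValueError("input sets must have the same attribute signature")
            merged.append(dict(example))  # copy so the inputs stay untouched
    return merged


# Made-up sample data, for illustration only.
set_a = [{"id": 1, "score": 0.5}, {"id": 2, "score": 0.7}]
set_b = [{"id": 3, "score": 0.9}]

merged = append_example_sets(set_a, set_b)
print(len(merged))  # 3: every example from both inputs is kept
```

Under these assumptions, the number of records in the merged set is simply the sum of the records in the inputs.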

R and RapidMiner do not necessarily yield a similar number of records in the combined data set, because what reaches R depends on what is passed from RapidMiner. It is not necessary to pass every attribute: we can pass the entire data set, or only some of the attributes. Passing the entire data set is very easy, and if we want to pass only certain attributes, this can be done with "Select Attributes".
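One way to check this in practice is to count the records on both sides of the hand-off. The sketch below is plain Python rather than RapidMiner or R, and `select_attributes` and the sample data are hypothetical. It assumes that the selection step only narrows the attributes (columns); any difference in record counts would then have to come from other steps in the process, such as filtering.

```python
def select_attributes(example_set, keep):
    """Return a copy of the example set restricted to the named attributes."""
    return [{name: example[name] for name in keep} for example in example_set]


# Made-up sample data, for illustration only.
full = [
    {"id": 1, "age": 34, "city": "Oslo"},
    {"id": 2, "age": 51, "city": "Lima"},
    {"id": 3, "age": 27, "city": "Pune"},
]

subset = select_attributes(full, ["id", "age"])

# Compare the record counts before and after the selection.
print(len(full), len(subset))
# Inspect which attributes survived the selection.
print(sorted(subset[0]))
```

Comparing the two counts directly is a simple way to confirm whether both tools are working with the same number of records.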

Hope this helps.

Thank you...

