

How do you handle missing or corrupt data in a data set?



Expert Solution

  • A data set is a collection of data, often organized into database tables. Data sets are used in machine learning and artificial intelligence to train models.
  • Missing or corrupted data can occur in these data sets.
  • Missing data are values that were never recorded (empty cells). Corrupted data are values that are present but invalid or unusable, for example out of range or malformed.
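Before handling such values, they have to be found. The following is a minimal sketch of detection in plain Python. The toy rows, the column names, and the rule that a valid age lies between 0 and 120 are all assumptions made for illustration, not part of the original answer.

```python
import math

# Hypothetical toy data set: None, NaN, and "" mark missing values.
rows = [
    {"age": 34, "city": "Paris"},
    {"age": None, "city": "Lyon"},
    {"age": -5, "city": ""},            # negative age: treated here as corrupted
    {"age": float("nan"), "city": "Paris"},
]


def is_missing(value):
    """Return True for None, blank strings, and NaN."""
    if value is None:
        return True
    if isinstance(value, str) and value.strip() == "":
        return True
    if isinstance(value, float) and math.isnan(value):
        return True
    return False


def is_corrupted_age(value):
    """Assumed validity rule: an age must be a number between 0 and 120."""
    if is_missing(value):
        return False  # missing values are counted separately
    return not (isinstance(value, (int, float)) and 0 <= value <= 120)


# Count missing values per column and list the rows with an invalid age.
missing_counts = {
    col: sum(is_missing(row[col]) for row in rows) for col in ("age", "city")
}
corrupted_rows = [i for i, row in enumerate(rows) if is_corrupted_age(row["age"])]

print(missing_counts)
print(corrupted_rows)
```

A common follow-up is to convert corrupted values to missing ones, so that a single handling strategy can be applied to both.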

There are several ways to handle missing or corrupted data:

  • Delete rows or columns with missing values: This is the simplest approach. Rows or columns that contain empty cells are removed, so no missing values remain. It is appropriate when only a small share of the data is affected, because deletion also discards the valid values in those rows or columns.
  • Replace with estimated values (continuous columns): Empty cells are filled with a value estimated from the other rows, such as the column mean or median, so that no cells are left empty.
  • Impute categorical columns: Missing values are replaced with the most frequent category in the column, or assigned to a new category that marks them as missing.
  • Predict missing values: An algorithm, such as a regression model, is trained on the rows where the value is present and is then used to predict the value where it is missing.
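The deletion and imputation approaches above can be sketched in plain Python as follows. The toy rows and the helper names (`drop_incomplete`, `impute_mean`, `impute_mode`) are hypothetical and chosen only for illustration. Using the mean for the numeric column is one possible choice, and the median would work the same way.

```python
from collections import Counter
from statistics import mean

# Hypothetical toy data set: None marks a missing value.
rows = [
    {"age": 30, "city": "Paris"},
    {"age": None, "city": "Lyon"},
    {"age": 40, "city": None},
    {"age": 50, "city": "Paris"},
]


def drop_incomplete(rows):
    """Deletion: keep only rows in which every value is present."""
    return [r for r in rows if all(v is not None for v in r.values())]


def impute_mean(rows, col):
    """Continuous column: fill missing values with the mean of the known ones."""
    fill = mean(r[col] for r in rows if r[col] is not None)
    return [{**r, col: fill if r[col] is None else r[col]} for r in rows]


def impute_mode(rows, col):
    """Categorical column: fill missing values with the most frequent category."""
    counts = Counter(r[col] for r in rows if r[col] is not None)
    fill = counts.most_common(1)[0][0]
    return [{**r, col: fill if r[col] is None else r[col]} for r in rows]


complete = drop_incomplete(rows)                        # 2 of 4 rows survive
filled = impute_mode(impute_mean(rows, "age"), "city")  # all 4 rows kept

print(complete)
print(filled)
```

Deletion keeps only two of the four rows here, while imputation keeps all four. This shows the trade-off: deletion is simple but loses data, and imputation keeps data but introduces estimated values.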

These are the main ways to handle missing or corrupted data.
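The prediction approach can be illustrated with a small sketch as well. It assumes a single numeric predictor and a simple least-squares line; the column names `years` and `salary` and the data are made up for the example. Real data sets would typically call for a more capable model and a check of how well it fits.

```python
# Hypothetical toy data set: "salary" is missing in one row.
rows = [
    {"years": 1, "salary": 40},
    {"years": 2, "salary": 50},
    {"years": 3, "salary": None},
    {"years": 4, "salary": 70},
]


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Train only on the rows where the target value is present.
known = [r for r in rows if r["salary"] is not None]
slope, intercept = fit_line(
    [r["years"] for r in known], [r["salary"] for r in known]
)

# Fill each missing target with the model's prediction.
predicted = [
    {**r, "salary": slope * r["years"] + intercept} if r["salary"] is None else r
    for r in rows
]

print(predicted[2]["salary"])
```

The known rows in this example lie exactly on a line, so the filled value falls on that same line. With noisy data the prediction is only an estimate.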
