Question

Drop any missing values using drop_na(). Then select all variables except name, mfr, type, weight, shelf, cups, and rating to create a subset of features that we will use to cluster the different cereals.

I just need to know how to code the drop_na() step and how to create the subset of features (the remaining variables).

Additional information you may need:

All the variables in the data:

name, mfr, type, calories, protein, fat, sodium, fiber, carbo, sugars, potass, vitamins, shelf, weight, cups, rating

Solutions

Expert Solution

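Assume the data set is a list of rows, where each row is a map whose keys are the variable names and a missing value is stored as null. drop_na() itself is an R function from the tidyr package; the Java method below reproduces the same two steps. It first skips every row that contains a missing value, then returns a map from each required feature (variable) to the list of its values, leaving out name, mfr, type, weight, shelf, cups, and rating. The class name CerealFeatures is only a wrapper chosen here so that the method compiles.

import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CerealFeatures {
    // Variables to leave out of the clustering subset.
    static final List<String> NOT_REQUIRED =
            Arrays.asList("name", "mfr", "type", "weight", "shelf", "cups", "rating");

    // Equivalent of drop_na() followed by a select: skips every row that has a
    // missing (null) value, then collects the remaining values column by column.
    static Map<String, List<Object>> drop_na(List<Map<String, Object>> dataSet) {
        Map<String, List<Object>> finalData = new LinkedHashMap<>();
        for (Map<String, Object> data : dataSet) {
            // drop_na(): a row with any missing value is dropped entirely.
            if (data.containsValue(null)) {
                continue;
            }
            for (Map.Entry<String, Object> entry : data.entrySet()) {
                // select: keep only the variables that are not excluded.
                if (!NOT_REQUIRED.contains(entry.getKey())) {
                    finalData.computeIfAbsent(entry.getKey(), k -> new ArrayList<>())
                             .add(entry.getValue());
                }
            }
        }
        return finalData;
    }
}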
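
For reference, here is a minimal, self-contained sketch of the two steps the question asks for, kept as separate methods so each can be checked on its own. It rests on several assumptions that are not in the question: each row is a map from variable name to value, a missing value is stored as null, and the class name CerealSubset, the helper row(), and the three sample rows are made up purely for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class CerealSubset {
    // Columns the question asks us to leave out.
    static final Set<String> EXCLUDED = new HashSet<>(Arrays.asList(
            "name", "mfr", "type", "weight", "shelf", "cups", "rating"));

    // Step 1 (like drop_na()): keep only rows in which no value is null.
    static List<Map<String, Object>> dropNa(List<Map<String, Object>> rows) {
        return rows.stream()
                .filter(row -> !row.containsValue(null))
                .collect(Collectors.toList());
    }

    // Step 2 (like selecting everything except some columns):
    // copy each row and remove the excluded columns from the copy.
    static List<Map<String, Object>> selectFeatures(List<Map<String, Object>> rows) {
        List<Map<String, Object>> result = new ArrayList<>();
        for (Map<String, Object> row : rows) {
            Map<String, Object> kept = new LinkedHashMap<>(row);
            kept.keySet().removeAll(EXCLUDED);
            result.add(kept);
        }
        return result;
    }

    // Hypothetical helper that builds a tiny sample row with four of the variables.
    static Map<String, Object> row(String name, String mfr, Integer calories, Integer sugars) {
        Map<String, Object> r = new LinkedHashMap<>();
        r.put("name", name);
        r.put("mfr", mfr);
        r.put("calories", calories);
        r.put("sugars", sugars);
        return r;
    }

    public static void main(String[] args) {
        // Made-up sample data: rows "B" and "C" each have one missing value.
        List<Map<String, Object>> cereals = Arrays.asList(
                row("A", "X", 70, 6),
                row("B", "Y", null, 8),
                row("C", "Z", 110, null));
        // prints [{calories=70, sugars=6}]
        System.out.println(selectFeatures(dropNa(cereals)));
    }
}
```

Dropping rows before selecting columns matches the order in the question, so a row is also discarded when its only missing value sits in a column that is later excluded. Note that containsValue(null) only detects keys that are present with a null value; a row that simply lacks a key would not be treated as missing in this sketch.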
