Question

What are the criteria for discarding a data point that you feel may be an untrustworthy outlier?

Solutions

Expert Solution

Here's the answer to the question. Please let me know if you have any questions.

Criteria for discarding a data point that you consider an untrustworthy outlier:

1. Outliers with considerable leverage can indicate a problem with the measurement, the data recording, or the communication of the data. In *such* cases, removing these values is recommended.

2. Reasons external to the data should also be considered when removing an outlier. For example, a recorded weight of 1000 kg for a person is impossible, so this outlier is untrustworthy. Another example is a human height of 450 cm, which is again an impossible value for a person's height. In both cases we would remove these outliers from the data set.

3. Another analytical method is to discard values that fall outside a fixed range. The range is defined as Q1 - 1.5*IQR to Q3 + 1.5*IQR, where the interquartile range is IQR = Q3 - Q1.

Alternatively, an outlier can be defined as any value in the upper 2.5% or the lower 2.5% of the data.
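The two range-based rules in point 3 can be sketched in Python as follows. This is only a minimal illustration, not a definitive implementation: the function names (`iqr_fences`, `flag_outliers`, `percentile_cutoffs`) and the sample heights are assumptions made for this example, and the quartiles use the "inclusive" method of the standard library's `statistics.quantiles`, which is one of several common conventions and may give slightly different fences than other software.

```python
import statistics


def iqr_fences(values, k=1.5):
    """Return (lower, upper) = (Q1 - k*IQR, Q3 + k*IQR)."""
    # n=4 yields the three quartile cut points Q1, Q2 (median), Q3.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def flag_outliers(values, k=1.5):
    """Return the values that fall outside the IQR fences."""
    lower, upper = iqr_fences(values, k)
    return [v for v in values if v < lower or v > upper]


def percentile_cutoffs(values):
    """Return the (2.5th, 97.5th) percentile cut points."""
    # n=40 yields 39 cut points spaced 2.5% apart; the first is the
    # 2.5th percentile and the last is the 97.5th percentile.
    cuts = statistics.quantiles(values, n=40, method="inclusive")
    return cuts[0], cuts[-1]


# Hypothetical heights in cm, including the impossible 450 cm value.
heights = [150, 160, 165, 170, 172, 175, 180, 185, 450]
print(iqr_fences(heights))     # (142.5, 202.5)
print(flag_outliers(heights))  # [450]
```

Note that the percentile rule always marks about 5% of the values as outliers, even in perfectly clean data, so it is probably best treated as a way to shortlist candidates for inspection rather than as proof that a value is untrustworthy.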

