What are the criteria for discarding a data point that you feel may be an untrustworthy outlier?
Here's the answer to the question. Please let me know if you have any questions.
Criteria for discarding a data point that you may consider an untrustworthy outlier:
1. Outliers with considerable leverage can indicate a problem with the measurement, the data recording, or the communication of the data. In *such* cases, removing these values is recommended.
2. Reasons external to the data should also be considered when removing an outlier. For example, a person's weight recorded as 1000 kg is impossible, so this outlier is untrustworthy. Another example is a human height of 450 cm, which is again an impossible value for a person's height. In both cases we would remove these outliers from our data set.
3. Another analytical method is to discard values that fall outside a range. The range is defined by the formula Q1 - 1.5*IQR to Q3 + 1.5*IQR, where Q1 and Q3 are the first and third quartiles and the interquartile range is IQR = Q3 - Q1.
Alternatively, an outlier can be defined as any value in the upper 2.5% or lower 2.5% of the values.
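
To make the two analytical rules concrete, here is a minimal Python sketch using only the standard library. The function names and the sample heights are made up for illustration, and the "inclusive" quartile method is just one convention I picked; other conventions give slightly different cut-offs.

```python
import statistics


def iqr_fences(values):
    """Return (lower, upper) = (Q1 - 1.5*IQR, Q3 + 1.5*IQR)."""
    # Assumption: "inclusive" quartiles; other conventions shift the fences a bit.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def outside_iqr_fences(values):
    """Values that fall outside the IQR range."""
    lower, upper = iqr_fences(values)
    return [v for v in values if v < lower or v > upper]


def percentile_tails(values):
    """Values below the 2.5th or above the 97.5th percentile."""
    # n=40 gives cut points every 2.5%, so the first and last are 2.5% and 97.5%.
    cuts = statistics.quantiles(values, n=40, method="inclusive")
    low, high = cuts[0], cuts[-1]
    return [v for v in values if v < low or v > high]


# Hypothetical heights in cm, with one impossible entry (450).
heights = [150, 155, 160, 162, 165, 168, 170, 172, 175, 180, 450]

print(iqr_fences(heights))          # (142.25, 192.25)
print(outside_iqr_fences(heights))  # [450]
print(percentile_tails(heights))    # [150, 450]
```

Note that the percentile rule also flags 150 cm, which is a perfectly plausible height. By construction it marks roughly 5% of the values no matter what the data look like, so it is worth checking the flagged points against the external reasons in point 2 before removing them.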