Why are random forests a good model to use for classification predictions/problems? What are the pros and cons of using random forests?
Random forests are a strong choice for classification problems because each prediction is based on the combined output (majority vote) of many individual decision trees, which reduces variance compared with any single tree, as the sketch below illustrates.
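A minimal sketch of this ensemble effect, assuming scikit-learn and a synthetic dataset from make_classification (both chosen here purely for illustration):

```python
# Sketch: compare a single decision tree with a random forest on synthetic data.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier

# Synthetic classification data, for illustration only
X, y = make_classification(n_samples=2000, n_features=20, n_informative=8,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

# The forest averages votes from many trees, which typically reduces variance
print("single tree accuracy  :", tree.score(X_test, y_test))
print("random forest accuracy:", forest.score(X_test, y_test))
```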
Pros and cons of random forests
Bagged ensemble models have both advantages and disadvantages. The main advantages of random forests are:
The predictive performance can compete with the best supervised learning algorithms
They provide a reliable estimate of feature importance, so the most influential features can be identified
They offer efficient estimates of the test error via out-of-bag (OOB) samples, without the repeated model training that cross-validation requires (see the sketch after this list)
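The feature-importance and out-of-bag points above can be seen directly in scikit-learn's RandomForestClassifier; this is a hedged sketch on synthetic data, not the only way to obtain these estimates:

```python
# Sketch: feature importances and the out-of-bag (OOB) test-error estimate.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=2000, n_features=10, n_informative=4,
                           random_state=0)

# oob_score=True reuses the samples left out of each bootstrap draw,
# so no separate cross-validation loop is needed for an error estimate.
forest = RandomForestClassifier(n_estimators=300, oob_score=True,
                                random_state=0).fit(X, y)

print("OOB accuracy estimate:", forest.oob_score_)
for i, importance in enumerate(forest.feature_importances_):
    print(f"feature {i}: importance {importance:.3f}")
```

Setting oob_score=True scores each tree on the bootstrap samples it never saw during training, which is what provides the test-error estimate without a separate cross-validation loop.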
On the other hand, random forests also have a few disadvantages:
An ensemble model is inherently less interpretable than an individual decision tree because it combines many decision trees
Training a large number of deep trees can be computationally expensive (although training parallelizes well) and can use a lot of memory
Predictions are slower than with a single tree because every tree in the ensemble must be evaluated, which can be a problem for latency-sensitive applications (the sketch below shows common mitigations)
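As a rough sketch of how these computational drawbacks are usually mitigated (again assuming scikit-learn; the parameter values are arbitrary), training can be spread across all cores with n_jobs and memory/prediction cost reduced by capping tree depth:

```python
# Sketch: mitigating training cost, memory use, and prediction latency.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=5000, n_features=30, random_state=0)

forest = RandomForestClassifier(
    n_estimators=500,
    max_depth=12,   # shallower trees -> less memory, faster predictions
    n_jobs=-1,      # train (and predict with) trees on all available cores
    random_state=0,
)
forest.fit(X, y)
```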