What are three benefits of performing principal component analysis?
Principal component analysis (PCA) is a data reduction technique: it lets us reduce the number of variables in a data set.
1) Many data sets are so high-dimensional that direct visualization is impossible, since we humans can only comprehend three dimensions. The solution is to use a dimension reduction technique such as PCA. It also removes correlated features, because the principal components are uncorrelated with one another.
2) When reducing the dimensions of data, it’s important not to lose more information than is necessary. The variation in a data set can be seen as representing the information that we would like to keep. PCA is a well-established mathematical technique for reducing the dimensionality of data while keeping as much variation as possible.
3) PCA is an unsupervised method, meaning that no information about groups is used in the dimension reduction. A plot of the first few components therefore gives a visual representation of the dominant patterns in a data set.
4) PCA provides a low-dimensional representation of the variables that is synchronized with the one for the samples. The synchronized sample and variable representations provide a way to visually find variables that are characteristic of a group of samples.
5) It can improve algorithm performance. With a very large number of features, an algorithm's performance can degrade considerably. PCA is a very common way to speed up a machine learning algorithm by getting rid of correlated variables that contribute little to decision making. Training time drops significantly with fewer features.
6) Overfitting tends to occur when there are too many variables in the data set relative to the number of samples. PCA can therefore help reduce overfitting by reducing the number of features.
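
Several of the points above (fewer variables, retained variation, uncorrelated components) can be checked on a small example. The following is only a minimal sketch, assuming NumPy is available and computing PCA through a singular value decomposition of the centered data. The helper name `pca` and the synthetic data set are made up for illustration and do not come from the question.

```python
import numpy as np

def pca(X, n_components):
    """Project X (samples x features) onto its top principal components."""
    # Center each feature so the components describe variation around the mean.
    X_centered = X - X.mean(axis=0)
    # SVD of the centered data: the rows of Vt are the principal directions.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    scores = X_centered @ components.T
    # Fraction of the total variance captured by each component.
    explained_ratio = (S ** 2) / np.sum(S ** 2)
    return scores, components, explained_ratio[:n_components]

# Synthetic data (an assumption for illustration): three features,
# where the second is almost a multiple of the first.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = rng.normal(size=200)
X = np.column_stack([a, 2.0 * a + 0.1 * rng.normal(size=200), b])

scores, components, ratio = pca(X, n_components=2)
print("shape after reduction:", scores.shape)
print("fraction of variance kept:", ratio.sum())
print("correlation between components:",
      np.corrcoef(scores, rowvar=False)[0, 1])
```

Because two of the three features here are nearly redundant, two components should keep almost all of the variation, and the resulting components should be uncorrelated up to rounding error. On real data the amount of variance retained depends on how correlated the original variables are, so it is worth inspecting the explained-variance fractions before choosing how many components to keep.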