Question

Apply PCA (Principal Component Analysis) in Python to the data set below, which is a CSV file.

Then plot the result with a different color for each target. Thank you, I will UPVOTE!

target A B C D E F G
surprise 2 3 1 1 19 12 0
sad 2 0 0 2 12 1 15
angry 95 2 1 0 1 0 1
sad 4 56 2 0 0 3 1
neutral 1 2 2 0 39 0 11
happy 0 0 0 34 1 0 0
neutral 5 55 0 0 0 2 1
sad 0 33 3 0 0 12 1
happy 0 5 2 0 18 15 2
angry 0 0 0 19 37 0 0
happy 0 1 0 68 17 2 0

Solutions

Expert Solution

Find the answer below

NOTE: I have computed 3 principal components.

#Importing modules
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
import matplotlib.pyplot as plt

# Importing data (sep='\t' assumes a tab-separated file; change sep to match your file's delimiter)
data = pd.read_csv('data.csv',sep='\t')
data.head()

#converting data into array
data_arr = data.to_numpy()

#removing the targets and converting the remaining feature columns to float
final_data = np.delete(data_arr,0,1).astype(float)

#Computing the 3 principal components
pca = PCA(n_components=3)
principalComponents = pca.fit_transform(final_data)

#setting up 3 components on the dataframe
principalDf = pd.DataFrame(data=(principalComponents), columns=['PC1', 'PC2','PC3'])

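#setting up for plotting with targets
rarr = principalDf['PC1']
carr = principalDf['PC2']
tarr = principalDf['PC3']  #PC3 is computed but not drawn in this 2D plot
target = data_arr[:,0]

#Plotting PC1 against PC2, one color per target class
for label in pd.unique(target):
    mask = target == label
    plt.scatter(rarr[mask], carr[mask], label=label)
plt.xlabel('PC1')
plt.ylabel('PC2')
plt.legend(title='target')
plt.show()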
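For anyone wondering what fit_transform does with the data, here is a minimal NumPy-only sketch, not the answer's own method: it centers the columns, takes a singular value decomposition, and projects onto the first 3 directions. The feature values are copied from the question; names such as X_centered and scores are my own. The signs of individual components may differ from what scikit-learn returns, since scikit-learn applies its own sign convention.

```python
import numpy as np

# Feature columns A..G from the question (target column left out).
X = np.array([
    [2, 3, 1, 1, 19, 12, 0],
    [2, 0, 0, 2, 12, 1, 15],
    [95, 2, 1, 0, 1, 0, 1],
    [4, 56, 2, 0, 0, 3, 1],
    [1, 2, 2, 0, 39, 0, 11],
    [0, 0, 0, 34, 1, 0, 0],
    [5, 55, 0, 0, 0, 2, 1],
    [0, 33, 3, 0, 0, 12, 1],
    [0, 5, 2, 0, 18, 15, 2],
    [0, 0, 0, 19, 37, 0, 0],
    [0, 1, 0, 68, 17, 2, 0],
], dtype=float)

# Step 1: center each column, since PCA works on mean-centered data.
X_centered = X - X.mean(axis=0)

# Step 2: singular value decomposition of the centered matrix.
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)

# Step 3: project the rows onto the first 3 right singular vectors.
n_components = 3
scores = X_centered @ Vt[:n_components].T  # shape (11, 3)

# Variance captured by each component (sample variance, n - 1 in the denominator).
explained_variance = S**2 / (X.shape[0] - 1)

print(scores.shape)
print(explained_variance[:n_components])
```

The columns of scores play the role of PC1, PC2 and PC3 in the answer above, up to sign.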


NOTE: To compute a different number of components, change the n_components argument in the PCA(...) call, for example PCA(n_components=2). With this data set the value cannot exceed 7, the number of feature columns.
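If you are not sure how many components to keep, one common approach is to look at the cumulative explained variance. The sketch below is only an illustration: it embeds the question's rows as a whitespace-separated string (an assumption about the file layout, since the sample above is shown with spaces), and the 0.90 threshold is an arbitrary value I picked, not something from the question.

```python
import io

import numpy as np
import pandas as pd
from sklearn.decomposition import PCA

# The question's data, embedded so the example runs without data.csv.
raw = """target A B C D E F G
surprise 2 3 1 1 19 12 0
sad 2 0 0 2 12 1 15
angry 95 2 1 0 1 0 1
sad 4 56 2 0 0 3 1
neutral 1 2 2 0 39 0 11
happy 0 0 0 34 1 0 0
neutral 5 55 0 0 0 2 1
sad 0 33 3 0 0 12 1
happy 0 5 2 0 18 15 2
angry 0 0 0 19 37 0 0
happy 0 1 0 68 17 2 0
"""

# sep=r"\s+" splits on any run of whitespace (assumed layout).
data = pd.read_csv(io.StringIO(raw), sep=r"\s+")
features = data.drop(columns="target").to_numpy(dtype=float)

# Fit PCA with all components to see how the variance is distributed.
pca_full = PCA().fit(features)
cumulative = np.cumsum(pca_full.explained_variance_ratio_)
for i, share in enumerate(cumulative, start=1):
    print(f"{i} component(s): {share:.3f} of total variance")

# Smallest number of components reaching the (arbitrary) threshold.
threshold = 0.90
k = int(np.searchsorted(cumulative, threshold) + 1)
print("components needed:", k)
```

The printed k can then be passed as n_components in the PCA(...) call above.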

Thanks

