Question 2 Consider the one-dimensional data set shown below.
x | 0.5 | 3.0 | 4.5 | 4.6 | 4.9 | 5.2 | 5.3 | 5.5 | 7.0 | 9.5 |
y | - | - | + | + | + | - | - | + | - | - |
Classify the data point x = 5.0 according to its 1st, 3rd, 5th, and 9th nearest neighbors using a k-nearest neighbor classifier.
Question 3 Use the data set mushrooms.csv to develop supervised models. The data set contains two classes, namely edible and poisonous. Perform the following analysis on the data set.
1. Understand the distribution of classes in the data set using suitable plots.
2. Develop supervised models: a decision tree and k-nearest neighbors.
3. Identify the best k by cross-validation for the supervised models built in step 2.
4. Discuss the results achieved by each supervised model using the confusion matrix, sensitivity, specificity, accuracy, F1-score, and ROC curve.
5. Provide your opinion on why there is variation in performance across the models.
Answer 2:
1-nearest neighbor: + (the single nearest point is 4.9, at distance 0.1, labeled +)
3-nearest neighbors: − (4.9 [+], 5.2 [−], 5.3 [−]; majority vote −)
5-nearest neighbors: + (4.9 [+], 5.2 [−], 5.3 [−], 4.6 [+], plus 4.5 or 5.5 [both +, tied at distance 0.5]; majority vote +)
9-nearest neighbors: − (nine of the ten points; 4 of them are + and 5 are −; majority vote −)
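These labels can be verified with scikit-learn; a minimal sketch, assuming the ten points from the table above:

import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Training points and labels from the table in Question 2.
X = np.array([0.5, 3.0, 4.5, 4.6, 4.9, 5.2, 5.3, 5.5, 7.0, 9.5]).reshape(-1, 1)
y = np.array(['-', '-', '+', '+', '+', '-', '-', '+', '-', '-'])

# Classify x = 5.0 for each requested value of k.
for k in (1, 3, 5, 9):
    knn = KNeighborsClassifier(n_neighbors=k).fit(X, y)
    print(k, knn.predict([[5.0]])[0])  # prints +, -, +, - in turn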
Answer 3:
(1)
Distribution of classes
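One way to visualize the class distribution (a sketch, assuming the label column of mushrooms.csv is named 'class', as in the commonly used version of this dataset):

import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv('mushrooms.csv')

# Count and plot the two classes ('e' = edible, 'p' = poisonous).
counts = df['class'].value_counts()
print(counts)

counts.plot(kind='bar')
plt.xlabel('class')
plt.ylabel('count')
plt.title('Class distribution')
plt.show()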
In:
print(classification_report(y_test, y_pred))

Out:
             precision    recall  f1-score   support

          0       0.85      0.97      0.91      1257
          1       0.97      0.82      0.89      1181

avg / total       0.91      0.90      0.90      2438
(2)(i) Decision Tree Model

In:
from sklearn.tree import DecisionTreeClassifier as DT

classifier = DT(criterion='entropy', random_state=42)
classifier.fit(X_train, y_train)

Out:
DecisionTreeClassifier(class_weight=None, criterion='entropy', max_depth=None,
                       max_features=None, max_leaf_nodes=None,
                       min_impurity_decrease=0.0, min_impurity_split=None,
                       min_samples_leaf=1, min_samples_split=2,
                       min_weight_fraction_leaf=0.0, presort=False,
                       random_state=42, splitter='best')
(2)(ii) k-Nearest Neighbor Model

In:
from sklearn.neighbors import KNeighborsClassifier as KNN

classifier = KNN()
classifier.fit(X_train, y_train)

Out:
KNeighborsClassifier(algorithm='auto', leaf_size=30, metric='minkowski',
                     metric_params=None, n_jobs=1, n_neighbors=5, p=2,
                     weights='uniform')
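(3) Best k by cross-validation

A minimal sketch of the search step 3 asks for, assuming the X_train and y_train from the cells above and using scikit-learn's GridSearchCV:

from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

# 10-fold cross-validation over candidate values of k
# (X_train and y_train are assumed from the preprocessing cells above).
param_grid = {'n_neighbors': list(range(1, 21))}
grid = GridSearchCV(KNeighborsClassifier(), param_grid, cv=10, scoring='accuracy')
grid.fit(X_train, y_train)

print('best k:', grid.best_params_['n_neighbors'])
print('cross-validated accuracy:', grid.best_score_)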
(4)
Decision Tree Results
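The result cells below call print_score, a helper defined elsewhere in the notebook rather than provided by scikit-learn; a minimal sketch of such a helper, consistent with the printed output (an assumed reconstruction, not the notebook's actual definition):

from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

def print_score(clf, X_train, y_train, X_test, y_test, train=True):
    # Report metrics on the training split when train=True,
    # otherwise on the held-out test split.
    X, y = (X_train, y_train) if train else (X_test, y_test)
    y_pred = clf.predict(X)
    print('Train results:' if train else 'Test results:')
    print('Accuracy Score: {:.4f}'.format(accuracy_score(y, y_pred)))
    print('Classification Report:')
    print(classification_report(y, y_pred))
    print('Confusion Matrix:')
    print(confusion_matrix(y, y_pred))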
In:
print_score(classifier, X_train, y_train, X_test, y_test, train=False)

Test results:
Accuracy Score: 0.9007

Classification Report:
             precision    recall  f1-score   support

          0       0.90      0.91      0.90      1257
          1       0.91      0.89      0.90      1181

avg / total       0.90      0.90      0.90      2438

Confusion Matrix:
[[1147  110]
 [ 132 1049]]

Taking class 1 as the positive class, the confusion matrix gives sensitivity = 1049/1181 ≈ 0.89 and specificity = 1147/1257 ≈ 0.91.
K-NN Test Results
In:
print_score(classifier, X_train, y_train, X_test, y_test, train=False)

Test results:
Accuracy Score: 0.9307

Classification Report:
             precision    recall  f1-score   support

          0       0.91      0.96      0.93      1257
          1       0.96      0.90      0.93      1181

avg / total       0.93      0.93      0.93      2438

Confusion Matrix:
[[1211   46]
 [ 123 1058]]

Taking class 1 as the positive class again: sensitivity = 1058/1181 ≈ 0.90 and specificity = 1211/1257 ≈ 0.96.
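Step 4 also asks for ROC curves; a sketch, assuming the two fitted models are kept in separate variables dt and knn (hypothetical names; the cells above reuse the single name classifier):

import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, auc

# dt and knn are assumed to be the fitted classifiers from step 2.
for name, model in [('Decision Tree', dt), ('k-NN', knn)]:
    scores = model.predict_proba(X_test)[:, 1]  # score for class 1
    fpr, tpr, _ = roc_curve(y_test, scores)
    plt.plot(fpr, tpr, label='{} (AUC = {:.3f})'.format(name, auc(fpr, tpr)))

plt.plot([0, 1], [0, 1], linestyle='--', label='chance')
plt.xlabel('False positive rate')
plt.ylabel('True positive rate')
plt.legend()
plt.show()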
(5) Perhaps the most common reason for such variation is overfitting the training data: a model, a set of hyperparameters, a particular view of the data, or some combination of these happens to fit the training set well without generalizing equally well to unseen data. Here the decision tree was grown to full depth (max_depth=None), which lets it memorize individual training examples, while k-NN with k = 5 averages over the five nearest neighbors and so smooths out local noise; this difference in inductive bias is consistent with k-NN's higher test accuracy (0.9307 vs. 0.9007).