Question

In: Computer Science

Use the confusion matrixes to answer the questions below. Record your answers in a word document....

Use the confusion matrixes to answer the questions below. Record your answers in a word document. Explain your responses and include screen shots where requested. A 5000 row dataset was used to build a model to predict if a person would accept a marketing offer for a personal loan (Personal Loan = Yes). It was partitioned into Training, Validation, and Test with the model results above.

If you were to decrease the cutoff to 0.3 how would that impact your FP and FN counts. Would they increase or decrease? Would your Model Cost increase or decrease?

Solutions

Expert Solution

All code with sample data:

# creating dataset of 7 features 
import random
import pandas as pd
import numpy as np

n_rows = 5000     # number of rows
n_feat = 7        # number of features
features = list()

for loop1 in range(n_feat):
    feat = list()
    target = list()
    for loop2 in range(n_rows):
        feat.append(random.randint(-2,2))
        target.append(random.randint(0,1)) # Personal Loan = Yes is 1 else 0
    features.append(feat)
    

# creating a dataframe of features
data = pd.DataFrame()
for loop3 in range(len(features)):
    data['feat_'+str(loop3+1)] = features[loop3]
data['target'] = target

data.head()

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(data.iloc[:,:7], data.iloc[:,7:], test_size = 0.20)

# converting into array
X_train = np.array(X_train)   
X_test = np.array(X_test)   
y_train = (y_train.values).ravel()
y_test = (y_test.values).ravel()
type(X_train)

# Apply logistic regression
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix

clf = LogisticRegression(random_state=0).fit(X_train, y_train)
y_pred = clf.predict(X_test)

# confusion matrix
cm = confusion_matrix(y_test, y_pred)
print(cm)
print('False Positive: ',cm[0][1])
print('False Negative: ',cm[1][0])

# Now, with threshold
threshold = 0.3
y = clf.predict_proba(X_test)

#defining own threshhold for prediction over probability values
def pred(y):
    y_pred = []
    i=[]
    for i in y:
        if i.any() >= threshold:
            y_pred.append(1)
        else:
            y_pred.append(0)
    return(y_pred)

y_pred = pred(y)
cm = confusion_matrix(y_test, y_pred)
cm
print('False Positive: ',cm[0][1])
print('False Negative: ',cm[1][0])

Results:

Data:

with threshold 0.5:

with threhold 0.3:

FP has increased and FN has decresed.


Related Solutions

Review the following scenario, and answer the questions. Create a Word document for your answers, submit...
Review the following scenario, and answer the questions. Create a Word document for your answers, submit via submission link. A 28-year-old primigravida at 41 weeks’ gestation is admitted to the L&D unit for early labor at 2 cm, 70% effaced, and 0 station. How can the nurse best describe to this patient the latent phase of labor? How will the cardinal movements of labor facilitate the birth of the fetus?
M5 Assignment Questions I through M Use this Word document to fill in the answers to...
M5 Assignment Questions I through M Use this Word document to fill in the answers to the questions. You must type out a clear answer to each question, even if the answer is also contained in the Excel file submitted to show work. Q1) Independent-Samples t Test (15 points total) In a cognitive psychology experiment, the researcher is interested in whether encoding condition has a significant effect on memory for a set of drawings. In encoding condition A, subjects are...
M5 Assignment Questions E through H Use this Word document to fill in the answers to...
M5 Assignment Questions E through H Use this Word document to fill in the answers to the questions. You must type out a clear answer to each question, even if the answer is also contained in the Excel file submitted to show work. Q1) Independent-Samples t Test (15 points total) In a cognitive psychology experiment, the researcher is interested in whether encoding condition has a significant effect on memory for a set of drawings. In encoding condition A, subjects are...
Answer the following questions based . Write your response in a separate Microsoft Word document: o...
Answer the following questions based . Write your response in a separate Microsoft Word document: o Importance of Cost of Capital: THE ANSWER SHOULD BE BASED ON AMAZON Why is cost of capital important to an organization, and what does it measure? o Meaning of Calculations: How do organizations calculate various costs, and what do these calculations mean to business?
Not handwriting answer, please In a word document, please answer the following questions: Course Introduction of...
Not handwriting answer, please In a word document, please answer the following questions: Course Introduction of Biostatistics Define the following terms: correlation coefficient scatter plot bivariate relationship Provide an example where the outlier is more important to the research than the other observations? Identify when to use Spearman’s rho.
Read the assignment and answer the questions. You must also upload your Excel file, Word document...
Read the assignment and answer the questions. You must also upload your Excel file, Word document or pdf showing your work. Bodacious Building Co. is considering four different acquisition methods for obtaining pickup trucks. If the contractor’s MARR is 6%, which alternative do you recommend? The alternatives are: Immediate cash purchase of the trucks for $22,500 each, and after five years sell each truck for an estimated $4,900. Lease the trucks for five years for $4,000 per year paid at...
Fully answer the questions associated with each case below and upload your completed document in the...
Fully answer the questions associated with each case below and upload your completed document in the dropbox below.There are 6 cases, with 10 questions spread among them. Each question is worth 10 points for a total of 100 points. Secnario: You are a paralegal with the Weyland-Yutani Corporation. Your boss attorney, Sharon Ripley, has asked you do answer some questions about some HR legal issues that have arisen. Case 1 The first case involves Joe Stromboli. Joe is a delivery...
Finally, create a Word or text document (300-500 words) that answers the following questions: Describe the...
Finally, create a Word or text document (300-500 words) that answers the following questions: Describe the kinds of services and applications you use in your daily life that might benefit from being hosted in the cloud? What are some of the benefits that would come from being cloud-based? Some of these might already be in-fact running in the cloud. Which ones do you think already are?
Interpreting ANOVA (Please use the attached document, bold the questions, and include your answers in non-bolded...
Interpreting ANOVA (Please use the attached document, bold the questions, and include your answers in non-bolded font). An Industrial Organizational psychologist is interested in examining the relative effectiveness of three leadership styles on worker productivity. A sample of N = 15 assembly line workers is obtained. These individuals are randomly assignment to each of the three leadership conditions: Authoritarian, Democratic, and Delegative. The number of units workers produced in a 10-hour shift is recorded. The data are as follows: ANOVA...
Answer the following questions and use Excel or this document to show your work. 1. Consider...
Answer the following questions and use Excel or this document to show your work. 1. Consider the following results for two samples randomly taken from two normal populations with equal variances. Sample I Sample II Sample Size 28 35 Sample Mean 48 44 Population Standard Deviation 9 10 a. Develop a 95% confidence interval for the difference between the two population means. b. Is there conclusive evidence that one population has a larger mean? Explain.
ADVERTISEMENT
ADVERTISEMENT
ADVERTISEMENT