Question

In: Computer Science

In this exercise, you will implement a logistic regression algorithm using SGA, similar to the logistic regression algorithm that you have seen in class. You will work with the datasets attached to the assignment and complete the logisticRegression.py file to learn the coefficients and predict binary class labels. The data comes from breast cancer diagnosis, where each sample (30 features) is labeled with a diagnosis: either M (malignant) or B (benign), recorded in the 31st column of the datasets. Read the main code, check the configuration parameters, and make sure the data is loaded and augmented correctly. Do not use logistic regression packages.
(a) [15 pts.] Complete the predict(x, w), gradient(x, y, w), and cross_entropy(y_hat, y) functions according to the instructions in logisticRegression.py. These functions will be used in the main SGA algorithm (logisticRegression_SGA).
(b) [15 pts.] Complete the logisticRegression_SGA(X, y, psi, epsilon, epochs) function. In class, we used a stopping criterion for the repeat loop in the SGA algorithm for logistic regression: the loop continued until the norm of the difference between w's in consecutive iterations was less than a predefined number ε (line 11 of the algorithm: ∥w̃_t − w̃_(t−1)∥ ≤ ε). Here, in addition to this criterion, we would like to limit the number of epochs (iterations over the whole dataset, or t, the step/iteration number, in the algorithms in the slides) to a predefined number (max_epochs). In the main function, max_epochs is initialized to 8.
(c) [15 pts.] Complete the rest of the main code to use the learned w to predict class labels for test datapoints (which have not been used for learning the w's) and to calculate and print the average cross-entropy error for the training and testing data.
(d) [15 pts.] Run the code on the cancer dataset with different psi and epsilon values. Check the change in cross-entropy values across iterations (in the plot) and the average training and testing cross-entropy errors. What do you observe about the losses and the number of iterations? What do you conclude?
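For part (a), one possible shape of the three helper functions is sketched below. The signatures come from the assignment; the bodies here are a minimal sketch of the standard logistic regression formulas, not the official solution file.

```python
import numpy as np

def predict(x, w):
    # Sigmoid of the linear score: P(y = 1 | x, w)
    return 1.0 / (1.0 + np.exp(-np.dot(x, w)))

def gradient(x, y, w):
    # Gradient of the log likelihood for one sample (used for ascent)
    return (y - predict(x, w)) * x

def cross_entropy(y_hat, y):
    # Binary cross-entropy loss; a small eps guards against log(0)
    eps = 1e-12
    return -(y * np.log(y_hat + eps) + (1 - y) * np.log(1 - y_hat + eps))
```

With w = 0, predict returns 0.5 for any x, and the cross-entropy of a 0.5 prediction is ln 2 ≈ 0.693, which is a useful sanity check before running the full SGA loop.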

Solutions

Expert Solution

import numpy as np

def logistic_regression_SG(feature_matrix, sentiment, initial_coefficients,
                           step_size, batch_size, max_iter):
    log_likelihood_all = []
    coefficients = np.array(initial_coefficients)
    np.random.seed(seed=1)
    # Shuffle the data before the first pass
    permutation = np.random.permutation(len(feature_matrix))
    feature_matrix = feature_matrix[permutation, :]
    sentiment = sentiment[permutation]
    i = 0  # start index of the current mini-batch
    for itr in range(max_iter):
        # Predict P(y = +1 | x, w) for the current mini-batch
        predictions = predict_probability(feature_matrix[i:i+batch_size, :], coefficients)
        indicator = (sentiment[i:i+batch_size] == +1)
        errors = indicator - predictions
        for j in range(len(coefficients)):
            # Derivative of the average log likelihood w.r.t. coefficient j
            derivative = feature_derivative(errors, feature_matrix[i:i+batch_size, j])
            coefficients[j] += step_size * derivative * 1. / batch_size

        # Track the average log likelihood of the current mini-batch
        lp = compute_avg_log_likelihood(feature_matrix[i:i+batch_size, :],
                                        sentiment[i:i+batch_size], coefficients)
        log_likelihood_all.append(lp)
        if itr <= 15 or (itr <= 1000 and itr % 100 == 0) \
                or (itr <= 10000 and itr % 1000 == 0) \
                or itr % 10000 == 0 or itr == max_iter - 1:
            data_size = len(feature_matrix)
            print('Iteration %*d: Average log likelihood (of data points [%0*d:%0*d]) = %.8f' %
                  (int(np.ceil(np.log10(max_iter))), itr,
                   int(np.ceil(np.log10(data_size))), i,
                   int(np.ceil(np.log10(data_size))), i + batch_size, lp))

        # Advance to the next mini-batch; reshuffle once the data is exhausted
        i += batch_size
        if i + batch_size > len(feature_matrix):
            permutation = np.random.permutation(len(feature_matrix))
            feature_matrix = feature_matrix[permutation, :]
            sentiment = sentiment[permutation]
            i = 0
    return coefficients, log_likelihood_all
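The solution above stops only after max_iter mini-batch steps. Part (b) of the assignment asks for the additional ε criterion on consecutive weight vectors. A minimal sketch of that combined stopping rule, using the assignment's signature (psi as the learning rate) and the standard per-sample logistic gradient, could look like this; it is an illustration of the stopping logic, not the graded solution:

```python
import numpy as np

def logisticRegression_SGA(X, y, psi, epsilon, epochs):
    # psi: learning rate; epsilon: threshold on ||w_t - w_{t-1}||;
    # epochs: maximum number of full passes over the data (max_epochs)
    w = np.zeros(X.shape[1])
    for epoch in range(epochs):
        w_prev = w.copy()
        for i in np.random.permutation(len(X)):
            # One stochastic ascent step per sample
            y_hat = 1.0 / (1.0 + np.exp(-np.dot(X[i], w)))
            w = w + psi * (y[i] - y_hat) * X[i]
        # Stop early once consecutive epochs barely change w
        if np.linalg.norm(w - w_prev) <= epsilon:
            break
    return w
```

Note that the norm is checked once per epoch (a full pass), so the loop ends at whichever comes first: convergence within ε or the epoch limit.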

sample_feature_matrix = np.array([[1.,2.,-1.], [1.,0.,1.]])
sample_sentiment = np.array([+1, -1])
coefficients, log_likelihood = logistic_regression_SG(sample_feature_matrix, sample_sentiment, np.zeros(3), step_size=…, batch_size=…, max_iter=…)

Now run batch gradient ascent over the feature_matrix_train for 200 iterations using:

initial_coefficients = np.zeros(194)

step_size = 5e-1

batch_size = len(feature_matrix_train)

max_iter = 200

coefficients_batch, log_likelihood_batch = logistic_regression_SG(feature_matrix_train, sentiment_train,
                                                                  initial_coefficients=np.zeros(194),
                                                                  step_size=5e-1, batch_size=len(feature_matrix_train),
                                                                  max_iter=200)

Iteration 0: Average log likelihood (of data points [00000:47780]) = -0.68308119

Iteration 1: Average log likelihood (of data points [00000:47780]) = -0.67394599

Iteration 2: Average log likelihood (of data points [00000:47780]) = -0.66555129

Iteration 3: Average log likelihood (of data points [00000:47780]) = -0.65779626

Iteration 4: Average log likelihood (of data points [00000:47780]) = -0.65060701

Iteration 5: Average log likelihood (of data points [00000:47780]) = -0.64392241

Iteration 6: Average log likelihood (of data points [00000:47780]) = -0.63769009

Iteration 7: Average log likelihood (of data points [00000:47780]) = -0.63186462

Iteration 8: Average log likelihood (of data points [00000:47780]) = -0.62640636

Iteration 9: Average log likelihood (of data points [00000:47780]) = -0.62128063

Iteration 10: Average log likelihood (of data points [00000:47780]) = -0.61645691

Iteration 11: Average log likelihood (of data points [00000:47780]) = -0.61190832

Iteration 12: Average log likelihood (of data points [00000:47780]) = -0.60761103

Iteration 13: Average log likelihood (of data points [00000:47780]) = -0.60354390

Iteration 14: Average log likelihood (of data points [00000:47780]) = -0.59968811

Iteration 15: Average log likelihood (of data points [00000:47780]) = -0.59602682

Iteration 100: Average log likelihood (of data points [00000:47780]) = -0.49520194

Iteration 199: Average log likelihood (of data points [00000:47780]) = -0.47126953

import matplotlib.pyplot as plt

plt.plot(log_likelihood_batch)

plt.show()
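For part (c), evaluating the learned coefficients on held-out data can be sketched as below. The helper names (predict, average_cross_entropy, predict_labels) are assumptions for illustration, not functions from the assignment's starter file:

```python
import numpy as np

def predict(x, w):
    # Sigmoid of the linear score: P(y = 1 | x, w)
    return 1.0 / (1.0 + np.exp(-np.dot(x, w)))

def average_cross_entropy(X, y, w):
    # Mean binary cross-entropy over a dataset; eps guards against log(0)
    eps = 1e-12
    y_hat = predict(X, w)
    return -np.mean(y * np.log(y_hat + eps) + (1 - y) * np.log(1 - y_hat + eps))

def predict_labels(X, w, threshold=0.5):
    # Hard class labels: 1 (e.g. malignant) if predicted probability >= threshold
    return (predict(X, w) >= threshold).astype(int)
```

Computing average_cross_entropy on both the training and the test split (the test rows never touched during learning) gives the two numbers part (c) asks you to print.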
