I need the math explanation.
1. The value of a weight vector is given as (w1=3, w2=-2, w0=1) for a linear model with soft threshold (sigmoid) function f(x). Define a decision boundary, where the values of the feature vector x result in f(x)=0.5. Plot the decision boundary in two dimensions.
2. Generating training samples: In two dimensional feature space x: (x1, x2,1), generate 20 random samples, for different values of (x1,x2), that belong to two different classes C1 (1) and C2 (0). The label of each feature vector is assigned so that the samples are linearly separable, i.e., can be separated by a linear model with a soft threshold (sigmoid) function. Plot the samples you generate in a two dimensional plane of (x1,x2). Hint: You may construct an underlying linear model to cut the plane in two halves. Then generate random samples at either side with proper labels.
3. Construct a quadratic error function using a learning model with a soft threshold (sigmoid) function for augmented feature vectors in n+1 dimensions. Derive a gradient descent algorithm for learning the weights. Write a program using either Matlab or Python to learn the weights using the training samples you generate from Prob. 2. Plot the resulting decision boundary.
4. Consider a linear combination of three radial basis functions. Draw a network structure for the model. Write a (pseudo) algorithm for learning the parameters of the model. (You determine what error function to use, what training samples to use, and write iterative equations for learning the parameters.)
Please show how you got to the answer!
1.
We consider 100 linearly spaced sample points between -5 and 5 to plot the sigmoid function and its decision boundary.
In Python, the sigmoid function can be written as:
def sigmoid(z):
    return 1 / (1 + np.exp(-z))
We plot the sigmoid function and the decision boundary with the following code:
import matplotlib.pyplot as plt
import numpy as np

x_sample = np.linspace(-5, 5, 100)
plt.plot(x_sample, sigmoid(x_sample))   # sigmoid curve
plt.plot([-5, 5], [0.5, 0.5])           # horizontal line at the threshold f(z) = 0.5
plt.xlabel("z")
plt.ylabel("sigmoid(z)")
plt.show()
The 2D graph of the sigmoid function and the decision boundary looks like this:
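For the given weights (w1 = 3, w2 = -2, w0 = 1), the boundary can also be drawn directly in the (x1, x2) feature plane: f(x) = 0.5 exactly when the sigmoid argument is zero, i.e. 3*x1 - 2*x2 + 1 = 0, or x2 = (3*x1 + 1)/2. A minimal sketch of plotting this line, reusing the imports above (the plotting range -5 to 5 is just an illustrative choice):

x1_line = np.linspace(-5, 5, 100)
x2_line = (3 * x1_line + 1) / 2      # points satisfying 3*x1 - 2*x2 + 1 = 0
plt.plot(x1_line, x2_line)
plt.xlabel("x1")
plt.ylabel("x2")
plt.title("Decision boundary f(x) = 0.5")
plt.show()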
2.) A set of 20 random samples (x1, x2, 1) can be generated with the following Python code:
import random

# 20 random feature vectors (x1, x2), each augmented with a constant 1
x = [[random.random() for i in range(2)] for j in range(20)]
for i in range(20):
    x[i].extend([1])
We get a 2D list of shape (20, 3).
(x1, x2) can be plotted in a 2D graph with the following code:
# collect the x1 and x2 coordinates of each sample
x1, x2 = [], []
for i in range(20):
    x1.append(x[i][0])
    x2.append(x[i][1])
plt.plot(x1, x2, 'ro')
plt.show()
We obtain the following scatter plot:
Let's draw a decision boundary along the line x2 = -5*x1 + 3. This line can be plotted with the following code:
plt.plot([0.4, 0.6], [1.0, 0.0])   # the endpoints (0.4, 1.0) and (0.6, 0.0) both lie on x2 = -5*x1 + 3
Let all the points to the left of the line be labelled 0, and all the points to the right of the line be labelled 1. This can be accomplished with the following Python code:
# label 1 if the sample lies above the line x2 = -5*x1 + 3, otherwise 0
y = np.zeros(20)
for n, i in enumerate(x):
    if i[1] + (5 * i[0]) - 3 > 0:
        y[n] = 1
    else:
        y[n] = 0
print(y)
As a result, we get an array of 0s and 1s.
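To confirm that the generated samples are linearly separable, a short sketch (reusing the arrays x and y and the line above) that plots the two classes in different colors together with the separating line:

# blue: class C1 (label 1), red: class C2 (label 0)
for i in range(20):
    marker = 'bo' if y[i] == 1 else 'ro'
    plt.plot(x[i][0], x[i][1], marker)
plt.plot([0.4, 0.6], [1.0, 0.0], 'k-')   # separating line x2 = -5*x1 + 3
plt.xlabel("x1")
plt.ylabel("x2")
plt.show()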
4.) Classification problem
Now we define a model consisting of three radial basis activations. A radial basis activation can be defined in Python as follows:
import statistics as st   # needed for st.stdev

def rbf(x, c):
    # Gaussian RBF centered at c, with width taken from the sample standard deviation of x
    s = st.stdev(x)
    return np.exp(-1 / (2 * s**2) * (x - c)**2)
Our model is going to have 3 linear layers, 3 RBF activation layers, and 1 sigmoid activation layer. Therefore it requires 3 sets of weights for the 3 linear layers and 3 centers for the 3 radial basis activations.
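The helper lin used below is not shown in the answer; a minimal sketch, assuming it is simply the weighted sum (dot product) of the input with the layer's weights:

def lin(x, w):
    # assumed linear layer: weighted sum of the input and the weights
    return np.dot(x, w)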
def model(x, w1, w2, w3, c1, c2, c3):
    # alternate linear layers and RBF activations, with a sigmoid output
    l1 = lin(x, w1)
    l2 = rbf(l1, c1)
    l3 = lin(l2, w2)
    l4 = rbf(l3, c2)
    l5 = lin(l4, w3)
    l6 = rbf(l5, c3)
    l7 = sigmoid(l6)
    return l7
The gradient descent function for learning the parameters will be the same as before, but with slight modifications:
def gradient_descent(pred, target, features, w1, w2, w3, lr):
    n = len(features)
    # gradient of the quadratic error with respect to the weights
    grad = np.dot(features.T, pred - target)
    grad = grad.astype(float)
    grad /= n
    grad *= lr
    # apply the same (approximate) gradient step to all three weight sets
    w1 -= grad
    w2 -= grad
    w3 -= grad
    return w1, w2, w3
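The train function below uses an mse error that is not defined in the answer; a minimal sketch, assuming the standard quadratic (mean squared) error from Prob. 3:

def mse(pred, target):
    # quadratic error: mean of the squared differences between predictions and labels
    return np.mean((pred - target) ** 2)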
The train function must undergo a slight modification too:
def train(features, target, w1, w2, w3, c1, c2, c3, lr, iters):
    cost_history = []
    for i in range(iters):
        pred = model(features, w1, w2, w3, c1, c2, c3)
        cost = mse(pred, target)
        cost_history.append(cost)
        w1, w2, w3 = gradient_descent(pred, target, features, w1, w2, w3, lr)
    return w1, w2, w3, cost_history