Question

In: Advanced Math

Need the math explanation

1. The value of a weight vector is given as (w1=3, w2=-2, w0=1) for a linear model with soft threshold (sigmoid) function f(x). Define a decision boundary, where the values of the feature vector x result in f(x)=0.5. Plot the decision boundary in two dimensions.

2. Generating training samples: In two dimensional feature space x: (x1, x2,1), generate 20 random samples, for different values of (x1,x2), that belong to two different classes C1 (1) and C2 (0). The label of each feature vector is assigned so that the samples are linearly separable, i.e., can be separated by a linear model with a soft threshold (sigmoid) function. Plot the samples you generate in a two dimensional plane of (x1,x2). Hint: You may construct an underlying linear model to cut the plane in two halves. Then generate random samples at either side with proper labels.

3. Construct a quadratic error function using a linear model with a soft threshold (sigmoid) function for augmented feature vectors in n+1 dimensions. Derive a gradient descent algorithm for learning the weights. Write a program using either Matlab or Python to learn the weights using the training samples you generate from Prob. 2. Plot the resulting decision boundary.

4. Consider a linear combination of three radial basis functions. Draw a network structure for the model. Write a (pseudo) algorithm for learning the parameters of the model. (You determine what error function to use, what training samples to use, and write iterative equations for learning the parameters.)

Please show how you got to answer!

Solutions

Expert Solution

1.

We take 100 evenly spaced sample points between -5 and 5 to plot the sigmoid function together with the level f = 0.5 that defines the decision boundary.

In Python, the sigmoid function can be written as

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

We plot the sigmoid function and the f = 0.5 level with the following code:

import matplotlib.pyplot as plt

import numpy as np

x_sample = np.linspace(-5,5,100)

plt.plot(x_sample, sigmoid(x_sample))
plt.plot([-5,5],[.5,.5])

plt.xlabel("z")
plt.ylabel("sigmoid(z)")

plt.show()

The resulting 2D graph shows the sigmoid curve together with the horizontal line f = 0.5, which the curve crosses at z = 0.
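The plot above is drawn against z = w1*x1 + w2*x2 + w0 rather than in the (x1, x2) plane. Since f(x) = 0.5 exactly when z = 0, the given weights put the boundary on the line 3*x1 - 2*x2 + 1 = 0, i.e. x2 = (3*x1 + 1)/2. A minimal sketch that computes points on this line and checks them follows; the helper name `boundary_x2` and the sampling range are assumptions made for illustration and are not part of the original answer.

```python
import math

# Weights from the problem statement: f(x) = sigmoid(w1*x1 + w2*x2 + w0)
W1, W2, W0 = 3.0, -2.0, 1.0

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def boundary_x2(x1):
    """Hypothetical helper: x2 on the line w1*x1 + w2*x2 + w0 = 0."""
    return -(W1 * x1 + W0) / W2

# Sample the boundary on the same range used for the sigmoid plot
x1_values = [-5.0 + 10.0 * i / 99 for i in range(100)]
x2_values = [boundary_x2(x1) for x1 in x1_values]

# Every point on the line gives f(x) = 0.5 (up to rounding)
for x1, x2 in zip(x1_values, x2_values):
    assert abs(sigmoid(W1 * x1 + W2 * x2 + W0) - 0.5) < 1e-9
```

Passing `x1_values` and `x2_values` to `plt.plot` would draw the boundary in two dimensions; points below the line have z > 0 and therefore f(x) > 0.5.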

2.) Twenty random samples (x1, x2, 1) can be generated with the following Python code:

import random
x = [[random.random() for i in range(2)] for j in range(20)]
for i in range(20):
    x[i].extend([1])

We get a 2D list of shape (20, 3), i.e. 20 rows of the form [x1, x2, 1].

(x1,x2) can be plotted in a 2D graph, with the following code:

x1, x2 = [], []
for i in range(20):
    x1.append(x[i][0])
    x2.append(x[i][1])
plt.plot(x1, x2, 'ro')
plt.show()

We obtain a scatter plot of the 20 samples.

Let's draw a decision boundary along the line x2 = -5*x1 + 3. This line can be plotted with the following code:

plt.plot([0.4,0.6], [1.0,0.0])

Let all the points to the left of the line be labelled 0, and all the points to the right of the line be labelled 1. This can be accomplished with the following Python code:

y = np.zeros(20)

for n, i in enumerate(x):
    if i[1] + (5 * i[0]) - 3 > 0:
        y[n] = 1
    else:
        y[n] = 0

print(y)

As a result, we get an array of 0s and 1s, one label per sample.
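Problem 3 asks for a quadratic error function and a gradient descent rule for the sigmoid model. One possible sketch, under stated assumptions: the error is taken as E(w) = 1/(2N) * sum_n (f(w·x_n) - y_n)^2, and the chain rule with f'(z) = f(z)(1 - f(z)) gives dE/dw_j = 1/N * sum_n (f_n - y_n) f_n (1 - f_n) x_nj. The function names, learning rate, iteration count, and fixed random seed below are illustrative choices, not taken from the original answer.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, x):
    # x is an augmented feature vector (x1, x2, 1); w = (w1, w2, w0)
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))

def quadratic_error(w, xs, ys):
    # E(w) = 1/(2N) * sum_n (f(w . x_n) - y_n)^2
    return sum((predict(w, x) - y) ** 2 for x, y in zip(xs, ys)) / (2 * len(xs))

def gradient_step(w, xs, ys, lr):
    # dE/dw_j = 1/N * sum_n (f_n - y_n) * f_n * (1 - f_n) * x_nj
    grad = [0.0] * len(w)
    for x, y in zip(xs, ys):
        f = predict(w, x)
        delta = (f - y) * f * (1.0 - f)
        for j, xj in enumerate(x):
            grad[j] += delta * xj / len(xs)
    return [wj - lr * gj for wj, gj in zip(w, grad)]

# Assumed training set: 20 points in the unit square, labelled by the
# line x2 = -5*x1 + 3 as in Problem 2 (seed fixed for repeatability)
rng = random.Random(0)
xs = [[rng.random(), rng.random(), 1.0] for _ in range(20)]
ys = [1.0 if x[1] + 5 * x[0] - 3 > 0 else 0.0 for x in xs]

w = [0.0, 0.0, 0.0]
initial_error = quadratic_error(w, xs, ys)
for _ in range(5000):
    w = gradient_step(w, xs, ys, lr=1.0)
final_error = quadratic_error(w, xs, ys)
```

With learned weights `w = [w1, w2, w0]`, the resulting decision boundary is the line x2 = -(w1*x1 + w0)/w2, which can be drawn over the scatter plot of the samples.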

4.) Classification problem

Now we define a model built from three radial basis activations.

The radial basis activation can be defined in Python as follows:

# st is assumed to be Python's statistics module
import statistics as st

def rbf(x, c):
    # s: sample standard deviation of the inputs, used as the RBF width
    s = st.stdev(x)
    return np.exp(-1 / (2 * s**2) * (x - c)**2)

Our model has 3 linear layers, 3 RBF activation layers, and 1 sigmoid activation layer. It therefore requires three sets of weights for the linear layers and three centers for the radial basis activations:

# lin(x, w) is the linear layer; it is not defined in this answer
def model(x, w1, w2, w3, c1, c2, c3):
    l1 = lin(x, w1)
    l2 = rbf(l1, c1)
    l3 = lin(l2, w2)
    l4 = rbf(l3, c2)
    l5 = lin(l4, w3)
    l6 = rbf(l5, c3)
    l7 = sigmoid(l6)
    return l7

The gradient descent function for learning the parameters is the same as before, with slight modifications:

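def gradient_descent(pred, target, features, w1, w2, w3, lr):
    n = len(features)
    grad = np.dot(features.T, pred - target)
    grad = grad.astype(float)  # astype returns a new array, so keep the result
    grad /= n
    grad *= lr
    # the same scaled gradient is applied to all three weight sets
    w1 -= grad
    w2 -= grad
    w3 -= grad
    return w1, w2, w3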

The train function needs a slight modification too:

# mse(pred, target) is the mean squared error; it is not defined in this answer
def train(features, target, w1, w2, w3, c1, c2, c3, lr, iters):
    cost_history = []

    for i in range(iters):
        pred = model(features, w1, w2, w3, c1, c2, c3)
        cost = mse(pred, target)
        cost_history.append(cost)

        w1, w2, w3 = gradient_descent(pred, target, features, w1, w2, w3, lr)

    return w1, w2, w3, cost_history
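The question itself asks for a linear combination of three radial basis functions, which can also be read as a single hidden layer: y(x) = sum_k w_k * exp(-||x - c_k||^2 / (2*sigma^2)). The following is a minimal sketch of that reading, not the stacked model used above. It assumes a quadratic error E = 1/(2N) * sum_n (y(x_n) - t_n)^2, a fixed shared width `SIGMA`, and a small hand-made training set; all names and numeric values are hypothetical. The iterative equations are w_k <- w_k - lr * dE/dw_k and c_k <- c_k - lr * dE/dc_k, with the gradients written out in the comments.

```python
import math

SIGMA = 0.5  # assumed fixed width shared by all three basis functions

def phi(x, c):
    # Gaussian radial basis function centred at c
    d2 = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
    return math.exp(-d2 / (2 * SIGMA ** 2))

def rbf_net(x, weights, centers):
    # y(x) = sum_k w_k * phi_k(x)
    return sum(w * phi(x, c) for w, c in zip(weights, centers))

def error(xs, ts, weights, centers):
    # E = 1/(2N) * sum_n (y(x_n) - t_n)^2
    total = sum((rbf_net(x, weights, centers) - t) ** 2 for x, t in zip(xs, ts))
    return total / (2 * len(xs))

def update(xs, ts, weights, centers, lr):
    n = len(xs)
    gw = [0.0] * len(weights)
    gc = [[0.0] * len(c) for c in centers]
    for x, t in zip(xs, ts):
        r = rbf_net(x, weights, centers) - t  # residual y(x_n) - t_n
        for k, (w, c) in enumerate(zip(weights, centers)):
            p = phi(x, c)
            # dE/dw_k = 1/N * sum_n r_n * phi_k(x_n)
            gw[k] += r * p / n
            # dE/dc_k = 1/N * sum_n r_n * w_k * phi_k(x_n) * (x_n - c_k) / sigma^2
            for j in range(len(c)):
                gc[k][j] += r * w * p * (x[j] - c[j]) / SIGMA ** 2 / n
    new_w = [w - lr * g for w, g in zip(weights, gw)]
    new_c = [[cj - lr * gj for cj, gj in zip(c, g)] for c, g in zip(centers, gc)]
    return new_w, new_c

# Hypothetical training set: three points labelled 0 and three labelled 1
xs = [[0.1, 0.2], [0.2, 0.8], [0.3, 0.5], [0.7, 0.3], [0.8, 0.9], [0.9, 0.5]]
ts = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]

weights = [0.0, 0.0, 0.0]
centers = [[0.2, 0.5], [0.5, 0.5], [0.8, 0.5]]
initial_error = error(xs, ts, weights, centers)
for _ in range(2000):
    weights, centers = update(xs, ts, weights, centers, lr=0.05)
final_error = error(xs, ts, weights, centers)
```

As a network diagram, this corresponds to the two inputs x1 and x2 feeding three hidden RBF units, whose outputs are summed with weights w1, w2, w3 into a single output node. Thresholding the output at 0.5 turns the regression output into a class label.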

