Question

Consider a simple case where we have trained a NN to recognise patterns X1 to Xn belonging to some classes, say A and B. Suppose we have a new training sample Xn+1 that belongs to class C, and we want the network to learn it as well. Will the previous training be impacted? How would it work?

Solutions

Expert Solution

The existing classes are A and B. When you have a new training instance Xn+1 that belongs to class C, it is possible to train on this instance as well without affecting the previously trained model, provided the existing weights are left unchanged.

Your pre-trained network has a final layer that handles recognition of the 2 original classes. The easiest workable trick for introducing the new class is to take all the layers before the last as given and add an additional output layer (either in a new model or as a parallel head next to the existing one).

Without access to the original training data, you have two options:

  1. Freeze all the weights in the original layers and let the "new" model optimize only the new weights. That gives you exactly the same predictive power for the original 2 classes and might give OK performance for the new one.
  2. Train the whole network at once (by propagating the error of the new class). This might work for the new class, but you will end up with an ineffective solution for the original 2 classes, since the weights in the lower layers will change and the original final layer won't be updated to match those changes.
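
Option 1 can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not a definitive implementation: the "pre-trained" weights `W_feat` and `W_old` are random stand-ins, the single ReLU layer stands in for all lower layers, and the names `w_new`, `b_new` and `X_c` are hypothetical. Only the new class-C output unit is updated, so the A/B outputs stay bit-for-bit identical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for a pre-trained network (random here, purely for illustration).
W_feat = rng.normal(size=(4, 8))   # frozen lower layers: 4 inputs -> 8 features
W_old = rng.normal(size=(8, 2))    # frozen output layer for classes A and B

# New, trainable parameters: one extra output unit for class C.
w_new = np.zeros(8)
b_new = 0.0

def features(x):
    """Frozen lower layers (a single ReLU layer for brevity)."""
    return np.maximum(x @ W_feat, 0.0)

def logits(x):
    """Frozen A/B logits with the new class-C logit appended as a third column."""
    h = features(x)
    return np.column_stack([h @ W_old, h @ w_new + b_new])

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def loss_on_c(x):
    """Mean cross-entropy when every sample in x has label C (column 2)."""
    return float(-np.log(softmax(logits(x))[:, 2]).mean())

# Hypothetical new training data: all samples belong to class C.
X_c = rng.normal(size=(20, 4))

before = logits(X_c)[:, :2].copy()   # A/B logits before training
loss_before = loss_on_c(X_c)

lr = 0.01
for _ in range(200):
    h = features(X_c)
    p_c = softmax(logits(X_c))[:, 2]
    grad_z = p_c - 1.0               # d(loss)/d(class-C logit) for label C
    # Only the new unit is updated; W_feat and W_old are never touched.
    w_new -= lr * (h.T @ grad_z) / len(X_c)
    b_new -= lr * grad_z.mean()

after = logits(X_c)[:, :2]           # A/B logits after training
loss_after = loss_on_c(X_c)

print("A/B logits unchanged:", np.array_equal(before, after))
print("class-C loss decreased:", loss_after < loss_before)
```

Because the frozen weights are excluded from the update, the original two outputs cannot drift. Option 2 would correspond to also updating `W_feat`, which changes the features that `W_old` was fitted to. Note that training on class-C samples alone only teaches the new unit to fire; in practice some A and B examples would usually be needed for the new unit to learn when not to fire.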
