Question

Gaussian Mixture Model:

The initial means and variances of the two clusters in a GMM are as follows: $\mu^{(1)} = -3$, $\mu^{(2)} = 2$, $\sigma_1^2 = \sigma_2^2 = 4$. Let the mixing weights be $\pi_1 = \pi_2 = 0.5$.

Let $x^{(1)} = 0.2$, $x^{(2)} = -0.9$, $x^{(3)} = -1$, $x^{(4)} = 1.2$, $x^{(5)} = 1.8$ be the five points to be clustered.

Find the posterior probability p(1|i) that point $x^{(i)}$ belongs to cluster 1, for each of the five points:

1) p(1|1)

2) p(1|2)

3) p(1|3)

4) p(1|4)

5) p(1|5)

Solutions

Expert Solution

Initial derivations

We are now going to introduce some additional notation. Just a word of warning: math is coming! Don’t worry. I’ll try to keep the notation as clean as possible so the derivations are easier to follow.

First, let’s suppose we want to know the probability that a data point $x_n$ comes from Gaussian $k$. We can express this as:

$$p(z_k = 1 \mid x_n)$$

which reads “given a data point $x_n$, what is the probability it came from Gaussian $k$?” In this case, $z_k$ is a latent variable that takes only two possible values. It is one when $x_n$ came from Gaussian $k$, and zero otherwise. We never get to see this variable in reality, but knowing its probability of occurrence will help us determine the Gaussian mixture parameters, as we discuss later.

Likewise, we can state the following:

$$\pi_k = p(z_k = 1)$$

This means that the overall probability of observing a point that comes from Gaussian $k$ is equal to the mixing coefficient $\pi_k$ for that Gaussian. This makes sense, because the bigger the Gaussian is, the higher we would expect this probability to be. Now let $\mathbf{z}$ be the set of all the latent variables $z_k$, hence:

$$\mathbf{z} = \{z_1, \dots, z_K\}$$

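We know beforehand that exactly one $z_k$ takes the value one, namely the one whose $k$ is the cluster the point comes from, and that all the others are zero. Since $p(z_k = 1)$ is the mixing coefficient $\pi_k$, we can therefore write:

$$p(\mathbf{z}) = \prod_{k=1}^{K} p(z_k = 1)^{z_k} = \prod_{k=1}^{K} \pi_k^{z_k}$$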

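Now, what about the probability of observing our data given that it came from Gaussian $k$? It turns out that this is the Gaussian density itself, with mean $\mu_k$ and variance $\sigma_k^2$! Following the same logic we used to define $p(\mathbf{z})$, we can state:

$$p(x_n \mid \mathbf{z}) = \prod_{k=1}^{K} \mathcal{N}(x_n \mid \mu_k, \sigma_k^2)^{z_k}$$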

Ok, now you may be asking, why are we doing all this? Remember that our initial aim was to determine the probability of $\mathbf{z}$ given our observation $x_n$. Well, it turns out that the equations we have just derived, along with Bayes’ rule, will help us determine this probability. From the product rule of probabilities, we know that

$$p(x_n, \mathbf{z}) = p(x_n \mid \mathbf{z})\, p(\mathbf{z})$$

Hmm, it seems that now we are getting somewhere. The factors on the right are what we have just found. Perhaps some of you are anticipating that we are going to use Bayes’ rule to get the probability we eventually need. However, first we will need $p(x_n)$, not $p(x_n, \mathbf{z})$. So how do we get rid of $\mathbf{z}$ here? Yes, you guessed it right. Marginalization! We just need to sum over $\mathbf{z}$, hence

$$p(x_n) = \sum_{\mathbf{z}} p(x_n \mid \mathbf{z})\, p(\mathbf{z}) = \sum_{k=1}^{K} \pi_k\, \mathcal{N}(x_n \mid \mu_k, \sigma_k^2)$$
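As a small illustration of the marginalization step, the mixture density p(x) = Σ_k π_k N(x | μ_k, σ_k²) can be evaluated directly for the one-dimensional parameters given in the question. This is only a sketch: the function names `gaussian_pdf` and `mixture_pdf` are made up for this example, and the symbol π for the mixing weights is an assumption about the question's notation.

```python
import math

def gaussian_pdf(x, mean, variance):
    """Density of a univariate normal N(x | mean, variance)."""
    normaliser = math.sqrt(2.0 * math.pi * variance)
    return math.exp(-(x - mean) ** 2 / (2.0 * variance)) / normaliser

def mixture_pdf(x, weights, means, variances):
    """Marginal density p(x) = sum_k pi_k * N(x | mu_k, sigma_k^2)."""
    return sum(
        w * gaussian_pdf(x, m, v)
        for w, m, v in zip(weights, means, variances)
    )

# Initial parameters taken from the question (two clusters, one dimension).
weights = [0.5, 0.5]
means = [-3.0, 2.0]
variances = [4.0, 4.0]

# Mixture density at the first data point, x = 0.2.
density_at_first_point = mixture_pdf(0.2, weights, means, variances)
print(density_at_first_point)
```

Each term of the sum is one cluster's weighted contribution to the density, which is exactly the numerator that appears later when Bayes' rule is applied.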

This is the equation that defines a Gaussian mixture, and you can clearly see that it depends on all the parameters we mentioned previously! To determine the optimal values for these, we need to maximize the likelihood of the model. Treating the observations as independent, the likelihood is the joint probability of all $N$ observations $x_n$, defined by:

$$p(X) = \prod_{n=1}^{N} p(x_n) = \prod_{n=1}^{N} \sum_{k=1}^{K} \pi_k\, \mathcal{N}(x_n \mid \mu_k, \sigma_k^2)$$

Like we did for the original Gaussian density function, let’s apply the log to each side of the equation:

$$\ln p(X) = \sum_{n=1}^{N} \ln \sum_{k=1}^{K} \pi_k\, \mathcal{N}(x_n \mid \mu_k, \sigma_k^2)$$
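To make the log-likelihood concrete, here is a minimal sketch that evaluates ln p(X) = Σ_n ln Σ_k π_k N(x_n | μ_k, σ_k²) for the five points and initial parameters from the question. The helper names are hypothetical, and the log-sum-exp trick used for the inner sum is a numerical-stability choice of this example, not something the derivation above prescribes.

```python
import math

def log_gaussian_pdf(x, mean, variance):
    """Log-density of a univariate normal N(x | mean, variance)."""
    return (-0.5 * math.log(2.0 * math.pi * variance)
            - (x - mean) ** 2 / (2.0 * variance))

def log_likelihood(points, weights, means, variances):
    """ln p(X) = sum_n ln sum_k pi_k * N(x_n | mu_k, sigma_k^2)."""
    total = 0.0
    for x in points:
        # log(pi_k) + log N(x | mu_k, sigma_k^2) for each component k
        log_terms = [
            math.log(w) + log_gaussian_pdf(x, m, v)
            for w, m, v in zip(weights, means, variances)
        ]
        # log-sum-exp: factor out the largest term before exponentiating
        peak = max(log_terms)
        total += peak + math.log(sum(math.exp(t - peak) for t in log_terms))
    return total

# Data and initial parameters taken from the question.
points = [0.2, -0.9, -1.0, 1.2, 1.8]
weights = [0.5, 0.5]
means = [-3.0, 2.0]
variances = [4.0, 4.0]

initial_log_likelihood = log_likelihood(points, weights, means, variances)
print(initial_log_likelihood)
```

The inner `sum` sits inside the `math.log` call, mirroring the logarithm of a sum in the formula.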

Great! Now in order to find the optimal parameters for the Gaussian mixture, all we have to do is differentiate this equation with respect to the parameters and we are done, right? Wait! Not so fast. We have an issue here. We can see that the logarithm is applied to the inner summation over $k$, so it cannot be moved inside to act on each Gaussian separately. Calculating the derivative of this expression and then solving for the parameters is going to be very hard!

What can we do? Well, we need to use an iterative method to estimate the parameters. But first, remember we were supposed to find the probability of z given x? Well, let’s do that since at this point we already have everything in place to define what this probability will look like.

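From Bayes’ rule, we know that

$$p(z_k = 1 \mid x_n) = \frac{p(x_n \mid z_k = 1)\, p(z_k = 1)}{\sum_{j=1}^{K} p(x_n \mid z_j = 1)\, p(z_j = 1)}$$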

From our earlier derivations we learned that:

$$p(z_k = 1) = \pi_k, \qquad p(x_n \mid z_k = 1) = \mathcal{N}(x_n \mid \mu_k, \sigma_k^2)$$

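So let’s now replace these in the previous equation:

$$p(z_k = 1 \mid x_n) = \frac{\pi_k\, \mathcal{N}(x_n \mid \mu_k, \sigma_k^2)}{\sum_{j=1}^{K} \pi_j\, \mathcal{N}(x_n \mid \mu_j, \sigma_j^2)}$$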

And this is what we’ve been looking for! Moving forward we are going to see this expression a lot. Next we will continue our discussion with a method that will help us easily determine the parameters for the Gaussian mixture.
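Applied to the question, the posterior p(1|i) is this expression with k = 1: the weighted density of cluster 1 at the point, divided by the sum of the weighted densities of both clusters. The sketch below evaluates it for the five points. It assumes p(1|i) denotes the probability that point i belongs to cluster 1 and that the question's garbled symbols are means, variances, and mixing weights; the function names are illustrative.

```python
import math

def gaussian_pdf(x, mean, variance):
    """Density of a univariate normal N(x | mean, variance)."""
    normaliser = math.sqrt(2.0 * math.pi * variance)
    return math.exp(-(x - mean) ** 2 / (2.0 * variance)) / normaliser

def responsibilities(x, weights, means, variances):
    """Posterior p(z_k = 1 | x) for every component k, via Bayes' rule."""
    # Numerators: pi_k * N(x | mu_k, sigma_k^2)
    joint = [
        w * gaussian_pdf(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    # Denominator: the mixture density p(x)
    evidence = sum(joint)
    return [j / evidence for j in joint]

# Initial parameters and data points taken from the question.
weights = [0.5, 0.5]
means = [-3.0, 2.0]
variances = [4.0, 4.0]
points = [0.2, -0.9, -1.0, 1.2, 1.8]

# Index 0 of each result is the posterior for cluster 1.
posterior_cluster_1 = [
    responsibilities(x, weights, means, variances)[0] for x in points
]
for i, p in enumerate(posterior_cluster_1, start=1):
    print(f"p(1|{i}) = {p:.4f}")
```

Under these assumptions the five posteriors come out to approximately 0.2942, 0.6225, 0.6514, 0.1067, and 0.0534. Because both clusters share the same variance and weight, the expression simplifies to 1 / (1 + exp((10x + 5) / 8)), which gives a quick way to check the values by hand.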

