In: Advanced Math
Put a metric ρ on all the words in a dictionary by defining the distance between two distinct words to be 2^−n if the words agree for the first n letters and are different at the (n+1)st letter. A space is distinct from a letter. E.g., ρ(car,cart)=2^−3 and ρ(car,call)=2^−2.
a) Verify that this is a metric.
b) Suppose that words w1, w2 and w3 are listed in alphabetical
order.
Find a formula for ρ(w1,w3) in terms of ρ(w1,w2) and ρ(w2,w3).
Given: The metric defined on the set of all the words as:
a) To verify that this is a metric we need to check for the four properties of a metric space:
1. Non-negativity: As the metric outputs either 0 or 2-n, both of which are not negative for n>=0, thus non-negativity holds.
2. Coincidence: Let the distance between two words, say x and y, be zero,i.e., . Now,as 2-n>0 for all n=0,1,2,... Hence the distance can only be zero when x=y, as in that case the second definition of the metric function will hold, and the distance will be zero.
Conversely, let x=y, then by definition, . Thus the given metric satisfies .
Hence, coincidence holds.
3. Symmetry: Here we consider two cases. First, if x=y, then clearly Secondly, if they are unequal, then we have:
, where n is the integer such that x and y agree for first n letters and differ at (n+1)st letter.
But this n is not affected if we compare x with y, or y with x, as the number of characters to which they are the same will remain unaffected upon interchanging the two words. Thus will also be equal to 2-n.
Hence, in either case, . Thus, coincidence holds.
4. Triangle inequality: If x,y and z be any three words, then we need to show that:
.
If at least two of the three words are equal, then (1) is trivially true. For example, say x=z, then. The other cases when one of the words is equal to the other can be dealt with similarly.
So, now we need to show that (1) holds when x,y, and z are all distinct. As x,y and z are all distinct, we may assume that x and y can be ordered alphabetically as x,y; now, z has three possible positions to occupy: (z,x,y), (x,z,y) and (x,y,z). Let the n as given in the definition of the metric be as:
n1 - For the words x and y
n2 - For the words y and z
n3 - For the words z and x
First case: Let the alphabetical ordering be (z,x,y). Since z lies before both x and y, hence we have, because of the definition of the metric and the ordering being (z,x,y), the words z and x will be the closest to each other, then the words x and y, and then z and y. Thus, . Hence we need to show that
From the fact that 2-n2 >0 implies that hence we have . This is exactly what was needed to be shown for this part.
Second case: Let the alphabetical ordering be (x,z,y). Since z lies between x and y, hence we have, because of the definition of the metric and the ordering being (x,z,y), the words z and x will be the closest to each other, then the words z and y, and then x and y. Thus, . Hence we need to show that
From the fact that 2-n2 >0 implies that hence we have . This is exactly what was needed to be shown for this part.
Third case: Let the alphabetical ordering be (x,y,z). Then, z lies beyond both x and y, hence if x and y have first k1 letters in common, and y and z have first k2 letters in common, and k be the minimum of k1 and k2, then, x and z will have first k letters in common. Hence, the distance(which is a decreasing function of the number of letters the two words have in common), will be equal to the maximum of the distance between (x,y) and (y,z). Thus we have:.
Using this inequality we have:
Hence, triangle inequality holds in every case.
B) By a similar logic as the third case of the triangle inequality above, we see that w3 lies beyond both w1 and w2, hence if w1 and w2 have first k1 letters in common, and w2 and w3 have first k2 letters in common, and k be the minimum of k1 and k2, then, w1 and w3 will have first k letters in common. Hence, the distance(which is a decreasing function of the number of letters the two words have in common), will be equal to the maximum of the distance between (w1,w2) and (w2,w3). Thus we have: