Question

In: Advanced Math

Put a metric ρ on all the words in a dictionary by defining the distance between...

Put a metric ρ on all the words in a dictionary by defining the distance between two distinct words to be 2^−n if the words agree for the first n letters and are different at the (n+1)st letter. A space is distinct from a letter. E.g., ρ(car,cart)=2^−3 and ρ(car,call)=2^−2.

a) Verify that this is a metric.
b) Suppose that words w1, w2 and w3 are listed in alphabetical order.
Find a formula for ρ(w1,w3) in terms of ρ(w1,w2) and ρ(w2,w3).

Solutions

Expert Solution

Given: The metric defined on the set of all the words as:

a) To verify that this is a metric we need to check for the four properties of a metric space:

1. Non-negativity: As the metric outputs either 0 or 2-n, both of which are not negative for n>=0, thus non-negativity holds.

2. Coincidence: Let the distance between two words, say x and y, be zero,i.e., . Now,as 2-n>0 for all n=0,1,2,... Hence the distance can only be zero when x=y, as in that case the second definition of the metric function will hold, and the distance will be zero.

Conversely, let x=y, then by definition, . Thus the given metric satisfies .

Hence, coincidence holds.

3. Symmetry: Here we consider two cases. First, if x=y, then clearly Secondly, if they are unequal, then we have:

, where n is the integer such that x and y agree for first n letters and differ at (n+1)st letter.

But this n is not affected if we compare x with y, or y with x, as the number of characters to which they are the same will remain unaffected upon interchanging the two words. Thus will also be equal to 2-n.

Hence, in either case, . Thus, coincidence holds.

4. Triangle inequality: If x,y and z be any three words, then we need to show that:

.

If at least two of the three words are equal, then (1) is trivially true. For example, say x=z, then. The other cases when one of the words is equal to the other can be dealt with similarly.

So, now we need to show that (1) holds when x,y, and z are all distinct. As x,y and z are all distinct, we may assume that x and y can be ordered alphabetically as x,y; now, z has three possible positions to occupy: (z,x,y), (x,z,y) and (x,y,z). Let the n as given in the definition of the metric be as:

n1 - For the words x and y

n2 - For the words y and z

n3 - For the words z and x

First case: Let the alphabetical ordering be (z,x,y). Since z lies before both x and y, hence we have, because of the definition of the metric and the ordering being (z,x,y), the words z and x will be the closest to each other, then the words x and y, and then z and y. Thus, . Hence we need to show that

From the fact that 2-n2 >0 implies that hence we have . This is exactly what was needed to be shown for this part.

Second case: Let the alphabetical ordering be (x,z,y). Since z lies between x and y, hence we have, because of the definition of the metric and the ordering being (x,z,y), the words z and x will be the closest to each other, then the words z and y, and then x and y. Thus, . Hence we need to show that

From the fact that 2-n2 >0 implies that hence we have . This is exactly what was needed to be shown for this part.

Third case: Let the alphabetical ordering be (x,y,z). Then, z lies beyond both x and y, hence if x and y have first k1 letters in common, and y and z have first k2 letters in common, and k be the minimum of k1 and k2, then, x and z will have first k letters in common. Hence, the distance(which is a decreasing function of the number of letters the two words have in common), will be equal to the maximum of the distance between (x,y) and (y,z). Thus we have:.

Using this inequality we have:

Hence, triangle inequality holds in every case.

B) By a similar logic as the third case of the triangle inequality above, we see that w3 lies beyond both w1 and w2, hence if w1 and w2 have first k1 letters in common, and w2 and w3 have first k2 letters in common, and k be the minimum of k1 and k2, then, w1 and w3 will have first k letters in common. Hence, the distance(which is a decreasing function of the number of letters the two words have in common), will be equal to the maximum of the distance between (w1,w2) and (w2,w3). Thus we have:


Related Solutions

Read from a file that contains a paragraph of words. Put all the words in an...
Read from a file that contains a paragraph of words. Put all the words in an array, put the valid words (words that have only letters) in a second array, and put the invalid words in a third array. Sort the array of valid words using Selection Sort. Create a GUI to display the arrays using a GridLayout with one row and three columns. The input file Each line of the input file will contain a sentence with words separated...
QUESTION : Read from a file that contains a paragraph of words. Put all the words...
QUESTION : Read from a file that contains a paragraph of words. Put all the words in an array, put the valid words (words that have only letters) in a second array, and put the invalid words in a third array. Sort the array of valid words using Selection Sort. Create a GUI to display the arrays using a GridLayout with one row and three columns. The input file Each line of the input file will contain a sentence with...
here i have a dictionary with the words and i have to find the words with...
here i have a dictionary with the words and i have to find the words with the largest size, (i mean word count for eg abdominohysterectomy) which have anagrams , like for example ,loop -> pool. Also , i have to sort them in alphabetical way so what could be the algorithm for that public class Anagrams {    private static final int MAX_WORD_LENGTH = 25;    private static IArray<IVector<String>> theDictionaries =        new Array<>(MAX_WORD_LENGTH);    public static void...
1) How, if at all, is the geographic distance between countries related to the volume of...
1) How, if at all, is the geographic distance between countries related to the volume of trade between them? Is there a sense in which the world is becoming increasingly “flat,” as Thomas Friedman claims? Is there a sense in which it isn’t? 2) Why are some sectors more efficient when they operate on a large scale? In what sense might this phenomenon have made things for more difficult for sub-Saharan African countries that wish to reap the gains from...
State a data structure that is suitable for storing (i) the words in a dictionary to...
State a data structure that is suitable for storing (i) the words in a dictionary to facilitate searching (ii) the set of folders you have in your computer Explain your choices.
. Differentiate between metric and non-metric scale of measurement by giving suitable examples. Give examples of...
. Differentiate between metric and non-metric scale of measurement by giving suitable examples. Give examples of statistical tools where the two types of data (metric and non-metric) are used.
Focal film distance is a. distance between object and central ray. b. the distance from the...
Focal film distance is a. distance between object and central ray. b. the distance from the x-ray tube to the object. c. the distance from the x-ray tube to the image receptor. d. the most reliable method for determining mAs and kVp.
Write the function letter_frequencies(text) that returns a dictionary containing all of the letter frequencies of all...
Write the function letter_frequencies(text) that returns a dictionary containing all of the letter frequencies of all letters occurring in the parameter text. For example: if __name__ == '__main__': d = letter_frequencies('hello, world') print(d) # show the contents of this dictionary will produce the following output: {'a': 0.0, 'b': 0.0, 'c': 0.0, 'd': 0.1, 'e': 0.1, 'f': 0.0, 'g': 0.0, 'h': 0.1, 'i': 0.0, 'j': 0.0, 'k': 0.0, 'l': 0.3, 'm': 0.0, 'n': 0.0, 'o': 0.2, 'p': 0.0, 'q': 0.0,'r': 0.1,...
A simple random sample of 20 pages from a dictionary is obtained. The numbers of words...
A simple random sample of 20 pages from a dictionary is obtained. The numbers of words defined on those pages are found, with the results n=20, x=52.8 words, s=16.6 words. Given that this dictionary has 1439 pages with defined words, the claim that there are more than 70,000 defined words is equivalent to the claim that the mean number of words per page is greater than 48.6 words. Use a 0.01 significance level to test the claim that the mean...
A simple random sample of 10 pages from a dictionary is obtained. The numbers of words...
A simple random sample of 10 pages from a dictionary is obtained. The numbers of words defined on those pages are​ found, with the results n=10​, x overbarx=66.1 ​words, s=16.3 words. Given that this dictionary has 1454 pages with defined​ words, the claim that there are more than​ 70,000 defined words is equivalent to the claim that the mean number of words per page is greater than 48.1 words. Use a 0.10 significance level to test the claim that the...
ADVERTISEMENT
ADVERTISEMENT
ADVERTISEMENT