In: Computer Science
While working with the Huffman encoding of n letters with frequencies g1, g2, g3, ..., gn, what is the maximum possible length of a codeword? Please include a proof to justify your answer.
No Huffman codeword can be longer than alphabetsize - 1 bits. Proof: it is impossible to construct a binary tree with N leaves whose depth exceeds N - 1, since every interior node on a root-to-leaf path must split off at least one leaf of its own.
The maximum length of the code also depends on the number of samples you use to derive your statistics (counting the fake samples added to give each symbol a nonzero probability!): the frequencies that produce the most skewed tree grow like the Fibonacci sequence 1, 1, 2, 3, 5, 8, ..., so forcing a codeword of length L requires a total sample count that grows exponentially in L.
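As a sketch of this bound (assuming integer frequencies of at least 1; `min_total_for_length` is a name introduced here for illustration): summing the Fibonacci-like worst-case frequencies gives the smallest total count that can force a codeword of a given length.

```python
def min_total_for_length(L):
    """Smallest total frequency count (worst case: Fibonacci-like
    frequencies 1, 1, 2, 3, 5, ...) that can force a Huffman
    codeword of L bits."""
    freqs = [1, 1]                  # the two rarest symbols
    while len(freqs) < L + 1:       # L + 1 symbols yield depth L
        freqs.append(freqs[-1] + freqs[-2])
    return sum(freqs)

# The required sample count explodes as the target length grows:
for L in (4, 8, 16, 32):
    print(L, min_total_for_length(L))
```

Read the other way around, this means the maximum codeword length grows only logarithmically with the number of samples.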
Maximum Length of a Huffman Code
Apart from the ceil(log2(alphabetsize)) bound on the number of significant bits in this particular canonical Huffman code, it is useful to know the maximum length a Huffman code can reach. In fact there are two limits, and both must be satisfied.
A Huffman code is built by creating a binary tree with the symbols at its leaves. The length of a symbol's codeword is the length of the path from the root to its leaf (equivalently, the number of interior nodes on that path). So to get the longest possible codeword, we want the most unbalanced tree:
      o
     / \
    o   S
   / \
  o   S
 / \
...  S
If we have N symbols, the longest such path passes through N - 1 interior nodes, so the longest possible codeword is N - 1 bits.
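The N - 1 bound can be checked with a small sketch (assumptions: `huffman_code_lengths` is a name introduced here, and it tracks only codeword lengths, not the codewords themselves). Feeding in Fibonacci-like frequencies produces exactly the degenerate tree drawn above:

```python
import heapq
from itertools import count

def huffman_code_lengths(freqs):
    """Return the Huffman codeword length for each symbol in `freqs`."""
    tie = count()  # tie-breaker so the heap never compares the lists
    heap = [(f, next(tie), [i]) for i, f in enumerate(freqs)]
    heapq.heapify(heap)
    lengths = [0] * len(freqs)
    while len(heap) > 1:
        w1, _, s1 = heapq.heappop(heap)   # two lightest subtrees
        w2, _, s2 = heapq.heappop(heap)
        for i in s1 + s2:                 # merging adds one bit to
            lengths[i] += 1               # every symbol inside
        heapq.heappush(heap, (w1 + w2, next(tie), s1 + s2))
    return lengths

# Fibonacci-like frequencies give the most skewed tree:
# 5 symbols -> a longest codeword of 5 - 1 = 4 bits.
print(huffman_code_lengths([1, 1, 2, 3, 5]))  # -> [4, 4, 3, 2, 1]
```

With uniform frequencies the same function returns a balanced tree's lengths instead, which is why the ceil(log2(alphabetsize)) limit mentioned above is the other extreme.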