Question

In: Computer Science

Messages comprising seven different characters, A through G, are to be transmitted over a data link....

Messages comprising seven different characters, A through G, are to be transmitted over a data link. Analysis has shown that the related frequency of occurrence of each character is A 0.10, B 0.25, C 0.05, D 0.32, E 0.01, F 0.07, G 0.2
i) Derive the entropy of the messages
ii) Use static Huffman coding to derive a suitable set of codewords.
iii) Derive the average number of bits per codeword for your codeword set to transmit a message and compare this with both the fixed-length binary and ASCII codewords.

iv) State the prefix property of Huffman Coding and hence show that your codeword set derived in the exercise satisfied this.
v) Derive a flowchart for an algorithm to decode a received bit string encoded using your codeword set.
vi) Give an example of the decoding operation assuming the received bit string comprises a mix of seven characters

Solutions

Expert Solution

Note: I assume you know Huffman Encoding, if have any doubt or find difficulty in understanding anything, leave a comment.

So, we are given the following characters with their probability of occurrence as follows:

A 0.10

B 0.25

C 0.05

D 0.32

E 0.01

F 0.07

G 0.20

Ans i)

Entropy = where pi is the probability of the chracter

Character pi log(1/pi)
A 0.1 3.32
B 0.25 2
C 0.05 4.32
D 0.32 1.64
E 0.01 6.64
F 0.07 3.83
G 0.2 2.32

Entropy = (0.1 * 3.32) + (0.25 * 2) + (0.05 * 4.32) + (0.32 * 1.64) + (0.01 * 6.64) + (0.07 * 3.83) + (0.2 * 2.32)

=> 0.332 + 0.5 + 0.216 + 0.524 + 0.066 + 0.268 + 0.622

=> 2.525 bits

So, the entropy of the message is 2.525 bits

Ans ii)

Huffman Tree is shown below:

Codewords:

A -- 010

B -- 10

C -- 01101

D -- 11

E -- 01100

F -- 0111

G -- 00

Ans iii)

Average no. of bits per crowdword (Huffman Encoding)

Sum of (Bit taken by character * Given probability) / 1 (Total sum of probabilities)

=> ( ( 0.1 * 3 ) + ( 0.25 * 2 ) + ( 0.05 * 5 ) + ( 0.32 * 2 ) + ( 0.01 * 5 ) + ( 0.07 * 4 ) + ( 0.20 * 2 ) ) / (0.1 + 0.25 + 0.05 + 0.32 + 0.01 + 0.07 + 0.20)

=> 2.42 / 1 => 2.42

Average no. of bits per crowdword (Fixed-length binary)

We are given 7 characters, so minimum bits require to represent each character uniquely

2n > 7

gives n = 3

So, we require 3 bits to represent each character uniquely.

So, (0.1 + 0.25 + 0.05 + 0.32 + 0.01 + 0.07 + 0.20) * 3 = 3

Average no. of bits per crowdword (ASCII codewords)

One character is represented by 8 bits in ASCII codeword

So, average no. of bits = 8 * 1 =  8

Message length reduced by:

Compared with Fixed-length binary:

(3 - 2.42) / 3 = 0.19 = 19%

Compared with ASCII codewords:

(8 - 2.42) / 8 = 0.69 = 69%

Ans iv) Prefix Property of Huffman Coding: It states that no codeword is prefix to another codeword. or It states that when we form a message using above formed codewords, we must be able to uniquely identify each character. So, if we use Huffman coding, we'll always be able to uniquely identify each character, wherever it has been used in the string.

So, lets check this for our solution:

We can clearly see that no combination of the bits is prefix to any codeword.

Let, we start with the most significant bit and let it be 0, now if second most significant bit is 0 means, G is encountered, and it second most significant bit is 1 it can be either from A, C, E, or F.

Let's move to the third most significant bit, if it is 0 means A is encountered, else in case of 1, we can have C, E or F. So like this we can tally for all the characters and we will always find that no codeword is prefix to any other codeword.

Let take and example:  0000011000001001101

We can break it into  00 00 01100 00 010 01101

i.e G G C G A C = GGCGAC

Ans v) Refer to the following flowchart.

If we find a character, we stop and start with the c0, in search of new character.

  

I haven't used that box representation, as it could made difficult in understanding due to various lines.

Ans vi)

Eg.   "011001100011011001011"

Using the above algorithm/ flowchart break it into the characters.

Let's go for first character

c0 = 0 (0)

c1 = 1 (01)

c2 = 1 (011)

c3 = 0 (0110)

c4 = 0 (01100) (E found)

Check for all the characters in the same manner.

So input will break in 011001100011011001011

01100 11 00 01101 10 010 11

E D   G C B A D

String is EDGCBAD


Related Solutions

A stream of bits, 01111011111011111101111100, needs to be transmitted at the data link layer using bit...
A stream of bits, 01111011111011111101111100, needs to be transmitted at the data link layer using bit stuffing, what is actually transmitted after the bit stuffing? 2) We are transmitting 16-bit data using a Hamming code. What is the minimum number of check bits is needed to ensure that the receiver can correct a single-bit error?
Suppose that x bits of user data are to be transmitted over a k-hop path in...
Suppose that x bits of user data are to be transmitted over a k-hop path in a packetswitched network as a series of packets, each containing p data bits and h header bits, with x >> p + h. The bit rate of the lines is b bps and the propagation delay is negligible. What value of p minimizes the total delay? Assume that the header size (h bits) is NOT included in the data size (p bits)
A company wants to transmit data over the telephone, but is concerned that its phones could be tapped. All of the data are transmitted as four-digit integers.
  A company wants to transmit data over the telephone, but is concerned that its phones could be tapped. All of the data are transmitted as four-digit integers. The company has asked you to write a program that encrypts and decrypts the data so that it can be transmitted more securely. Your program should read a four-digit integer and encrypt it as follows: Replace each digit by (the sum of that digit plus 7) modulus 10. Then, swap the first...
Over the past seven weeks, we have explored different decades of US history and applied principles...
Over the past seven weeks, we have explored different decades of US history and applied principles of macroeconomics to their outcomes. Take this time to share what you have learned with your classmates. In your initial post, respond to the following: Choose one macroeconomic concept you applied in your final project of the years 2000-2010. Explain how it helps describe the economic outcomes of the decade you researched during 2000-2010.
Over the past seven weeks, we have explored different decades of US history and applied principles...
Over the past seven weeks, we have explored different decades of US history and applied principles of macroeconomics to their outcomes. Take this time to share what you have learned with your classmates. In your initial post, respond to the following: Choose one macroeconomic concept you applied in your final project of the years 2000-2010. Explain how it helps describe the economic outcomes of the decade you researched during 2000-2010.
Provide data showing the trend of interest rate change in South Africa over the past seven...
Provide data showing the trend of interest rate change in South Africa over the past seven years. Analyse whether the economic theories attempting to explain and predict the effects of changes in this variable on the macroeconomy are consistent with the facts
A survey collected data on annual credit card charges in seven different categories of expenditures:
  A survey collected data on annual credit card charges in seven different categories of expenditures: transportation, groceries, dining out, household expenses, home furnishings, apparel, and entertainment. Using data from a sample of 42 credit card accounts, assume that each account was used to identify the annual credit card charges for groceries (population 1) and the annual credit card charges for dining out (population 2). Using the difference data, with population 1 − population 2, the sample mean difference was...
G-2 Inc. expects the following dividend pattern over the next seven​ years: Year 1- $1.50 Year...
G-2 Inc. expects the following dividend pattern over the next seven​ years: Year 1- $1.50 Year 2 - $1.56 Year 3 - $1.62 Year 4 - $1.68 Year 5- $1.75 Year 6 - $1.82 Year 7 ​- $1.90 The company will then have a constant dividend of ​$ 1.97 forever. What is the price of this stock today​ (year 0) if an investor wants to earn 16​% rate of​ return? The stock price is ​$_____   ​(Round to two decimal​ places.)
Problem 5-09 The table below shows data on the returns over five 1-year periods for seven...
Problem 5-09 The table below shows data on the returns over five 1-year periods for seven mutual funds. A firm's portfolio managers will assume that one of these scenarios will accurately reflect the investing climate over the next 12 months. The probabilities of each of the scenarios occurring are 0.1, 0.3, 0.1, 0.1, and 0.4 for years 1 to 5, respectively. RETURNS OVER FIVE 1-YEAR PERIODS FOR SEVEN MUTUAL FUNDS Planning Scenarios for Next 12 Months Mutual Funds Year 1...
Bank of America's Consumer Spending Survey collected data on annual credit card charges in seven different...
Bank of America's Consumer Spending Survey collected data on annual credit card charges in seven different categories of expenditures: transportation, groceries, dining out, household expenses, home furnishings, apparel, and entertainment (U.S. Airways Attache, December 2003). Using data from a sample of 42 credit card accounts, assume that each account was used to identify the annual credit card charges for groceries (population 1) and the annual credit card charges for dining out (population 2). Using the difference data, the sample mean...
ADVERTISEMENT
ADVERTISEMENT
ADVERTISEMENT