The Mathematics of Communication

## 5. Exercises

1. How many bits are needed to encode 4000 equally likely symbols? How about 5000?
2. Suppose an alphabet has 4 symbols: bear, raven, eagle, whale, occurring with probabilities p(bear) = 1/2, p(raven) = 1/4, p(eagle) = p(whale) = 1/8. What is the entropy of this set of probabilities? Devise an optimal binary code for this alphabet. Check that this is more efficient than using 00, 01, 10, 11 for the four symbols.
3. Suppose that in the bear, raven, eagle, whale language the probabilities of digraphs (2-symbol combinations) occur as follows:

 first\second bear raven eagle whale bear 7/16 1/16 0 0 raven 0 1/8 1/8 0 eagle 0 1/16 0 1/16 whale 1/16 0 0 1/16
Explain why BREW can occur, but BERW cannot (B = bear, etc). What is the entropy of this set of probabilities in bits/digraph? Explain why an optimal coding by digraphs can be more efficient than an optimal coding by symbols. (This is an example of data compression. For statistics on digraphs in English text, see Soukoreff and MacKenzie).

@ Copyright 2000, American Mathematical Society.