previous 6 Modern Times up V'ir Tbg n Frperg    This document as PDF next 8 The Hill Cipher

7 Twins and Triplets

Of course, one could increase the security of an an affine cipher using a different one on each letter, as the Vigenère cipher discussed in section 4 does. But this is extra effort with no payoff: such a cipher can be cracked by exactly the same methods, and it is more work to encipher.

However, because an affine cipher sends adjacent letters far apart, one way to improve the cipher is to work on multiple letters at a time, either in pairs, triplets, or larger groupings. Instead of assigning numbers to individual letters, we think of our message as made up of ``super-characters'', where each individual letter gives us one ``digit'' of the group.

For example, if we wanted our super-characters to be made up of pairs of two letters (called a digraph), we would think of the plaintext double as being made up of the three pairs of digraphs do, ub, and le. How many pairs of such digraphs do we have? Since we have 26 choices for the first letter and 26 for the second, there are 26 x 26 = 676 such pairs. The first one is aa and gets assigned the number 0, ab gets 1, ac has the code 2, and so on, up to zz which gets the character code 675. If the plaintext has an odd number of characters, we add an additional character onto the end (usually x, q, or z).

With so many digraphs, we certainly don't want to have to use a table to look up the numeric equivalences. But notice that we can readily figure out which letter pair gets which number by paying attention to place value. Remember that when we write the decimal number for six hundred seventy five, we write

675 =  6 x 100 +  7 x 10 +  5.

Here, we don't have ten ``digits'', we have twenty-six.11So to determine the code for do, we multiply the number for the single letter d by 26, and add the number for o. That is, it has the numeric code 3 x 26 +  14 = 92. Similarly, ub corresponds to 20 x 26 +  1 = 521, and le gives 290. To go from the number to its letter equivalent, we divide by 26; the dividend gives the first character, and the remainder gives the second. For example, to figure out the letters which correspond to 524, we divide 524 by 26. The result is 20, with a remainder of 4, so 524 corresponds to ue.


Let's encrypt the word blue using the same affine cipher as in section 6 (that is, with m=3 and b=20), but working with digraphs.

plaintext bl ue
x 1 x 26 + 11=37 20 x 26 +  4 = 524
3x+20 131 $1592 = 240 (\mbox{mod} 676)$
  131 = 5 x 26 + 1 240 = 9 x 26 +  6
ciphertext FB JG

Note that we can take m to be any number between 1 and 675 (as long as it is relatively prime to 26), and we can take b as anything between 0 and 675.


To break such a code, we can apply frequency analysis, but this time we must look at frequency of pairs of letters. This is still doable, but more challenging.


Notice that we can just as well use triplets of letters (trigraphs), or blocks of any length we choose. For example, if we were using 4-tuples, the word blue would be a single ``character'' and would have the numeric code

1 x 263 +  11 x 262 +  20 x 26 +  4  =  25536.

However, such numbers quickly become unwieldy unless one is using a computer.



Footnotes

... twenty-six.11
I suppose we could call these hexicosimal numbers (from the Greek for 26), but that's too weird, so we just say we are working in base 26.
previous 6 Modern Times up V'ir Tbg n Frperg    This document as PDF next 8 The Hill Cipher
Scott Sutherland
2005-10-26