We have the cipher text (xs)0 £ s < N which is encrypted by the Vigenere method using the key (kr)0 £ r < K . Here xs Î A, A being the alphabet used. We assume that A = ZS, where S is the size of the alphabet. We do all arithmetic in this ZS.
We define the subsequences Xkj = (xkn+j)n = 0... of the ciphertext. For example X3,2 = (x2,x5,x8,...)
Each sequence has a probability distribution Hkj where for each x Î A, Hkj(x) is the probability of occurence of x in Xkj.
Each probabability distribution has its index of coincidence defined
as
|
If k ¹ K we expect that Xkj is more or less random. So IC(Hkj) must be close to the index of coincidence of random text on A, which is 1/|A|.
If k = K, then each ciphertext letter belonging to Xkj has been encrypted using the same key letter, so IC(Hkj) must be closer to the index of coincidence of the original text. We also know that the distribution of real human languages is not random and that the index of coincidence for these is considerably larger than 1/|A|.
So, if we put a reasonable upper bound for the key length, we can calculate all the IC(Hkj) up to this upper bound, and use the maximum value we get to determine the key length.
After we find the key length we concentrate on the subsequences Xkj with k = K, the key length.
These are encrypted using different letters of the key, so probably have different distributions.
The difference between two distributions is measured using the mutual
index of coincidence.
|
Now suppose that the original text is U = (us). We have
| ||||||||||
Let
|
That is XKi(t) is the encryption of XKi using the shift cipher with key t. If ki+t = kj then the sequences XKi(t) and XKi would be encrypted by the same shift value, namely ki+t = kj. Therefore they would have more or less the same distribution. In contrast, if ki+s ¹ kj, the distributions XKi(s) and XKi would not be the same.
The distribution of of XKi(t) can be calculated easily from the distribution of XKi: Every x in XKi corresponds to a x+t in XKi(t). So we have HKi(x) = HKi(t)(x+t) or HKi(t)(x) = HKi(x-t).
Now we can calculate the following for all possible values of t:
|
By these considerations, the maximum of these values must occur when we have ki+t = kj or kj-ki = t. These give us several relations between the keys letters that enables us to find the key up to the first letter. Then the actual key can be easily found by trial and error.