This was a challenging homework, but I hope you enjoyed it. The average grade was 85. Below are several best solutions. Few more excellent solutions to this homework will appear in a short while. Again, some people submitted excellent works by e-mail or in writing, so they can't be shown here.
1.Vigenere
It was straightforward,
exactly as seen in the class. This text contains many three and four
letter group repetitions,
mainly because the keyword is 'MAINTAIN' which is almost
4-letter periodic. Kasiski
method suggests a period of 8 or 4. The rest can be done one of the
following methods by:
a. mutual IC technique, checking
relative shifts of the columns.
b. guessing the most frequent
trigrams and filling the rest (ING, THE).
c. guessing the 8-letter
password in a hope that it is a word from the dictionary.
d. curve fitting by shifting
the frequency histogram (see solution
by Gabriel Wainmann(MSWORD)).
e. hill climbing with log-normal
trigrams.
All methods were used. Many people solved it by hand, using approach 'b'.
2. Grilles
Anagramming attacks or probable
word attacks as in handout from H.Gaines book.
Solution took from 1-10 hours
by hand. In most of the cases 'ELEPHANT' was easily located
and the 180 degree rotation
was giving another 8-letter portion of the plaintext.
For hand solution preparing
four work sheets with the cryptogram (for each rotation) helps a lot.
Also marking concentric squares
in the grille helps to locate the holes.
Several people solved it with
hill
climbing programs.
a. Here is the program
written by Omer Golz and hand
solution assistant (EXE, in Delphi).
b. Here is the hill
climber of Roman Dovgard.
c. Here is the hill
climber of Eran Ofek and Boaz Barak (single compressed
TAR archive).
3. ADFGX
This was THE problem. An important
substitution-transposition product cipher, predecessor of modern
ciphers. It could be solved in
several steps:
a. Define the period. Since complete columnar transposition is used, the period divides the message length 308. Thus it can be only 2,4,7,11,14,22,28. The periods of 7,11,14 and 22 look as the most plausible since the rest is either too short or too long (actually in Painvin's case the period could be from 15-25).
b. We reformat the messages with
a text editor according to the period guess. Two directions are possible:
the last row for the padding (common
ending technique) in both messages. One immediately sees that
for the period of 14 the last row
of the 1st message contains only letters X and A. Thus padding character
is
either XA or AX. Additional bonus
of this discovery is the odd-even column sets. Thus instead of
14! transpositions one is left
with 2*7!*7!. Moreover from the padding of the second message 4 last
columns are identified. The period
of 14 was intentionally chosen to avoid brute force solution.
c. Now one can try 7! =5040 possible
pairings. A result will be a transposed monoalphabetic cipher which
can be tested very reliably with
IC statistics (2*308 letters). The same can be done by hand using the
IC calculator. One needs to check
about 7+6+5+4+3+2+1 = 28 pairs of columns. 42 characters (from both
messages are enough to check for
monoalphabeticity, leaving only few false alarms. On the other hand
working with a single message (21
letters) is not enough.
d. At this point we are left with
a transposed simple substitution cipher. The most popular method
was to use the structure of the
ADFGX matrix -- lower rows contain last letters of the alphabet either
in their correct places or slightly
shifted forward. The upper rows which contain the keyword tend to have
more frequent letters then the
lower rows. This together with a letter frequency data of the messages
was enough.
Solution Without Padding
Extensive padding leaked lots of
information about the key. How one can analyze this cipher
if no-padding or random padding
is used?
a. Guess the period
b. Ignore the last row of the text
which contains padding and spoils the frequencies
c. Note that odd and even columns
have different frequencies of letters A,D,F,G,X.
Letter X is much more frequent
as the second letter then the first one. In general
nothing prohibits us to calculate
IC statistic based only on five letters A,D,F,G,X.
If this is done for both messages
glued together one immediately identifies the
odd and even columns (for example:
16,5,6,6,12 or 15,7,8,6,9 for some columns and
16,12,5,6,5 or 17,10,2,11,5 for
the others). 42 characters are enough in order to do it.
Thus one can identify the odd and
even columns. The rest is as in the previous solution.
Note that if the period is odd then
first and last letters of the pair appear in checkerboard-like
pattern and not in columns. This
is slightly harder for the first phase of the analysis but
leaks more information about the
transposition at a later stage.
4A.Enigma - Method of Batons
This was straightforward: just follow
the explanations given in the class + use the handout
from Deavours, Kruh book. It is
important to work with absolute (keyboard) alphabet and not
with rotor-related alphabet. Most
of the class has solved it.
4B.Enigma - Discover the wires
This was another (together with
3) very important problem in this homework. After I posted this problem
I found that Turing in his
notes addressed similar problem (he assumed the reflector is also unknown).
The most common solution was to
use the powerful Polish methods which were shown in the class.
Solutions ranged from 195-260 queries.
Several important things to add
1. The number of queries can be
as low as 84 = 4*12+3*12. One needs 4 alphabets to write
the equations for R1. The two final
equations provide 26 variants for the R1. The indicator
setting has to be changed to, say
'AAA' 26 times. One queries:
AAAA
BBBB
CCCC
.....
In order to find R2 one can reuse
26 queries already made for R1, and thus needs 3*26 new
queries.
2. There is no need to make 26 queries,
13
are enough (involutions). 12
are enough, since the last
pair is just there.
3. Once received 26 variants of
rotor R1 we can
find the correct one that corresponds
to our indicator setting. Using
no additional queries and applying "baton"-like method.
This was impossible in Rejewski's
case since plugboard and ring setting were unknown. In our case there
is no plugboard, so we are exactly
in the case fit for "method of batons" attack.
Without finding the correct version
of R1 it is hard to continue the attack!
4. After R1 and R2 are found, since
reflector R is known we just write down all the possibilities
for R3. No
additional queries. In the worst case there
will be 2*3^8 =13122 possible solutions. Only three - four additional queries
can fix a single correct solution.
5. So the number of indicator changes
is about 4*13 = 52,
the number of queries 84+3=87.
Armed with Shannon's theory we
can calculate the lower bound: the amount of information
in three rotors is 3 log(26!)
= 56 letters (here log is to basis 26).
Another Method
Uri Barenholz has invented a
method of "Probing" to recover the rotor wires. The number of queries
is not
as small as in the method above
(although I believe that many refinements are possible) however it is a
very interesting method. It works only for plugless Enigma.