homeworks Solutions to Homework 2
"The HALL of FAME"
Problem 3 (ADFGX) was solved completely by:
James Jillogly, Omer Holz, Kobi Kaminitz, Paul Koning, Nevo Laron

This was a challenging homework, but I hope you enjoyed it. The average grade was 85. Below are several best solutions. Few more excellent solutions to this homework will appear in a short while. Again, some people submitted excellent works by e-mail or in writing, so they can't be shown here.

Brief Comments

1.Vigenere
  It was straightforward, exactly as seen in the class. This text contains many three and four
  letter group repetitions, mainly because the keyword is 'MAINTAIN' which is almost
  4-letter periodic. Kasiski method suggests a period of 8 or 4. The rest can be done one of the
  following methods by:

  a. mutual IC technique, checking relative shifts of the columns.
  b. guessing the most frequent trigrams and filling the rest (ING, THE).
  c. guessing the 8-letter password in a hope that it is a word from the dictionary.
  d. curve fitting by shifting the frequency histogram (see solution by Gabriel Wainmann(MSWORD)).
  e. hill climbing with log-normal trigrams.

  All methods were used. Many people solved it by hand, using approach 'b'.

2. Grilles
 Anagramming attacks or probable word attacks as in handout from H.Gaines book.
 Solution took from 1-10 hours by hand. In most of the cases 'ELEPHANT' was easily located
 and the 180 degree rotation was giving another 8-letter portion of the plaintext.
 For hand solution preparing four work sheets with the cryptogram (for each rotation) helps a lot.
 Also marking concentric squares in the grille helps to locate the holes.

 Several people solved it with hill climbing programs.
 a. Here is the program written by Omer Golz and hand solution assistant (EXE, in Delphi).
 b. Here is the hill climber of Roman Dovgard.
 c. Here is the hill climber of Eran Ofek and Boaz Barak (single compressed TAR archive).

3. ADFGX
This was THE problem. An important substitution-transposition product cipher, predecessor of modern
ciphers. It could be solved in several steps:

a. Define the period. Since complete columnar transposition is used, the period divides the message length 308. Thus it can be only 2,4,7,11,14,22,28. The periods of 7,11,14 and 22 look as the most plausible since the rest is either too short or too long (actually in Painvin's case the period could be from 15-25).

b. We reformat the messages with a text editor according to the period guess. Two directions are possible:
the last row for the padding (common ending technique) in both messages. One immediately sees that
for the period of 14 the last row of the 1st message contains only letters X and A. Thus padding character is
either XA or AX. Additional bonus of this discovery is the odd-even column sets. Thus instead of
14! transpositions one is left with 2*7!*7!.  Moreover from the padding of the second message 4 last
columns are identified. The period of 14 was intentionally chosen to avoid brute force solution.

c. Now one can try 7! =5040 possible pairings. A result will be a transposed monoalphabetic cipher which
can be tested very reliably with IC statistics (2*308 letters). The same can be done by hand using the
IC calculator. One needs to check about 7+6+5+4+3+2+1 = 28 pairs of columns. 42 characters (from both
messages are enough to check for monoalphabeticity, leaving only few false alarms. On the other hand
working with a single message (21 letters) is not enough.

d. At this point we are left with a transposed simple substitution cipher. The most popular method
was to use the structure of the ADFGX matrix -- lower rows contain last letters of the alphabet either
in their correct places or slightly shifted forward. The upper rows which contain the keyword tend to have
more frequent letters then the lower rows. This together with a letter frequency data of the messages
was enough.
 

Solution Without Padding

Extensive padding leaked lots of information about the key. How one can analyze this cipher
if no-padding or random padding is used?

a. Guess the period
b. Ignore the last row of the text which contains padding and spoils the frequencies
c. Note that odd and even columns have different frequencies of letters A,D,F,G,X.
Letter X is much more frequent as the second letter then the first one. In general
nothing prohibits us to calculate IC statistic based only on five letters A,D,F,G,X.
If this is done for both messages glued together one immediately identifies the
odd and even columns (for example: 16,5,6,6,12 or 15,7,8,6,9 for some columns and
16,12,5,6,5 or 17,10,2,11,5 for the others). 42 characters are enough in order to do it.
Thus one can identify the odd and even columns. The rest is as in the previous solution.

Note that if the period is odd then first and last letters of the pair appear in checkerboard-like
pattern and not in columns. This  is slightly harder for the first phase of the analysis but
leaks more information about the transposition at a later stage.
 

4A.Enigma - Method of Batons

This was straightforward: just follow the explanations given in the class + use the handout
from Deavours, Kruh book. It is important to work with absolute (keyboard) alphabet and not
with rotor-related alphabet. Most of the class has solved it.

4B.Enigma - Discover the wires

This was another (together with 3)  very important problem in this homework. After I posted this problem
I found that Turing in his notes addressed similar problem (he assumed the reflector is also unknown).

The most common solution was to use the powerful Polish methods which were shown in the class.
Solutions ranged from 195-260 queries.

Several important things to add
1. The number of queries can be as low as 84 = 4*12+3*12. One needs 4 alphabets to write
the equations for R1. The two final equations provide 26 variants for the R1. The indicator
setting has to be changed to, say 'AAA' 26 times.  One queries:
AAAA
BBBB
CCCC
.....
In order to find R2 one can reuse 26 queries already made for R1, and thus needs 3*26 new
queries.

2. There is no need to make 26 queries, 13 are enough (involutions). 12 are enough, since the last
pair is just there.

3. Once received 26 variants of rotor R1 we can find the correct one that corresponds
to our indicator setting. Using no additional queries and applying "baton"-like method.
This was impossible in Rejewski's case since plugboard and ring setting were unknown. In our case there
is no plugboard, so we are exactly in the case fit for "method of batons" attack.
Without finding the correct version of R1 it is hard to continue the attack!

4. After R1 and R2 are found, since reflector R is known we just write down all the possibilities
for R3. No additional queries. In the worst case there will be 2*3^8 =13122 possible solutions. Only three - four additional queries can fix a single correct solution.

5. So the number of indicator changes is about 4*13 = 52, the number of queries 84+3=87.
Armed with Shannon's theory we can calculate the lower bound: the amount of information
in three rotors is  3 log(26!) = 56 letters (here log is to basis 26).

Another Method

Uri Barenholz has invented a method of "Probing" to recover the rotor wires. The number of queries is not
as small as in the method above (although I believe that many refinements are possible) however it is a very  interesting method. It works only for plugless Enigma.



Last modified on 06.05.2000.