Hints for HW2
0. On pen&paper solution:
Again there is no penalty for computer
solutions, however all of the homework is solvable
by hand: Grilles up to 12x12 are
a classical hand solution exercise, even without hints.
Vigenere and ADFGX are also within
reach (you can also use the IC programs below).
1. Note that 4.A can be solved by trying all the keys, such solutions will not count.
2. You are welcome to share scripts/programs
that calculate IC (index of coincidence) of a
string and MC (mutual index of
coincidence), and plaintext recognition, etc. I must note
that a perl script (by Jason) and
a C program (by Baruch) for HW1 were very popular among
solvers.
3. Here is a C
program that calculates frequencies and the IC (note that it counts
all the
frequencies, so delete spaces if
they are in your file, here is an example of the test
input). Credit
goes to george
of the ACA site.
4. Here is a set of programs that
perform plaintext recognition (I don't know how good
they are, but you can use the ideas):
(from this point down, it is new,
03.04.2000)
5. By request of some of you [advanced
hill climbing]:
Hill
Climbing Using Genetic Algorithms, by A.J.Bagnall, et.al.
"A
Fast Method of the Cryptanalysis of Substitution Ciphers", by Thomas
Jakobsen
Experienced hill climbers may also
need this or
this.
6. The code I used for Beaufort
encryption/decryption (1st puzzle):
/* One student was misslead by the Beaufor
table at the ACA site.*/
/* Beaufort encryption below is exactly the
same as was shown at */
/* the lecture. Just cipher
= Key[count]-plain;
*/
....
unsigned char plain, cipher;
int count = 0;
int five = 0;
while(!feof(file_handle)){
plain = fgetc(file_handle);
if((plain >= 'A') &&
(plain <='Z')) {
plain =plain
- 'A';
cipher
= Key[count]-plain; if (cipher < 'A') cipher
+=26;
count = (count+1)%
period; printf("%c", cipher); five = (five+1)%5;
if(five ==
0) printf(" ");
}
else if ((plain >= 'a')
&& (plain <='z')) {
plain= plain
-'a';
cipher
= Key[count]-plain; if (cipher < 'A') cipher
+=26;
count = (count+1)%
period; printf("%c", cipher); five = (five+1)%5;
if(five ==
0) printf(" ");
}
}
....
7. Here are several programs donated
to you by Baruch:
"Here are 2 programs for grills and one for
Vigenere:
In these two change a const to match grills
of different sizes;
One is the one that puts the grill on the
cyphertext to get the
plaintext (has been improved a little).
Another one is a program that gets a grill
with a few holes and erases the
letters that these holes cover (so if you're
sure you have some holes right,
you can run it to take the letters you can't
use anymore from the
cyphertext).
The last one is a program for finding repeated
strings of given length
in the ciphertext - good for figuring out
the length of the key for
Vigenere cipher (or its Beaufort variant)."
Period
of the Vigenere
Grill
Grill
with removed letters
8. Variance of the IC statistics
(follow up to a question of Leonid):
In general IC of more than 30 letters helps to distinguish between
mono-alphabetic and poly-alphabetic
cases:
Expected number of coincidences in a 30-letter sample:
Random : 30*29*IC(Random )
= 30*29*0,0385 ~ 34
English: 30*29*IC(English)
= 30*29*0,0667 ~ 58
The sigma (square root of the variance) for this case is about 15. Note
that distance of 4*sigma between
the modes is preferable for high reliability.
General formula for variance of the IC statistics for the English text of length N letters:
sigma^2 = (0,004344*N^3 + 0,110448*N^2 -
0,114794*N) /(N^2 * (N-1)^2)
9. Most of the 12x12 grilles contain
the word "elephants"
(9 letters)
(Correction 09.04.2000)
For Grille #6 please use crib: "cautious
drive of"
Grille #7 contains "elephant is" (not
"elephants")
10. Some answers
to questions of Baruch Oxman which may be very usefull to all
(problems 3 and 4.a).
11. The homework problems in the
order of increasing complexity (as I see it):
1, 4.a, 4.b, 2.b, 2.a, 3
12. Some crypto student probably
boroughed
the Enigma machine for doing the homework.
Note that HW2 can be done without
it. If you are responsible for it, please return the machine
to the BP.
(from this point down, it is new,
04.04.2000)
13. In 4.b you may assume that
the Reflector is known. Your aim in 4.b is to minimize the number
of queries to the machine (A query
is a result of pressing the key on the keyboard. You are also alowed
to change the indicator setting,
at each query if you need. However count the number of time you do
it.)
(from this point down, it is new,
05.04.2000)
14. Here are: trigram
counts, conditional probabilities of trigrams, and log -conditional
probabilities
of trigrams presented to you by
Eran Ofek.
(from this point down, it is new,
06.04.2000)
15. Problem 3: The ADFGX
checkerboard in this problem was generated using a pronouncable keyword
(will not necessarily appear
in your dictionary). Like this:
A D F G X
A K E Y W
O
D R D A B
C
F F G H I/J L
G M N P Q
S
X T U V X
Z
It is obvious that the main problem is the unknown transposition. So:
1. guess the period
2. write out the rectangles of the two cryptograms
and ... look at them
your aim is to recover the broken pair columns,
which if done correctly
will show mono-alphabetic substitution.
3. if you still do not have any ideas, re-read
the problem statement carefully
[I know of at least 2 ways of reducing the
number of possible transpositions
drastically, one is just by staring at the messages]
4.Figuring the rest of the transposition is a
pen & paper excercise for about 1 hour,
especially if you use the IC calculating
program given above.
5.If you are stuck at the point of solving the
mono-alphabetic substitution, try
the vowel-consonants separation method. Or, anagramming
to check for frequent
digrams, or any other method.
(new 11.04.2000)
The period of the transposition is more than 4 and less than 22. The
transposition
key is not a word from a dictionary. Remember that if messages are
shorter than
the period the padding symbol is used.
Please mail me the answer of problem 3 as soon as you
get it.