Weak Keys in RC4 (One Week Later…)

This is the continuation from the previous post.

From: andrewr@vironix.co.za (Andrew Roos)
Cryptanalysis of RC4 – Preliminary Results


Hi c’punks & sci.cryptites

About a week ago I posted a message about weak keys in RC4. This is
an update on the results of my continued 4am sessions with RC4 and
shows that certain weak keys lead to an almost-feasible known
plaintext attack on the cipher (well, about as feasible as the
differential attack on DES, shall we say).

The attack is based on two particularly interesting three-byte key
prefixes which have a high probability of producing PRNG sequences
which start with a known two-byte sequence. The prefixes are:

1.  Keys starting with "00 00 FD" which have a 14% probability of
    generating sequences which start "00 00".

2.  Keys starting with "03 FD FC" which have a 5% probability of
    generating sequences which start "FF 03".

Note that the expected frequency of any two-byte output sequence is
1 in 65536 or about 0.0015%, so these key prefixes are highly
unusual. I won’t go into the reasons why in this post, since it
follows the same reasoning as my last post, but these prefixes are
special in that they have a high probability of initializing the RC4
state table in such a way that the first two generated bytes depend
only on the first three entries in the state table.

This observation is the basis for a simple known-plaintext attack
which reduces the effective key space which you need to search to
have a 50% probability of discovering a key by about 11.2 bits. The
down side is that you need “quite a few” known plaintexts to make the
attack feasible.

It works as follows:

  1. Collect a large number of known plaintexts (and hence known generator sequences).
  2. Discard generator sequences which do not start with “00 00” or “FF 03”.
  3. For generator streams starting “00 00”, search all keys which begin with “00 00 FD”.
  4. For generator streams staring “FF 03”, search all keys which begin with “03 FD FC”.
  5. Keep going until you find a key 🙂

Clearly this attack will only discover a small fraction of the keys.
However since most generator sequences are discarded without being
searched, and for those which are searched the search is 2^24 smaller
than would be required to search the entire keyspace, the number of
trials required to determine a key is significantly lower than for
brute force alone.

Enough of an intro, here are the relevant results. Forgive my
simplistic approach to maths, I’m a philosopher-come-software
developer, not a mathematician. I’ve run the relevant simulations
with 40-bit, 64-bit, 80-bit and 128-bit key lengths, and with two
different PRNGs. For the sake of consistency with my earlier paper
I’ll use the figures gathered for 80-bit keys (this seems to be RSA’s
preferred key length for RC4), but there are no significant
differences for other key lengths. The PRNG used for these tests was
L’Ecuyer’s 32-bit combined linear congruential generator as described
in “Applied Cryptography” p. 349.

(a) Out of one million trials, keys starting with "00 00 FD"
    generated sequences starting "00 00" 138217 times, and keys
    starting with "03 FD FC" generated output sequences starting "FF
    03" 50490 times.

(b) Out of ten million trials, arbitrary pseudo-random keys generated
    sequences starting with "00 00" 446 times, and sequences starting
    with "FF 03" 146 times. (Note the abnormally high incidence of
    "00 00"; the expected mean is 152.8).

Suppose we have the output stream generated by a randomly chosen key.
The chance that it will start with either “00 00” or “FF 03”, and
that we will therefore search it, is:

    (446 + 146) / 1e7 = 5.92e-5

The chance that it starts with “00 00” and was generated by a key
starting with “00 00 FD”, or that it starts with “FF 03” and was
generated by a key starting “03 FD FC” – i.e. the chance that we will
search it and be rewarded for our efforts – is:

    (138217 + 50490)/(1e6 * 2^24) = 1.12e-8

The total number of plaintexts required for a 50% chance that we will
discover one of the keys is:

    log(0.5)/log(1 - 1.12e-8) = 61 900 000

Well I did say “quite a few” plaintexts would be necessary 🙂

And the number of plaintexts which you expect to search in order to
find the “right” one is:

    61 900 000 * 5.92e-5 = 3665

Since the total key length is 80 bits, and we are “guessing” 24 of
these, each search requires 2^56 trials. Hence the total number of
trials for a 50% chance of discovering a key is:

    3665 * 2^56 = 2.64e20 = 2 ^ 67.8

Since brute search alone would require 2^79 trials for a 50% chance
of determining the key, this reduces the number of trials by 2^11.2.

The results are essentially identical for all the key lengths I have
tried, and in each case reduce effective key length by about 11.2
bits. So, for example, a 64-bit key would normally require 2^63
trials for 50% chance of solution; this attack reduces the number of
trials to 2^51.8 at the cost of requiring 62 million known plaintexts.

I’m still running simulations to check my maths, and although initial
results are encouraging, I don’t have enough data for it to be
statistically relevant yet (generating all these sets of 62 million
known streams takes time…) So consider this preliminary (again),
and I’ll post the results of my simulations when I have enough

Andrew Roos 

// C++ programmers have class (but not much inheritance)

PGP Fingerprint: F6 D4 04 6E 4E 16 80 59 3A F2 27 94 8B 9F 40 26
Full key at ftp://ftp.vironix.co.za/PGP-keys/AndrewRoos

Version: 2.6.2i


Leave a Reply