This is the continuation from the previous post.

**From:** andrewr@vironix.co.za (Andrew Roos)**
Subject:** Cryptanalysis of RC4 – Preliminary Results

**1995/09/28**

Date:

Date:

**<44ebge$llf@hermes.is.co.za>**

Message-ID:

Message-ID:

**Vironix-ZA**

organization:

organization:

**sci.crypt**

newsgroups:

newsgroups:

-----BEGIN PGP SIGNED MESSAGE-----

Hi c’punks & sci.cryptites

About a week ago I posted a message about weak keys in RC4. This is

an update on the results of my continued 4am sessions with RC4 and

shows that certain weak keys lead to an almost-feasible known

plaintext attack on the cipher (well, about as feasible as the

differential attack on DES, shall we say).

The attack is based on two particularly interesting three-byte key

prefixes which have a high probability of producing PRNG sequences

which start with a known two-byte sequence. The prefixes are:

1. Keys starting with "00 00 FD" which have a 14% probability of generating sequences which start "00 00". 2. Keys starting with "03 FD FC" which have a 5% probability of generating sequences which start "FF 03".

Note that the expected frequency of any two-byte output sequence is

1 in 65536 or about 0.0015%, so these key prefixes are highly

unusual. I won’t go into the reasons why in this post, since it

follows the same reasoning as my last post, but these prefixes are

special in that they have a high probability of initializing the RC4

state table in such a way that the first two generated bytes depend

only on the first three entries in the state table.

This observation is the basis for a simple known-plaintext attack

which reduces the effective key space which you need to search to

have a 50% probability of discovering a key by about 11.2 bits. The

down side is that you need “quite a few” known plaintexts to make the

attack feasible.

It works as follows:

- Collect a large number of known plaintexts (and hence known generator sequences).
- Discard generator sequences which do not start with “00 00” or “FF 03”.
- For generator streams starting “00 00”, search all keys which begin with “00 00 FD”.
- For generator streams staring “FF 03”, search all keys which begin with “03 FD FC”.
- Keep going until you find a key 🙂

Clearly this attack will only discover a small fraction of the keys.

However since most generator sequences are discarded without being

searched, and for those which are searched the search is 2^24 smaller

than would be required to search the entire keyspace, the number of

trials required to determine a key is significantly lower than for

brute force alone.

Enough of an intro, here are the relevant results. Forgive my

simplistic approach to maths, I’m a philosopher-come-software

developer, not a mathematician. I’ve run the relevant simulations

with 40-bit, 64-bit, 80-bit and 128-bit key lengths, and with two

different PRNGs. For the sake of consistency with my earlier paper

I’ll use the figures gathered for 80-bit keys (this seems to be RSA’s

preferred key length for RC4), but there are no significant

differences for other key lengths. The PRNG used for these tests was

L’Ecuyer’s 32-bit combined linear congruential generator as described

in “Applied Cryptography” p. 349.

(a) Out of one million trials, keys starting with "00 00 FD" generated sequences starting "00 00" 138217 times, and keys starting with "03 FD FC" generated output sequences starting "FF 03" 50490 times. (b) Out of ten million trials, arbitrary pseudo-random keys generated sequences starting with "00 00" 446 times, and sequences starting with "FF 03" 146 times. (Note the abnormally high incidence of "00 00"; the expected mean is 152.8).

Suppose we have the output stream generated by a randomly chosen key.

The chance that it will start with either “00 00” or “FF 03”, and

that we will therefore search it, is:

(446 + 146) / 1e7 = 5.92e-5

The chance that it starts with “00 00” and was generated by a key

starting with “00 00 FD”, or that it starts with “FF 03” and was

generated by a key starting “03 FD FC” – i.e. the chance that we will

search it and be rewarded for our efforts – is:

(138217 + 50490)/(1e6 * 2^24) = 1.12e-8

The total number of plaintexts required for a 50% chance that we will

discover one of the keys is:

log(0.5)/log(1 - 1.12e-8) = 61 900 000

Well I did say “quite a few” plaintexts would be necessary 🙂

And the number of plaintexts which you expect to search in order to

find the “right” one is:

61 900 000 * 5.92e-5 = 3665

Since the total key length is 80 bits, and we are “guessing” 24 of

these, each search requires 2^56 trials. Hence the total number of

trials for a 50% chance of discovering a key is:

3665 * 2^56 = 2.64e20 = 2 ^ 67.8

Since brute search alone would require 2^79 trials for a 50% chance

of determining the key, this reduces the number of trials by 2^11.2.

The results are essentially identical for all the key lengths I have

tried, and in each case reduce effective key length by about 11.2

bits. So, for example, a 64-bit key would normally require 2^63

trials for 50% chance of solution; this attack reduces the number of

trials to 2^51.8 at the cost of requiring 62 million known plaintexts.

I’m still running simulations to check my maths, and although initial

results are encouraging, I don’t have enough data for it to be

statistically relevant yet (generating all these sets of 62 million

known streams takes time…) So consider this preliminary (again),

and I’ll post the results of my simulations when I have enough

data.

________________________________________________________________ Andrew Roos // C++ programmers have class (but not much inheritance) PGP Fingerprint: F6 D4 04 6E 4E 16 80 59 3A F2 27 94 8B 9F 40 26 Full key at ftp://ftp.vironix.co.za/PGP-keys/AndrewRoos -----BEGIN PGP SIGNATURE----- Version: 2.6.2i iQCVAwUBMGrlfmatuqa4OR+lAQF1eQP+IBBmSztAYUpq1q/BjzvYDCbb+Ns0Gi1S u9wTaZOCl32fdp7NSUEQBX39nVJkQZginug56BZXzijRvOx6fl4+z7dmW9jwtE5E YNCOhx+/fHX4psszMyEUTrnza7MYDc4HXlgv743LOD/xvEyU0D5OGgB5fg+lyhAK 6xQ/Zy8JpE8= =BdMn -----END PGP SIGNATURE-----