Checksums


[simple_series title=”CRC Tutorial”]

While parity is adequate to protect a single character or byte, its detection capabilities lack in power when applied to larger messages: Messages often span thousands of bits, and if just two bits are flipped, the corruption will not be detectable. And the chance of having multiple bit errors within a message increases exponentially with the message length. [1]For analysis of error behavior, it is often assumed that bit errors are distributed using an independent identical random process. When the chance of any single bit being flipped is p, the chance of a message of size m not being corrupted is (1-p)m, an exponentially decreasing function in the message size. The probability of the message surviving with at most one bit error is p1=(1-p)m-1, resulting in a probability of 1-p1 for getting more than the one error which parity can detect.

Therefore, there is a need for something more powerful. Parity is already a simple form of a checksum (it sums up all the bits in the data, modulo 2). The natural extension is to create a more powerful checksum. Typical versions are summing up the bytes (or 2-/4-byte words) in the message into a checkbyte (or 2-/4-byte checksums). This is again done modulo 256 (or 216 or 232, respectively), which can essentially be implemented by ignoring carry and overflow conditions due to the addition.

Error detection capability

Even though the checksum is much larger than the parity bit and it generally detects many more errors than parity, it is not able to detect all 2-bit modifications to the message: If it clears a bit at position i in one word, but changes the ith bit to one in another word, the sum of these two words and thus the total checksum remains the same. But if two
bits at different positions in the same word (or two different words) are flipped, this error will be detected.

As a consequence of the undetected 2-bit modification above, we can also exchange entire words, especially swap adjacent words, and the result will pass the checksum test. (In effect, both the same-bit-swapping and the word-swapping are due to the commutative nature of the addition and will also apply to any other check method
with commutative properties.)

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.