Few Things about CRC

...that you might need to know or maybe just forgot

02-Nov-2010 09:01

This will be a bit long, longer then I initially expected.
First let's have a look at typical table-based CRC-32 implementation:

  1. uint32_t calcCrc32(const uint8_t *buf, size_t len) {
  2.     uint32_t rCrc = 0xffffffff;
  3.     size_t n;
  4.     for (n = 0; n < len; ++n) {
  5.         rCrc = crc_table[(rCrc ^ buf[n]) & 0xff] ^ (rCrc >> 8);
  6.     }
  7.     return rCrc ^ 0xffffffff;
  8. }

This is similar to the implementation that can be found in PNG Specification
But for a few experiments the following implementation will be more suitable. The only difference, is that instead of using constant values of 0xffffffff, we'll pass them as function arguments.

  1. uint32_t crc32(const void* buf, size_t bufLen, uint32_t initial, uint32_t final) {
  2.     const uint8_t* data = (const uint8_t*)buf;
  3.     uint32_t rCrc = initial;
  4.     for (size_t i = 0; i < bufLen; ++i) {
  5.         uint32_t bytesCrc = crcTable[(rCrc ^ data[i]) & 0xff];
  6.         rCrc = (rCrc >> 8) ^ bytesCrc;
  7.     }
  8.     return rCrc ^ final;
  9. }

As you probably know CRC is the remainder polynomial resulting from 'dividing' shifted message's polynomial by some chosen generetor polynomial.
(I'll be talking about 32-bit CRC and I was using the same polynomial that png uses 0xedb88320, but the polynomial itself doesn't change anything that's discussed below.)

The need of preset

Normally to get the remainder crc would be called like: crc32(buffer, buLen, 0, 0). But this causes two problems.
First problem is that if you prepend the message with any number of zeroes it won't change the resulting reminder, because this will prepend message polynomial with some 0 coefficients, and that obviously won't change the reminder itself.
Below is an example. It creates such messages, calculates crcs and prints them.

  1. uint8_t mesg[12] = { 0 };
  2. size_t len;
  3. for (len = 0; len < 10; ++len) {
  4.     if (len) mesg[len-1] = 0;
  5.     mesg[len] = 0xa5;
  6.     c1 = crc32 (mesg, len+1, 0, 0);
  7.     printf ("%2d %#08x\n", len+1, c1);
  8. }

And the results:

  1. 1 0xa6bc5767
  2. 2 0xa6bc5767
  3. 3 0xa6bc5767
  4. 4 0xa6bc5767
  5. 5 0xa6bc5767
  6. 6 0xa6bc5767
  7. 7 0xa6bc5767
  8. 8 0xa6bc5767
  9. 9 0xa6bc5767
  10. 10 0xa6bc5767

This is solved by preseting the CRC, but before we'll get to that, let's discuss the second problem.

The need of post-invert

Instead of calculating the CRC on some buffer and then verifying it, it's easier to append CRC after the message, calculate CRC of this new message, the result should be 0.
(It's quite easy to understand why it's so, but guys at this wiki page already did that, so I'm not going to duplicate it :))
Here's an example:

  1. uint8_t mesg[12] = { 0 };
  2. size_t len;
  3. /* fill the buffer with some dummy data */
  4. for (len = 0; len < 6; ++len)
  5.     mesg[len] = len;
  6.  
  7. /* appending crc */
  8. c1 = crc32 (mesg, 6, 0, 0);
  9. *(uint32_t*)(mesg+6) = c1;
  10.  
  11. printf ("10 %#08x\n", crc32(mesg, 10, 0, 0));

This brings us to the second problem, if we append any zeroes to message constructed as above, the reminder will still be 0.
And because mesg above has size of 12, and it's initially filled with zeroes, the following two lines, will also print 0x00000000 as a resulting CRC

  1. printf ("11 %#08x\n", crc32(poly, 11, 0, 0));
  2. printf ("12 %#08x\n", crc32(poly, 12, 0, 0));

That's also quite easy to understand. If the resulting reminder is 0 it means that the message polynomial was multiple of our generator, so appending any number of zero coefficients will give a new message that is still mutiple of a generator

Post-invert

The second problem is usually 'fixed' by negating CRC result. In this case that will be calling: crc32(buffer, buLen, 0, 0xffffffff)
So let's fix last example by appending negated CRC:

  1. c1 = crc32 (mesg, 6, 0, 0xffffffff);
  2. *(uint32_t*)(mesg+6) = c1;
  3. printf ("inverted crc: %08x\n", c1);

Of course now the calculation of CRC for that new message won't return 0, so we should calculate the CRC for the first 6 bytes and compare it with appended CRC or... we could calculate so called reciprocal polynomial and as before calculate the CRC for whole the message and compare it with our reciprocal

  1. /* calculate reciprocal polynomial */
  2. uint32_t test = 0xffffffff;
  3. c1 = crc32 (&test, sizeof(test), 0, 0xffffffff);
  4. printf ("reciprocal: %08x\n", c1);
  5.  
  6. printf ("10 %#08x\n", crc32(poly, 10, 0, 0xffffffff));
  7. printf ("11 %#08x\n", crc32(poly, 11, 0, 0xffffffff));
  8. printf ("12 %#08x\n", crc32(poly, 12, 0, 0xffffffff));

And here are the results, as you can see the calculated CRCs are no longer the same.

  1. reciprocal: 0x2144df1c
  2. 10 0x2144df1c
  3. 11 0xc622f71d
  4. 12 0xb1c2a1a3

Preset to -1

Now back to the first problem which is usually 'fixed' by preseting CRC to -1 (0xffffffff) instead of 0.
This is equivalent to the calling: crc32(buffer, bufLen, 0xffffffff, 0). The following code is almost identical to the first one presented, the only difference is using the preset value in call to crc32.

  1. uint8_t mesg[12] = { 0 };
  2. size_t len;
  3. for (len = 0; len < 10; ++len) {
  4.     if (len) mesg[len-1] = 0;
  5.     mesg[len] = 0xa5;
  6.     c1 = crc32 (mesg, len+1, 0xffffffff, 0);
  7.     printf ("%2d %#08x\n", len+1, c1);
  8. }

And coresponding results:

  1. 1 0x8b414715
  2. 2 0x189aba67
  3. 3 0xa602718a
  4. 4 0x78077784
  5. 5 0x9f615f85
  6. 6 0xe881093b
  7. 7 0xc42f77e6
  8. 8 0x3c6177f1
  9. 9 0xbf4abc36
  10. 10 0xbac9c0ee

Preset and post-invert

Preset and post-invert are usually used together. First important observation is that 0xffffffff becomes fixed point of a crc32

  1. test = 0xffffffff;
  2. c1 = crc32(&test, sizeof(test), 0xffffffff, 0xffffffff);
  3. printf ("fp : %#08x\n", c1);

0xffffffff is printed to the screen. As you've probably noted this is also reciprocal.
Now take a look at crc32 function, now back here, now back at crc32 function, now back here. Just before returning the value it's xored with 0xffffffff value (or you can say inverted).
That tells us, that after finishing the loop, the value of rCrc variable had to be 0.
That isn't too good, because it basically means, that we can again append any numbers of zeroes, after that value and the result won't change

  1. uint8_t mesg[12] = { 0 };
  2. size_t len;
  3.  
  4. *(uint32_t*)(mesg) = 0xffffffff;
  5. for (len = 4; len < 10; ++len) {
  6.     if (len > 4) mesg[len-1] = 0;
  7.     mesg[len] = 0xa5;
  8.     c1 = crc32 (mesg, len+1, 0xffffffff, 0xffffffff);
  9.     printf ("%2d %08x\n", len+1, c1);
  10. }

Final stuff

...or because we know that reciprocal has the same value, we can append calculated CRC after the message. We need to negate the calculated CRC first. Here's the thing:

  1. memset(poly, 0, sizeof(poly));
  2.  
  3. /* prepare dummy buffer */
  4. for (len = 0; len < 5; ++len)
  5.     poly[len] = len+66;
  6.  
  7. /* add NEGATED crc */
  8. c1 = crc32 (poly, 5, 0xffffffff, 0xffffffff);
  9. *(uint32_t*)(poly+5) = ~c1;
  10.  
  11. printf ("09 %#08x\n", crc32(poly, 9, 0xffffffff, 0xffffffff));
  12. printf ("10 %#08x\n", crc32(poly, 10, 0xffffffff, 0xffffffff));
  13. printf ("11 %#08x\n", crc32(poly, 11, 0xffffffff, 0xffffffff));
  14. printf ("12 %#08x\n", crc32(poly, 12, 0xffffffff, 0xffffffff));
  15. printf ("\n");
  16.  
  17. for (len = 9; len < 12; ++len) {
  18.     if (len > 9) poly[len-1] = 0;
  19.     poly[len] = 0xa5;
  20.     c1 = crc32 (poly, len+1, 0xffffffff, 0xffffffff);
  21.     printf ("%2d %08x\n", len+1, c1);
  22. }

...and the results:

  1. 09 0xffffffff
  2. 10 0xffffffff
  3. 11 0xffffffff
  4. 12 0xffffffff
  5.  
  6. 10 0x5943a898
  7. 11 0x5943a898
  8. 12 0x5943a898