Kuro5hin.org: technology and culture, from the trenches
create account | help/FAQ | contact | links | search | IRC | site news
[ Everything | Diaries | Technology | Science | Culture | Politics | Media | News | Internet | Op-Ed | Fiction | Meta | MLP ]
We need your support: buy an ad | premium membership

[P]
Message Integrity and IAPM

By rustball in Technology
Fri Mar 07, 2003 at 04:53:54 PM EST
Tags: Security (all tags)
Security

When you send an encrypted message, you don't want anyone but the intended recipient to be able to read it. But there's another requirement that's often overlooked: you don't want anyone to be able to tamper with it either. Intuitively, it seems as if tampering with an encrypted message should be impossible, but unless the encryption scheme includes specific protection against tampering, you may be able to achieve all sorts of evil things.


A new technique, developed by IBM, may allow us to protect our messages against tampering for little more than the cost of encryption - but only if they'll let us use it.

Let's suppose you're writing a banking application. And suppose you encrypt with the One Time Pad, the only cipher proven secure. To make the illustration simpler, we'll pretend your message, and your key, consist entirely of decimal digits:

03 123 654 000123
means

  1. - a code meaning transfer (that I just made up)
  2. - from account 123
  3. - to account 654
  4. - the sum of $1.23
Now we encrypt with the one-time pad:
  1. - the key
  2. - the plaintext message
  3. - ciphertext: digit-wise sum (ignore carry)
Of course I generated the key by banging on the keyboard, but a real one-time pad would be generated in a secure way and used only once.

Decryption is simple - just subtract.

  1. - the ciphertext
  2. - the key
  3. - the plaintext: digit-wise subtraction (ignore carry)
This method is perfectly secure, literally, as far as confidentiality is concerned. But what about security against tampering?

It turns out that if you hold account 654, and you can tamper with this message in transit, you can change things very much to your advantage. Just add a 9 to the 6th digit from the end.

  1. - ciphertext before tampering
  2. - tamper string
  3. - new ciphertext (digit-wise sum)
When we decrypt this tampered message, the plaintext is rather different:
  1. - new ciphertext (digit-wise sum)
  2. - the key
  3. - the plaintext: digit-wise subtraction (ignore carry)
  4. - transfer
  5. - from account 123
  6. - to account 654
  7. - the sum of $9001.23
What can you do? One solution might be to use digital signatures. If every message is signed with the private key of the sender, it's integrity will be preserved by the signature scheme. This works, but it's massive overkill: digital signatures are very slow to compute. In general, when you've got a shared secret with the other party, you want to use symmetric, secret key techniques rather than assymmetric, public key techniques, since they tend to be more efficient in every way. And the secret key equivalent of a digital signature is a Message Authentication Code, or a MAC.

A MAC is a very short (perhaps 64 bit) addenum to a message that verifies it hasn't been tampered with. To generate, or verify the MAC, you need the MAC secret. A randomly generated message will have only one chance in 2^64 of passing the MAC check. MACs are very like message digest functions (otherwise known as hash functions; MD5 and SHA are examples) except with a secret ingredient that you need to generate or verify the digest. Indeed, there are simple ways to use a function like SHA to make a MAC simply by sprinkling in a secret along with the data being hashed. If you need encryption and message integrity, just append a MAC to the message, then encrypt the whole thing.

But for some environments, like smart cards, you might find that the only cryptographic primitive you've got room for is the AES block cipher. NIST standardised on a block cipher because block ciphers are to cryptologists what towels are to hitch hikers - you can turn them to useful objects in pretty much any crypto situation. And it's pretty simple to MAC with a block cipher. Divide the message into blocks M_1 .. M_n; start with a start value unique to this message R, encrypt it with the block cipher to make X_0 = E(R), then make X_i = E(X_(i-1) XOR M_i), and use X_n as your MAC.

This kind of MAC is known as CBC-MAC - because it's exactly the same as CBC mode encryption, except that you throw away all but the last block when you've finished. This means that when you MAC and encrypt, you're effectively doing the work of encrypting the message twice. It's occured to quite a few people that there must be a way of doing the work only once, and getting both encryption and message integrity out of one block cipher pass over the message. But it's not as easy as it looks, and lots of attempts to solve the problem have been broken.

It now seems that the problem is solved. NIST put out a call to help create standard chaining modes to go along with the standard block cipher, and one of the submissions - by Charinjit S Jutla of IBM - proposes chaining modes (IACBC, IAPM) that give both encryption and integrity securly for little more than the cost of one alone. A flurry of new activity has resulted, and now there are several excellent chaining modes with this property.

Unfortunately, it seems likely that all of these techniques will be subject to a patent held by IBM. Unless NIST can persuade IBM to give the world a royalty- free license to these techniques in return for the credit of being the name behind a national standard, it seems we may be faced with a choice between giving IBM ownership of our protocols, or having our encryption run at half the speed we could.

Sponsors

Voxel dot net
o Managed Hosting
o VoxCAST Content Delivery
o Raw Infrastructure

Login

Related Links
o Also by rustball


Display: Sort:
Message Integrity and IAPM | 47 comments (36 topical, 11 editorial, 0 hidden)
Patents and Standards (4.00 / 1) (#3)
by creo on Fri Mar 07, 2003 at 10:42:55 AM EST

I had a quick look at the NIST site describing the mode process, which included details of their process, the workshops and the various modes up for submission.

Interestingly only only proposals with an IP tab are the IBM ones. IANAL, and I dont even play at being one, but the letter talks about how the submission is covered by the IBM patent and then goes on with some legalese about non exclusive license and reasonable terms and conditions. Is this the RAND phrase I hear so much?

There are other parallelizing strategys for MACS proposed. I browsed one (and promptly fried my brain with all the math and shit in there) but made a bee line for the IP section. In this the author claimed that he was unaware of any encumberance, and was willing to grant unrestricted use for private and educational purposes, otherwise RAND.

What I don't get is how can you make something a standard where royalties are to be collected? To me, that automatically disqualifies a process from being standards accepted. The whole point of a standard is it is exactly that - something that everyone can benchmark to, and knows what they are getting. Anyway I'm going to stop ranting and get back to work.

The site is very interesting, and I strongly recommend that you point it out in your article. It is CSRC Modes of Operation. I hope to read some more of the papers over the weekend.

Just a quick thought before I go. The paper I did read thanked a lot of people who had helped the author. Quite a few of these people also had submissions up. So it sounds like a pretty closed bunch, so how do they work out who came up with what? Anyway, must get back to work.

Cheers
Creo.

What's wrong with including a CRC in the encrypted (4.50 / 4) (#4)
by criquet on Fri Mar 07, 2003 at 10:49:20 AM EST

data?

I have to encrypt "03 123 654 000123". I create a CRC of that string and include it in the string to encrypt. I can also include a means to distribute the CRC thoughout the string being encrypted.

When decrypting the string, I extract the CRC and verify that the decrypted string produces the same CRC.

Won't Work (5.00 / 1) (#7)
by creo on Fri Mar 07, 2003 at 11:25:41 AM EST

The MAC is used for 2 purposes - message integrity and authenticity. CRC addresses the first point, but not the second. To generate a MAC you and the receiver both need the key used (I'm assuming block cipher, such as the CBC MAC described above). Thus only you and the other party can verify the MAC.

Example. You have an ATM communicating to a Host. Using a CRC you can verify that the message the message received was the one sent - however you have no way of knowing that the ATM sent the message or if it was sent by Suzy Spoofer, trying to imitate the ATM. The ATM will have the loaded key and thus you know that the message has been generated by the keyholding device.

I did not explain that too well, but I think you get the idea. Having just reread your post I can see that you are saying embed the CRC inside the ciphertext. This still does not solve the problem of authentication. Also encrypting the entire message is unusual, although some institutions do it. It may not be common knowledge, but nearly all interchange and device trafiic is unencrypted - usually only the PIN blocks are encrypted.

Cheers
Creo.

[ Parent ]

Absolutely right. (none / 0) (#11)
by criquet on Fri Mar 07, 2003 at 11:55:04 AM EST

Yep, I overlooked the incredibly important authentication aspect.

[ Parent ]
not understanding (none / 0) (#46)
by kubalaa on Sat Mar 08, 2003 at 03:17:25 PM EST

If the MAC is encrypted inside the cyphertext, then nobody could generate it without having the encryption key. Isn't "having the encryption key" authentication enough, since we're talking about symmetric encryption? If Suzy Spoofer can encrypt the message, she can probably just as easily get ahold of the MAC secret and encrypt the MAC with that.

In other words, the MAC system just means you symmetrically encrypt the CRC. Whether you use the same key as the rest of the message, or a separate key, is equally secure (theoretically).

[ Parent ]

CRC (none / 0) (#8)
by Bad Harmony on Fri Mar 07, 2003 at 11:36:41 AM EST

A CRC is an error check, not an authentication check. If the bad guy knows the type of CRC being used, he can compute a modification to the message that preserves the CRC value of the message. As an example, say that our message contains a 16-digit credit card number. The error check is to divide that 16-digit number by 997, using the remainder as the check digits. The bad guy can change the credit card number to any number that is congruent to the original number modulo 997.

5440' or Fight!
[ Parent ]

Only if they can break the encryption. (none / 0) (#9)
by criquet on Fri Mar 07, 2003 at 11:44:29 AM EST

The CRC is encrypted within the message so there is no way to tamper with it even knowing the CRC algorithm.

[ Parent ]
Encrypted CRC (none / 0) (#10)
by Bad Harmony on Fri Mar 07, 2003 at 11:50:28 AM EST

The point is that the bad guy does not need to modify the encrypted CRC. He modifies the message in such a way that the modified message has the same CRC as the original message. He does not need to know the encryption key or the CRC value of the message.

5440' or Fight!
[ Parent ]

Maybe I'm still missing something (5.00 / 1) (#13)
by criquet on Fri Mar 07, 2003 at 12:01:58 PM EST

Here's my encrypted data that includes the CRC for the original unencrypted data: A343FFC087D11D

How do you modify that so that it affects the embedded, encrypted data and CRC in such a way as to ensure that, when decrypted, the CRC still matches the data?

[ Parent ]

My Mistake (none / 0) (#15)
by Bad Harmony on Fri Mar 07, 2003 at 12:42:58 PM EST

I think we are talking about two different things. The traditional way to add a message authentication check to a message is to compute a function on the plaintext of the message, and append the encrypted value of the function to the plaintext message. The receiver computes the same function on the plaintext of the received message and compares the value to the decrypted message authentication code. The problem with the CRC function is that it is easy to modify the plaintext of a message in such a way that the value of the CRC function for the modified message is the same as it is for the original message.

5440' or Fight!
[ Parent ]

Identical CRCs (none / 0) (#17)
by czth on Fri Mar 07, 2003 at 01:25:32 PM EST

You maintain that it is easy to modify the plaintext of a message in such a way that the value of the CRC function for the modified message is the same as it is for the original message. I say that while it may be possible but I doubt that it is easy; furthermore, it's hard to have the new message be valid.

The (base 64) MD5 digest for "this is a test" is "VLDFjHzp8qi1UTURAu4JOA". To prove your point, kindly come up with another valid message (here I'll define "valid" as "English text") whose digest is also "VLDFjHzp8qi1UTURAu4JOA", and show how you did it. For verification, I used the following command to get the digest:

perl -MDigest::MD5=md5_base64 -wle 'print md5_base64("this is a test")'

Thank you.

czth

[ Parent ]

MD5 (none / 0) (#18)
by Bad Harmony on Fri Mar 07, 2003 at 01:37:35 PM EST

MD5 is not a CRC algorithm. A CRC produces check bits by dividing the message polynomial by the CRC polynomial and using the remainder as the check bits. The shift register seen in hardware implementations of the CRC algorithm is a hardware divider circuit.

5440' or Fight!
[ Parent ]

MD5 and CRC are both hashing algorithms (none / 0) (#20)
by fluffy grue on Fri Mar 07, 2003 at 02:34:40 PM EST

Specifically, they're hashing algorithms used to verify the integrity of a message. He's playing fast and loose with the term "CRC," like many people do; wherever he says "CRC," replace it with "data integrity check."

And his point is that you can use md5 or crc for the same sort of message integrity check as what this article proposes.
--
"Is a hyperlink" is a hyperlink.
"Is not a quine" is not a quine.

Cats: Nature's entropy generators

[ [ Parent ]

I disagree. (none / 0) (#22)
by sllort on Fri Mar 07, 2003 at 03:05:41 PM EST

Hash algorithms provide a data integrity check, not an authentication check. Read this comment. An authenticated stream cipher is a legitimate advancement.
--
Warning: On Lawn is a documented liar.
[ Parent ]
True (none / 0) (#37)
by fluffy grue on Fri Mar 07, 2003 at 09:03:47 PM EST

But a hash can be combined with a private key to form authentication. For example, your message looks like: [data][hash'] where hash' is a hash comprised of the data, a randomly-generated session ID (known to both parties), and the private key (known to both parties). (Obviously it has to be shared between the sender and the recipient though.) I believe battle.net uses something like that for its password check.
--
"Is a hyperlink" is a hyperlink.
"Is not a quine" is not a quine.

Cats: Nature's entropy generators

[ [ Parent ]

Lots of math... (none / 0) (#40)
by sllort on Fri Mar 07, 2003 at 10:25:33 PM EST

Isn't one of the preconditions of the problem they were trying to solve that you weren't allowed to perform multiple operations on the same data? I thought the whole idea was that you didn't compute a hash on the data and then compute the encryption on the same data, doubling the compute time. I can't see how hashing the data and combining it with a key and then encrypting it meets the compute-once condition, but maybe I'm missing something...
--
Warning: On Lawn is a documented liar.
[ Parent ]
No... (none / 0) (#43)
by fluffy grue on Sat Mar 08, 2003 at 12:28:10 AM EST

You only have to compute once. You hash the session key and password along with the text. You don't have a separate hash for the text. That is, the returned message is (to use C++-ish syntax): data + hash(data + sessionKey + authKey).
--
"Is a hyperlink" is a hyperlink.
"Is not a quine" is not a quine.

Cats: Nature's entropy generators

[ [ Parent ]

Agreed, but then you have to encrypt. (n/t) (none / 0) (#47)
by sllort on Mon Mar 10, 2003 at 10:35:58 AM EST


--
Warning: On Lawn is a documented liar.
[ Parent ]
CRCs (none / 0) (#25)
by Bad Harmony on Fri Mar 07, 2003 at 03:54:21 PM EST

CRCs are designed to be error-detection codes. Any usefulness as a hashing algorithm is a side effect. Their mathematical structure makes them a poor choice for use in authentication codes, just like maximal-length LFSRs are a poor choice for key generators in encryption systems.

5440' or Fight!
[ Parent ]

Picky... (none / 0) (#30)
by czth on Fri Mar 07, 2003 at 05:09:40 PM EST

Yeah, the other guy's right, I'm playing fast and loose with the definition of CRC; of course I know the difference, but given a large enough CRC, you'd have a hard time making another valid message that would have the same CRC as the first. I'd wager you'd even have a little trouble doing that with "standard" CRC-32; if you can fulfil the original request but using CRC-32 instead of MD5, that would be fairly impressive.

czth

[ Parent ]

news? (none / 0) (#5)
by cronian on Fri Mar 07, 2003 at 11:03:37 AM EST

Isn't this basically the same thing as PGP signatures? PGP already has an algortihm where the authenticity can be verified with the public key.

We perfect it; Congress kills it; They make it; We Import it; It must be anti-Americanism
Not really (none / 0) (#14)
by Miniluv on Fri Mar 07, 2003 at 12:09:11 PM EST

Its the same concept, but a much more efficient implementation. Also, PGP allows you to sign unencrypted text, whereas this is only for encrypted text.

The point is that in this implementation you encrypt the text at the same time that you generate the digital signature, rather than encrypting and then signing, or vice versa.

"Too much wasabi and you'll be crying like you did at the last ten minutes of The Terminator" - Alton Brown
[ Parent ]

Technique is bollocks, writeup is good, +1 (none / 0) (#6)
by gordonjcp on Fri Mar 07, 2003 at 11:15:21 AM EST


Give a man a fish, and he'll eat for a day. Teach a man to fish, and he'll bore you rigid with fishing stories for the rest of your life.


Define bollocks? (5.00 / 1) (#36)
by sllort on Fri Mar 07, 2003 at 07:02:08 PM EST

Is that some slang word for awesome? A single pass authenticated stream cipher has been the holy grail of many people for some time...
--
Warning: On Lawn is a documented liar.
[ Parent ]
That'll teach me to reply quickly... (none / 0) (#44)
by gordonjcp on Sat Mar 08, 2003 at 05:36:54 AM EST

The technique is actually a bit useful, but I think the limitations placed on the encryption hardware are a bit arbitrary. After all, the PIC microcontrollers can do a fairly rapid Blowfish crypt.

Give a man a fish, and he'll eat for a day. Teach a man to fish, and he'll bore you rigid with fishing stories for the rest of your life.


[ Parent ]
The importance of faster encryption (5.00 / 2) (#12)
by MichaelCrawford on Fri Mar 07, 2003 at 11:56:20 AM EST

Some of you might be asking yourselves why you should get excited at an algorithm that merely doubles the speed of an encryption process, given the availability of 3 GHz Pentium 4 processors, itaniums and the like.

Well, I'll tell you.

Encryption in practice is often performed with embedded processors that have considerably less horsepower than a P4. Power consumption might also be an issue, being able to encrypt with half the work means that it requires half the electricity.

For example, the ARM7TDMI microcontroller that I implemented the AES algorithm (Rijndael) on for FireWire Encrypt runs at 49 MHz, has 64k of flash ROM and just a tiny bit of RAM. The flash ROM has a 16-bit data bus.

The measured disk i/o performance turns out to be completely limited by the encryption/decryption speed, so if I were able to double that, then people could read or write files twice as fast.

That's why, when I was tuning up the encryption, I posted an Ask Kuro5hin asking for help on ARM Assembly Code Optimization.

And it turned out that my hand-optimized assembly turned out to be pretty non-obvious. Most computer science types say you should optimize by improving the algorithm, but what do you do when you're already using the best known algorithm? You write really bizarre assembly, that's what.

Thus I would be pretty stoked to find a better algorithm for Rijndael encryption than what I'm using, which is a method based on lookup tables.


--

Live your fucking life. Sue someone on the Internet. Write a fucking music player. Like the great man Michael David Crawford has shown us all: Hard work, a strong will to stalk, and a few fries short of a happy meal goes a long way. -- bride of spidy


Lookup Tables (none / 0) (#16)
by Bad Harmony on Fri Mar 07, 2003 at 01:15:00 PM EST

I did some work on a software Reed-Solomon encoder for the IBM PC. It used lots of GF(256) arithmetic. I precomputed tables for the common operations. When I tried to optimize the code I found that its performance was being killed by the latency of main memory. The combination of table lookups and reading the input data was blowing out the cache. The CPU was spending most of its time waiting for memory accesses to complete.

5440' or Fight!
[ Parent ]

Making lookup tables faster (none / 0) (#32)
by MichaelCrawford on Fri Mar 07, 2003 at 05:22:51 PM EST

If your lookup table is big enough that it can't all fit in the cache, and the elements of the lookup table are small, you may be able to speed things up by disabling caching for the memory range the lookup table occupies.

That can be enabled by some assembly instructions, but unfortunately these are usually privileged instructions, so it needs operating system support, and the operating system may simply not provide a way to do it. But it could in principle.

In the case of the Rijndael algorithm, the lookup table method really is the best way, because several logical operations that would otherwise require quite a few operations can be folded together into a few table lookups.

Also there is some bit parallelism, where the lookup table elements hold four bytes, where the straightforward way to do it would operate on a byte at a time.

Also I originally did it the straightforward way using the description of the algorithm, without the lookup table optimization. It was astoundingly slow. I was sick to my stomach to see how slow it was - it took twelve hours to initialize a drive (just laying down a filesystem) and Windows would time out before a file of any significant size could copy.

One unfortunate thing for me is that the lookup tables are too big to fit in the RAM of the Oxford 911 chip. I have to put them in the flash, which only has a 16 bit data bus and also has a slower access time than the ram. So the time to load a lookup table element does turn out to be my limiting factor.


--

Live your fucking life. Sue someone on the Internet. Write a fucking music player. Like the great man Michael David Crawford has shown us all: Hard work, a strong will to stalk, and a few fries short of a happy meal goes a long way. -- bride of spidy


[ Parent ]

Followup article? (none / 0) (#21)
by GGardner on Fri Mar 07, 2003 at 03:03:51 PM EST

I enjoyed your original article, asking the ARM question. A follow up article which describes what worked, what didn't work, and how you sped up the program would be very interesting. And it might be good advertising for your consultancy :-)

[ Parent ]
That's a good idea (none / 0) (#31)
by MichaelCrawford on Fri Mar 07, 2003 at 05:15:44 PM EST

I think I do that. However, I won't be able to do it for a while yet because things are really busy now. But I'll start jotting down notes.


--

Live your fucking life. Sue someone on the Internet. Write a fucking music player. Like the great man Michael David Crawford has shown us all: Hard work, a strong will to stalk, and a few fries short of a happy meal goes a long way. -- bride of spidy


[ Parent ]

Include a hash of the message. NT (3.20 / 5) (#33)
by bjlhct on Fri Mar 07, 2003 at 06:36:36 PM EST

Duuuuuuuuhhhhhh.

*
[kur0(or)5hin http://www.kuro5hin.org/intelligence] - drowning your sorrows in intellectualism
Ooops! Hashes won't work at all! (3.66 / 3) (#34)
by sllort on Fri Mar 07, 2003 at 06:59:31 PM EST

Hashes don't provide authentication, they only provide message integrity. Including a hash wouldn't protect you from spoofing attacks. I commend you for thinking creatively though.
--
Warning: On Lawn is a documented liar.
[ Parent ]
Duh again. (2.25 / 4) (#38)
by bjlhct on Fri Mar 07, 2003 at 09:05:55 PM EST

The hash is of the plaintext.

So again, duuuuh.

*
[kur0(or)5hin http://www.kuro5hin.org/intelligence] - drowning your sorrows in intellectualism
[ Parent ]

Oops, RTFA. (5.00 / 2) (#39)
by sllort on Fri Mar 07, 2003 at 10:21:35 PM EST

"If you need encryption and message integrity, just append a MAC to the message, then encrypt the whole thing. But for some environments, like smart cards, you might find that the only cryptographic primitive you've got room for is the AES block cipher."
You forgot to read the article. In the problem they're trying to solve, you don't have room for a hash in the cleartext. You also aren't allowed to do two computations over the same data (CBC-MAC style):
It's occured to quite a few people that there must be a way of doing the work only once, and getting both encryption and message integrity out of one block cipher pass over the message.
So obviously MD5'ing the cleartext violates both the preconditions of the problem. The only way you could use MD5 that would not be a violation of the preconditions would be to MD5 the entire encrypted stream before sending... and that still violates #2, a little. Forgive me for thinking you read the article. It's stunning how many people are rating comments and posting without even reading this article and understanding it. I feel like I'm reading Slashdot again.
--
Warning: On Lawn is a documented liar.
[ Parent ]
There are plenty of fast hashes. (none / 0) (#41)
by bjlhct on Fri Mar 07, 2003 at 11:28:56 PM EST

This isn't a game with the preconditions as the rules. The author was missing some other possibilities when (he) wrote the article. You're missing the forest for the trees.

*
[kur0(or)5hin http://www.kuro5hin.org/intelligence] - drowning your sorrows in intellectualism
[ Parent ]
Or in other words, (none / 0) (#42)
by bjlhct on Fri Mar 07, 2003 at 11:53:55 PM EST

I read the article, but I don't think you understand it.

*
[kur0(or)5hin http://www.kuro5hin.org/intelligence] - drowning your sorrows in intellectualism
[ Parent ]
Message Integrity and IAPM | 47 comments (36 topical, 11 editorial, 0 hidden)
Display: Sort:

kuro5hin.org

[XML]
All trademarks and copyrights on this page are owned by their respective companies. The Rest 2000 - Present Kuro5hin.org Inc.
See our legalese page for copyright policies. Please also read our Privacy Policy.
Kuro5hin.org is powered by Free Software, including Apache, Perl, and Linux, The Scoop Engine that runs this site is freely available, under the terms of the GPL.
Need some help? Email help@kuro5hin.org.
My heart's the long stairs.

Powered by Scoop create account | help/FAQ | mission | links | search | IRC | YOU choose the stories!