Chapter 11: Message Authentication Codes

Return to Table of Contents

 

In a chosen-ciphertext attack, the adversary can decrypt arbitrary ciphertexts of its choosing. It would be helpful if there was a way for the decryption algorithm to distinguish between ciphertexts that were honestly generated (by the library itself) and ciphertexts created by the adversary. For example, if the decryption algorithm was able to detect and reject (i.e., raise an error) ciphertexts not created honestly, this could be a promising approach to achieve CCA security.

The problem of determining the origin of information is known as authentication, and it’s a different problem then that of hiding information. For example, we may wish to know whether a ciphertext was created honestly by the library itself, while there is no need to hide the ciphertext.

In the libraries that define CCA security, honestly generated ciphertexts are generated by the libraries with knowledge of the secret key k, and again decrypted by the library with knowledge of the key. We would like a way to prove to the receiver (decryption procedure) that a piece of information (the ciphertext) was created by someone (the encryption procedure) who knows the secret key.

The tool for the job is called a message authentication code, or MAC for short. A MAC scheme consists of two algorithms:

  • KeyGen: samples a secret key for the MAC
  • MAC(k,m): given the secret key k and a plaintext m ∈ ?, output a MAC (sometimes called a “tag”) t. Typically MAC is deterministic.

We will follow the (confusing) convention of terminology, and use the term “MAC” to sometimes refer to the scheme itself, and sometimes just the “tag” associated with the message (i.e., “the MAC of m”).

Think of a MAC as a kind of signature that will be appended to a piece of data. Only someone with the secret key can generate (and verify) a valid MAC. Our intuitive security requirement is that: an adversary who sees valid MACs of many messages cannot produce a forgery — a valid MAC of a different message.

 

11.1: Security Definition

This is a totally new kind of security requirement. It has nothing to do with hiding information. Rather, it is about whether the adversary can actually generate a particular piece of data (e.g., a MAC forgery).

To express such a requirement in a security definition, we need a slightly different approach to defining the libraries. Consider a much simpler statement: “no adversary should be able to guess a uniformly chosen value.” We can formalize this idea with the following two libraries:

MACLibrary1

The left library allows the calling program to attempt to guess a uniformly chosen “target” string. The right library doesn’t even bother to verify the calling program’s guess — in fact it doesn’t even bother to sample a random target string!

Focus on the difference between these two libraries. Their GUESS subroutines give the same output on nearly all inputs. There is only one input r on which they disagree. If a calling program can manage to find that input r, then it can easily distinguish the libraries. Therefore, if the libraries are indistinguishable, it means that the adversary cannot find/generate one of these special inputs! That’s the kind of property we want to express.

Indeed, in this case, an adversary who makes q queries to the GUESS subroutine achieves an advantage of at most q/2λ. For polynomial-time adversaries, this is a negligible advantage (since q is a polynomial function of λ).

 

The MAC security definition. Let’s follow the pattern from the simple example above. We want to say that no adversary can generate a MAC forgery. So our libraries will provide a mechanism to let the adversary check whether it has a forgery. One library will actually perform the check, and the other library will simply assume that a forgery can never happen. The two libraries are different only in how they behave when the adversary calls a subroutine on a true forgery. So by demanding that the two libraries be indistinguishable, we are actually demanding that it is difficult for the calling program to generate a forgery.

 

Definition 11.1: MAC Security

Let Σ be a MAC scheme. We say that Σ is a secure MAC ifΣmac-real ≋ ℒΣmac-fake, where:

SecureMACDefinition1

 

Discussion

  • We do allow the adversary to request MACs of chosen messages, hence the GETMAC subroutine. This means that the adversary is always capable of obtaining valid MACs. However, MACs that were generated by GETMAC don’t count as forgeries — only a MAC of a different message would be a forgery.

For this reason, the ℒmac-fake library keeps track of which MACs were generated by GETMAC, in the set ?. It is clear that these MACs should always be judged valid by VER. But for all other inputs to VER, ℒmac-fake simply answers false.

  • The adversary “wins” by successfully finding any forgery — a valid MAC of any “fresh” message. The definition doesn’t care whether it’s the MAC of any particular meaningful message.

 

11.2: A PRF is a MAC

The definition of a PRF says (more or less) that even if you’ve seen the output of the PRF on several chosen inputs, all other outputs look independently uniformly random. Furthermore, uniformly chosen values are hard to guess. Putting these two observations together, and we’re close to to the MAC definition. Indeed, a PRF is a secure MAC.

 

Claim 11.2

The scheme MAC(k,m) = F(k,m) is a secure MAC, when F is a secure pseudorandom function.

 

11.3: CBC-MAC

A PRF typically supports only messages of a fixed length, but we will soon see that it’s useful to have a MAC that supports longer messages. A classical approach to extend the input size of a MAC involves the CBC block cipher mode applied to a PRF.

 

Construction 11.3: CBC-MAC

Let F be a PRF with in = out = λ. Then for every fixed parameter ℓ, CBC-MAC refers to the following MAC scheme:

 

MACScheme1

 

CBC-MAC differs from CBC encryption mode in two important ways: First, there is no initialization vector. Indeed, CBC-MAC is deterministic (you can think of it as CBC encryption but with the initialization vector fixed to all zeroes). Second, CBC-MAC outputs only the last block.

 

Claim 11.4

If F is a secure PRF with in = out = λ, then for every fixed ℓ, CBC-MAC is a secure MAC for message space ? = {0,1}λℓ.

Note that we have restricted the message space to messages of exactly blocks. Unlike CBC encryption, CBC-MAC is not suitable for messages of variable lengths. If the adversary is allowed to request the CBC-MAC of messages of different lengths, then it is possible for the adversary to generate a forgery (see the homework).

 

11.4: Encrypt-Then-MAC

Our motivation for studying MACs is that they seem useful in constructing a CCA-secure encryption scheme. The idea is to combine a MAC with a CPA-secure encryption scheme. The decryption algorithm can verify the MAC and raise an error if the MAC is invalid. There are several natural ways to combine a MAC and encryption scheme, but not all are secure! (See the exercises.) The safest way is known as encrypt-then-MAC:

 

Construction 11.5: Enc-then-MAC

Let E denote an encryption scheme, and M denote a MAC scheme where E.? ⊆ M.? (i.e., the MAC scheme is capable of generating MACs of ciphertexts in the E scheme). Then let EtM denote the encrypt-then-MAC construction given below:

EncryptThenMACConstruction

Importantly, the scheme computes a MAC of the CPA ciphertext, and not of the plaintext! The result is a CCA-secure encryption scheme:

 

Claim 11.6

If E has CPA security and M is a secure MAC, then EtM (Construction 11.5) has CCA security.

 

Proof

As usual, we prove the claim with a sequence of hybrid libraries:

Chapter10ProofFix1

The starting point is ℒEtMcca-L, shown here with the details of the encrypt-then-MAC construction highlighted. Our goal is to eventually swap mL with mR. But the CPA security of E should allow us to do just that, so what’s the catch?

To apply the CPA-security of E, we must to factor out the relevant call to E.Enc in terms of the CPA library ℒEcpa-L. This means that ke becomes private to the ℒcpa-L library. But ke is also used in the last line of the library as E.Dec(ke,c). The CPA security library for E provides no way to carry out such E.Dec statements!

Chapter10ProofFix2

The operations of the MAC scheme have been factored out in terms of ℒMmac-real. Notably, in the DEC subroutine the condition “t ≠ M.MAC(km,c)” has been replaced with “not VER(c,t).”

Chapter10ProofFix3

We have applied the security of the MAC scheme, and replaced ℒmac-real with ℒmac-fake.

Chapter10ProofFix4

We have inlined the ℒmac-fake library. This library keeps track of a set S of values for the purpose of the CCA interface, but also a set ? of values for the purposes of the MAC. However, it is clear from the code of this library that S and ? always have the same contents.

Therefore, the two conditions “(c,t) ∈ S” and “(c,t) ∉ ?” in the DEC subroutine are exhaustive! The final line of DEC is unreachable. This hybrid highlights the intuitive idea that an adversary can either query DEC with a ciphertext generated by CHALLENGE (the (c,t) ∈ S case) — in which case the response is null — or with a different ciphertext — in which case the response will be err since the MAC will not verify.

Chapter10ProofFix5

The unreachable statement has been removed and the redundant variables S and ? have been unified. Note that this hybrid library never uses E.Dec, making it possible to express its use of the E encryption scheme in terms of ℒcpa-L.

Chapter10ProofFix6

The statements involving the encryption scheme E have been factored out in terms of ℒcpa-L.

 

We have now reached the half-way point of the proof. The proof proceeds by replacing ℒcpa-L with ℒcpa-R, applying the same modifications as before (but in reverse order), to finally arrive at ℒcca-R. The repetitive details have been omitted, but we mention that when listing the same steps in reverse, the changes appear very bizarre indeed. For instance, we add an unreachable statement to the DEC subroutine; we create a redundant variable ? whose contents are the same as S; we mysteriously change one instance of S (the condition of the second if-statement in DEC) to refer to the other variable ?. Of course, all of this is so that we can factor out the statements referring to the MAC scheme (along with ?) in terms of ℒmac-fake and finally replace ℒmac-fake with ℒmac-real.

 

Exercises

11.1: Consider the following MAC scheme, where F is a secure PRF with in = out = λ:

Exercise74

Show that the scheme is not a secure MAC. Describe a distinguisher and compute its advantage.

 

11.2: Consider the following MAC scheme, where F is a secure PRF with in = out = λ:

Exercise50

Show that the scheme is not a secure MAC. Describe a distinguisher and compute its advantage.

 

11.3: Consider the following MAC scheme, where F is a secure PRF with in = out = λ:

Exercise75

Show that the scheme is not a secure MAC. Describe a distinguisher and compute its advantage.

 

11.4: Consider the following MAC scheme, where F is a secure PRF with in = 2λ and out = λ:

Exercise76

In the argument to F, we write imi to denote the integer i (written as a λ-bit binary number) concatenated with the message block mi. Show that the scheme is not a secure MAC. Describe a distinguisher and compute its advantage.

11.5: Suppose we expand the message space of CBC-MAC to ? = ({0,1}λ). In other words, the adversary can request a MAC on any message whose length is an exact multiple of the block length λ. Show that the result is not a secure MAC. Construct a distinguisher and compute its advantage.

Hint: Request a MAC on two single-block messages, then use the result to forge the MAC of a two-block message.

 

11.6: Let E be a CPA-secure encryption scheme and M be a secure MAC. Show that the following encryption scheme (called encrypt & MAC) is not CCA-secure:

Exercise77

Describe a distinguisher and compute its advantage.

 

11.7: Let E be a CPA-secure encryption scheme and M be a secure MAC. Show that the following encryption scheme Σ (which I call encrypt-and-encrypted-MAC) is not CCA-secure:

Exercise78

Describe a distinguisher and compute its advantage.

 

11.8: In Construction 8.4, we encrypt one plaintext block into two ciphertext blocks. Imagine applying the Encrypt-then-MAC paradigm to this encryption scheme, but (erroneously) computing a MAC of only the second ciphertext block.

In other words, let F be a PRP with block length blen = λ, and let M be a MAC scheme for message space {0,1}λ. Define the following encryption scheme:

Exercise54

Show that the scheme does not have CCA security. Describe a successful attack and compute its advantage.

Hint: Obtain valid encryptions (r,x,t), (r’,x’,t’), and (r”,x”,t”) of unknown plaintexts m, m’, m”. Consider what information about the plaintexts is leaked via:

Dec((ke,km), (r’,x,t)) ⊕ Dec((ke,km), (r”,x,t)) ⊕ x’x”.

 

11.9: When we combine different cryptographic ingredients (e.g., combining a CPA-secure encryption scheme with a MAC to obtain a CCA secure scheme) we generally require the two ingredients to use separate, independent keys. It would be more convenient if the entire scheme just used a single λ-bit key.

(a). Suppose we are using Encrypt-then-MAC, where both the encryption scheme and MAC have keys that are λ bits long. Refer to the proof of security in the notes (11.4) and describe where it breaks down when we modify Encrypt-then-MAC to use the same key for both the encryption & MAC components:

Exercise79

(b). While Encrypt-then-MAC requires independent keys ke and km for the two components, show that they can both be derived from a single key using a PRF.  In more detail, let F be a PRF with in = 1 and out = λ. Prove that the following modified Encrypt-then-MAC construction is CCA-secure:

Exercise80

You should not have to re-prove all the tedious steps of the Encrypt-then-MAC security proof. Rather, you should apply the security of the PRF in order to reach the original Encrypt-then-MAC construction, whose security we already proved (so you don’t have to repeat).

License

Icon for the Creative Commons Attribution-NonCommercial 4.0 International License

The Joy of Cryptography OE (1st) Copyright © 2017 by Mike Rosulek is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License, except where otherwise noted.

Share This Book