Chapter 4: Basing Cryptography on Limits of Computation

Mike Rosulek, PhD

Chapter 4: Basing Cryptography on Limits of Computation

Return to Table of Contents

John Nash was a mathematician who earned the 1994 Nobel Prize in Economics for his work in game theory. His life was made into a successful movie, A Beautiful Mind.

In 1955, Nash was in correspondence with the United States National Security Agency (NSA),¹ discussing new methods of encryption that he had devised. In these letters, he also proposes some general principles of cryptography (bold highlighting not in the original):

¹The original letters, handwritten by Nash, are available at: http://www.nsa.gov/public_info/press_room/
2012/nash_exhibit.shtml.

We see immediately that in principle the enemy needs very little information to begin to break down the process. Essentially, as soon as r bits² of enciphered message have been transmitted the key is about determined. This is no security, for a practical key should not be too long. But this does not consider how easy or difficult it is for the enemy to make the computation determining the key. If this computation, although possible in principle, were sufficiently long at best then the process could still be secure in a practical sense

The most direct computation procedure would be for the enemy to try all 2^r possible keys, one by one. Obviously this is easily made impractical for the enemy by simply choosing r large enough.

In many cruder types of enciphering, particularly those which are not autocoding, such as substitution ciphers [letter for letter, letter pair for letter pair, triple for triple..] shorter means for computing the key are feasible, essentially because the key can be determined piece meal, one substitution at a time.

So a logical way to classify enciphering processes is by the way in which the computation length for the computation of the key increases with increasing length of the key. This is at best exponential and at worst probably a relatively small power of r, ar² or ar³, as in substitution ciphers.

Now my general conjecture is as follows: For almost all sufficiently complex types of enciphering, especially where the instructions given by different portions of the key interact complexly with each other in the determination of their ultimate effects on the enciphering, the mean key computation length increases exponentially with the length of the key, or in other words, with the information content of the key.

The significance of this general conjecture, assuming its truth, is easy to see. It means that it is quite feasible to design ciphers that are effectively unbreakable. As ciphers become more sophisticated the game of cipher breaking by skilled teams, etc. should become a thing of the past.

²Nash is using r to denote the length of the key, in bits

Nash’s letters were declassified only in 2012, so they did not directly influence the development of modern cryptography. Still, his letters illustrate the most important idea of “modern” cryptography: that security can be based on the computational difficulty of an attack rather than on the impossibility of an attack. In other words, we are willing to accept that breaking a cryptographic scheme may be possible in principle, as long as breaking it would require too much computational effort to be feasible.

We have already discussed one-time-secret encryption and secret sharing. For both of these tasks we were able to achieve a level of security guaranteeing that attacks are impossible in principle. But that’s essentially the limit of what can be accomplished with such ideal security. Everything else we will see in this class, and every well-known product of cryptography (public-key encryption, hash functions, etc.) has a “modern”-style level of security, which guarantees that attacks are merely computationally infeasible, not impossible.

4.1: Polynomial-Time Computation

Nash’s letters also spell out the importance of distinguishing between computations that take a polynomial amount of time and those that take an exponential amount of time.³ Throughout computer science, polynomial-time is used as a formal definition of “efficient,” and exponential-time (and above) is synonymous with “intractible.”

³Nash was surprisingly ahead of his time here. Polynomial-time wasn’t formally proposed as a natural definition for “efficiency” in computation until Alan Cobham, 10 years after Nash’s letter was written (Alan Cobham, The intrinsic computational difficulty of functions, in Proc. Logic, Methodology, and Philosophy of Science II, 1965). Until Nash’s letters were declassified, the earliest well-known argument hinting at the importance of polynomial-time was in a letter from Kurt Gödel to John von Neumann. But that letter is not nearly as explicit as Nash’s, and was anyway written a year later.

In cryptography, it makes a lot of sense to not worry about guaranteeing security in the presence of attackers with unlimited computational resources. Not even the most powerful nation-states can invest 2¹⁰⁰ CPU cycles towards a cryptographic attack. So the modern approach to cryptography (more or less) follows Nash’s suggestion, demanding that breaking a scheme requires exponential time.

Definition 4.1

A program runs in polynomial time if there exists a constant c > 0 such that for all sufficiently long input strings x, the program stops after no more than O(|x|^c) steps.

Polynomial time is not a perfect match to what we mean by “efficient.” Polynomial time includes algorithms with running time Θ(n¹⁰⁰⁰), while excluding those with running time Θ(n^{log log log n}). Despite that, it’s extremely useful because of the following closure property: repeating a polynomial-time process a polynomial number of times results in a polynomial-time process overall.

Security Parameter

The definition of polynomial-time is asymptotic, since it considers the behavior of a computer program as the size of the inputs grows to infinity. But cryptographic algorithms often take multiple different inputs to serve various purposes. To be absolutely clear about our “measuring stick” for polynomial time, we measure the efficiency of cryptographic algorithms (and adversaries!) against something called the security parameter, which is the number of bits needed to represent secret keys and/or randomly chosen values used in the scheme. We will typically use λ to refer to the security parameter of a scheme.

It’s helpful to think of the security parameter as a tuning knob for the cryptographic system. When we dial up this knob, we increase the size of the keys in the system, the size of all associated things like ciphertexts, and the required computation time of the associated algorithms. Most importantly, the amount of effort required by the honest users grows reasonably (as a polynomial function of the security parameter) while the effort required to violate security increases faster than any (polynomial-time) adversary can keep up.

Potential Pitfall: Numerical Algorithms

The public-key cryptographic algorithms that we will see are based on problems from abstract algebra and number theory. These schemes require users to perform operations on very large numbers. We must remember that representing the number N on a computer requires only ┌log₂ (N+1)┐ bits. This means that ┌log₂ (N+1)┐, rather than N, is our security parameter! We will therefore be interested in whether certain operations on the number N run in polynomial-time as a function of ┌log₂ (N + 1)┐, rather than in N. Keep in mind that the difference between running time O(log N) and O(N) is the difference between writing down a number and counting to the number.

For reference, here are some numerical operations that we will be using later in the class, and their known efficiencies:

By “efficient,” we mean polynomial-time. However, all of the problems in the right-hand column do have known polynomial-time algorithms on quantum computers.

4.2: Negligible Probabilities

In all of the cryptography that we’ll see, an adversary can always violate security simply by guessing some value that was chosen at random, like a secret key. However, imagine a system that has 1024-bit secret keys chosen uniformly at random. The probability of correctly guessing the key is 2⁻¹⁰²⁴, which is so low that we can safely ignore it.

We don’t worry about “attacks” that have such a ridiculously small success probability. But we would worry about an attack that succeeds with, say, probability 1/2. Somewhere between 2⁻¹⁰²⁴ and 2⁻¹ we need to find a sensible place to draw the line. In the same way that polynomial time formalizes “efficient” running times, we will use an asymptotic definition of what is a negligibly small probability.

Consider a scheme with keys that are λ bits long. Then a blind guess is correct with probability 1/2^λ. Now what about an adversary who makes 2 guesses, or λ guesses, or λ⁴² guesses? Such an adversary would still run in polynomial time, and might succeed in its attack with probability 2/2^λ , λ/2^λ , or λ⁴²/2^λ.

Our starting point here is 1/2^λ, and the nice thing about that probability is that, no matter what polynomial we place on top, we still get a very small probability indeed. This idea gives rise to our formal definition:

Definition 4.2: Negligible:

A function ? is negligible if, for every polynomial p, we have lim_λ→∞ p(λ)?(λ) = 0.

In other words, a negligible function approaches zero so fast that you can never catch up when multiplying by a polynomial. This is exactly the property we want from a security guarantee that is supposed to hold against all polynomial-time adversaries. If a polynomial-time adversary succeeds with probability ?, then running the same attack p times would still be an overall polynomial-time attack (if p is a polynomial), and potentially have success probability p · ?.

Example:

The function ?(λ) = 1/2^λ is negligible, since for any polynomial p(λ) = λ^c, we have:

since clog(λ) − λ approaches −∞ in the limit, for any constant c.

Similarly, functions like 1/2^λ/2, 1/2^{log² λ}, and 1/n^{log n} are all negligible.

In this class, when we see a negligible function, it will typically always be one that is easy to recognize as negligible (just as in an undergraduate algorithms course, you won’t really encounter algorithms where it’s hard to tell whether the running time is polynomial).

Definition 4.3: (f ≈ g)

If ?,g : ℕ → ℝ are two functions, we write ? ≈ g to mean that |?(λ) − g(λ)| is a negligible function.

We use the terminology of negligible functions exclusively when discussing probabilities, so the following are common:

⁴Pr[A] ≈ Pr[B] doesn’t mean that events A and B almost always happen together (when A and B are defined over a common probability space) — imagine A being the event “the coin came up heads” and B being the event “the coin came up tails.” These events have the same probability but never happen together. To say that “A and B almost always happen together,” you’d have to say something like Pr[A ⊕ B] ≈ 0, where A ⊕ B denotes the event that exactly one of A and B happens.

Additionally, the ≈ symbol is transitive:⁵ if Pr[X] ≈ Pr[Y] and Pr[Y] ≈ Pr[Z], then Pr[X] ≈ Pr[Z] (perhaps with a slightly larger, but still negligible, difference).

⁵It’s only transitive when applied a polynomial number of times. So you can’t define a whole series of events X_i, show that Pr[X_i] ≈ Pr[X_i+1], and conclude that Pr[X₁] ≈ Pr[X₂n]. It’s rare that we’ll encounter this subtlety in this course.

4.3: Indistinguishability

So far we have been writing formal security definitions in terms of interchangeable libraries, which requires that two libraries have exactly the same effect on every calling program. Going forward, our security definitions will not be quite as demanding. First, we only consider polynomial-time calling programs; second, we don’t require the libraries to have exactly the same effect on the calling program, only that the effect is negligible.

Definition 4.4: Indistinguishability

Let ℒ_left and ℒ_right be two libraries with a common interface. We say that ℒ_left and ℒ_right are indistinguishable, and write ℒ_left ≋ ℒ_right, if for all polynomial-time programs ? that output a single bit, Pr[? ◊ ℒ_left ⇒ 1] ≈ Pr[? ◊ ℒ_right ⇒ 1].

We call the quantity |Pr[? ◊ ℒ_left ⇒ 1] − Pr[? ◊ ℒ_right ⇒ 1]| the advantage or bias of ? in distinguishing ℒ_left from ℒ_right. Two libraries are therefore indistinguishable if all polynomial-time calling programs have negligible advantage in distinguishing them.

From the properties of the “≈” symbol, we can see that indistinguishability of libraries is also transitive, which allows us to carry out hybrid proofs of security in the same way as before.

And similar to before, we also have a library composition lemma (you are asked to prove this as an exercise):

Lemma 4.5: Composition:

If ℒ_left ≋ ℒ_right then ℒsup>∗ ◊ ℒ_left ≋ ℒ^∗ ◊ ℒ_right for any polynomial-time library ℒ^∗.

Bad-Event Lemma

A common situation is when two libraries carry out exactly the same steps until some exceptional condition happens. In that case, we can bound an adversary’s distinguishing advantage by the probability of the exceptional condition.

More formally, we can state a lemma of Bellare & Rogaway.⁶ We present it without proof, since it involves a syntactic requirement. Proving it formally would require a formal definition of the syntax/semantics of the pseudocode that we use for these libraries.

⁶Mihir Bellare & Phillip Rogaway: “Code-Based Game-Playing Proofs and the Security of Triple Encryption,” in Eurocrypt 2006. ia.cr/2004/331

Lemma 4.6: Bad Events

Let ℒ_left and ℒ_right be libraries that each define a variable bad that is initialized to 0. If ℒ_left and ℒ_right have identical code, except for code blocks reachable only when bad = 1 (e.g., guarded by an “if bad = 1” statement), then

4.4: Sampling with Replacement & the Birthday Bound

Below is an example of two libraries which are not interchangeable (they do have mathematically different behavior), but are indistinguishable. These two libraries happen to be convenient for later topics, as well.

Lemma 4.7: Replacement Sampling

Define two libraries as below

Then for all calling programs ? that make q queries to the SAMP subroutine, the advantage of ? in distinguishing the libraries is at most q(q − 1)/2^λ+1.

In particular, when ? is polynomial-time (in λ), then q grows as a polynomial in the security parameter. Hence, ? has negligible advantage, and ℒ^samp-L ≋ ℒ^samp-R.

ℒ^samp-L uniformly samples λ-bit strings with replacement (i.e., allowing for duplicates). In ℒ^samp-R, we’ve initialized the set R outside of any subroutine, by which we mean that R is a static (its value is maintained across different calls to SAMP) and private (its value is not directly accessible to the calling program). ℒ_samp-R uses the set R to keep track of which values have been previously sampled, to sample without replacement and ensure that there are no duplicate outputs.

The overall idea here is that the only way to distinguish the two libraries seems to be to wait for a repeated output of the SAMP algorithm. But when sampling uniformly from a huge set with 2^λ items, it is very unlikely (but not impossible!) to ever see a repeated value unless one asks for a huge number of samples. For that reason, the two libraries should be indistinguishable. This is an accurate intuition, but it’s not a formal proof. Among other things, we need to determine just how unlikely repeated outputs are, and argue that the distinguishing advantage is related to the probability of repeated outputs.

Proof:

Consider the following hybrid libraries:

First, let us prove some simple observations about these libraries:

ℒ_hyb-L ≡ ℒ_samp-L: Note that ℒ_hyb-L simply samples uniformly from {0,1}^λ. The extra R and bad variables in ℒ_hyb-L don’t actually have an effect on its external behavior (they are used only for convenience later in the proof).

ℒ_hyb-R ≡ ℒ_samp-R: Whereas ℒ_samp-R avoids repeats by simply sampling from {0,1}^λ\ R, this library ℒ_hyb-R samples r uniformly from {0,1}^λ and retries if the result happens to be in R. This method is called rejection sampling, and it has the same effect⁷ as sampling r directly from {0,1}^λ\ R.

⁷The two approaches for sampling from {0,1}^λ\ R may have different running times, but our model considers only the input-output behavior of the library.

We can also see that ℒ_hyb-L and ℒ_hyb-R differ only in code that is reachable only when bad = 1 (highlighted). So, using Lemma 4.6, we can bound the advantage of the calling program:

It suffices to show that in ? ◊ ℒ_hyb-L the variable bad is set 1 only with negligible probability. It turns out to be easier to reason about the complement event, and argue that bad remains 0 with probability negligibly close to 1.

Suppose ? makes q calls to the SAMP subroutine. Let r₁,. . . ,r_q be the responses of these calls. For variable bad to remain 0, it must be the case that for each i, we have r_i ∉ {r₁,. . . ,r_i-1}. Then,

Pr[bad remains 0 in ? ◊ ℒ_hyb-L]

For the inequality we used the convenient fact that when x and y are positive, we have (1 − x)(1 − y) = 1 − (x + y) + xy ⩾ 1 − (x + y). More generally, when all terms x_i are positive, ∏_i (1 − x_i) ⩾ 1 − ∑_i x_i.

Summing up, the advantage of ? in distinguishing ℒ_samp-L and ℒ_samp-R is:

When ? is a polynomial-time distinguisher, q is a polynomial function of λ, so ?’s advantage is negligible. This shows that ℒ_samp-L ≋ ℒ_samp-R.

Birthday Bound

As part of the previous proof, we showed that the probability of repeating an item while taking q uniform samples from {0,1}^λ is O(q²/2^λ). This is an upper bound, but we can come up with a lower bound as well:

Lemma 4.8: Birthday Bound

When taking q uniform samples from {0,1}^λ, where q ⩽ √(2^{λ + 1}), the probability of encountering a repeated value is at least 0.632 * (^q(q−1)/_2^λ+1). In particular, when q = √(2^λ+1), a repeated value is encountered with probability at least 1/2.

Proof:

We use the fact that for x ∈ [0,1], we have 1 − x ⩽ e^−x ⩽ 1 − (0.632…)x, where the constant 0.632 … is 1 −(¹/_e).

Let r₁,… ,r_q be the values uniformly sampled from {0,1}^λ. As above, we fail to encounter a repeated value if the r_i’s are all distinct. Applying the bounds from above, we have:

The second inequality follows from the fact that q(q − 1)/2^λ+1 < 1 by our bound on q. So the probability of a repeated value among the r_i’s is

More generally, when sampling uniformly from a set of N items, taking √(2N) samples is enough to ensure a repeated value with probability -0.63. The bound gets its name from considering q = √(2 · 365) ≈ 27 people in a room, and assuming that the distribution of birthdays is uniform across the calendar. With probability at least 0.63, two people will share a birthday. This counterintuitive fact is often referred to as the birthday paradox, although it’s not an actual paradox.⁸

⁸I think the actual birthday paradox is that the “birthday paradox” is not a paradox.

In the context of cryptography, we often design schemes in which the users take many uniform samples from {0,1}^λ. Security often breaks down when the same value is sampled twice, and this happens after about √(2^{λ + 1}) ∼ 2^λ/2 steps.

Exercises:

4.1: Which of the following are negligible functions in λ? Justify your answers.

4.2: Show that when ? is negligible, then for every polynomial p, the function p(λ)?(λ) not only approaches 0, but it is also negligible itself.

4.3: Prove that the ≈ relation is transitive. Let f,g,h : ℕ → ℝ be functions. Using the definition of the ≈ relation, prove that if ? ≈ g and g ≈ h then ? ≈ h. You may find it useful to invoke the triangle inequality: |a − c| ≤ |a − b| + |b − c|.

4.4: Prove Lemma 4.5.

4.5: A deterministic program is one that uses no random choices. Suppose ℒ₁ and ℒ₂ are two deterministic libraries with a common interface. Show that either ℒ₁ ≡ ℒ₂, or else ℒ₁ & ℒ₂ can be distinguished with advantage 1.

4.6: For this problem, consider the following two libraries:

(a). Prove that the two libraries are indistinguishable. More precisely, show that if an adversary makes q₁ number of calls to AVOID and q₂ calls to SAMP, then its distinguishing advantage is at most q₁q₂/2^λ. For a polynomial-time adversary, both q₁ and q₂ (and hence their product) are polynomial functions of the security parameter, so the advantage is negligible.

(b). Suppose an adversary makes a total of n_i calls to AVOID before making its ith call to SAMP, and furthermore the adversary makes q calls to SAMP overall (so that n₁ ⩽ n₂ ⩽ · · · ⩽ n_q). Show that the two libraries can be distinguished with advantage at least:

4.7: Prove the following generalization of Lemmas 4.7 & 4.8

Fix a value x ∈ {0,1}^λ. Then when taking q uniform samples from {0,1}^λ, the probability that there exist two distinct samples whose XOR is x is at least 0.63q(q − 1)/2^{λ – 1} and at most q(q − 1)/2^{λ + 1^.}

License

Icon for the Creative Commons Attribution-NonCommercial 4.0 International License

Return to Table of Contents

4.1: Polynomial-Time Computation

Definition 4.1

Security Parameter

Potential Pitfall: Numerical Algorithms

4.2: Negligible Probabilities

Definition 4.2: Negligible:

Example:

Definition 4.3: (f ≈ g)

4.3: Indistinguishability

Definition 4.4: Indistinguishability

Lemma 4.5: Composition:

Bad-Event Lemma

Lemma 4.6: Bad Events

4.4: Sampling with Replacement & the Birthday Bound

Lemma 4.7: Replacement Sampling

Proof:

Birthday Bound

Lemma 4.8: Birthday Bound

Proof:

Exercises:

License

Share This Book