Chapter 3: Secret Sharing
Return to Table of Contents
DNS is the system that maps human-memorable Internet domains like irs.gov to machine-readable IP addresses like 166.123.218.220. If an attacker can masquerade as the DNS system and convince your computer that irs.gov actually resides at some other IP address, it’s not hard to imagine the kind of trouble you might be facing.
To protect against these kinds of attacks, a replacement called DNSSEC is slowly being rolled out. DNSSEC uses cryptography to make it impossible to falsify a domain-name mapping. The cryptography required to authenticate DNS mappings is certainly interesting, but an even more fundamental question is, who can be trusted with the master cryptographic keys to the system? The non-profit organization in charge of these kinds of things (ICANN) has chosen the following system. The master key is split into 7 pieces and distributed on smart cards to 7 geographically diverse people, who keep them in safe-deposit boxes.
At least five key-holding members of this fellowship would have to meet at a secure data center in the United States to reboot [DNSSEC] in case of a very unlikely system collapse.
“If you round up five of these guys, they can decrypt [the root key] should the West Coast fall in the water and the East Coast get hit by a nuclear bomb,” Richard Lamb, program manager for DNSSEC at ICANN, told TechNewsDaily.1
How is it possible that any 5 out of the 7 key-holders can reconstruct the master key, but (presumably) 4 out of the 7 cannot? The solution lies in a cryptographic tool called a secret-sharing scheme, the topic of this chapter.
3.1: Definitions:
Definition 3.1: Secret-Sharing
A t-out-of-n threshold secret-sharing scheme (TSSS) consists of the following algorithms:
- Share: a randomized algorithm that takes a message m ∈ ? as input, and outputs a sequence s = (s1,. . . ,sn) of shares.
- Reconstruct: a deterministic algorithm that takes a collection of t or more shares as input, and outputs a message.
We call ? the message space of the scheme, and t its threshold. As usual, we refer to the parameters/components of a scheme Σ as Σ.t, Σ.n, Σ.?, Σ.Share, Σ.Reconstruct.
The correctness property for such a scheme is that any subset of at least t shares is enough to reconstruct the message m. That is, for all sets U ⊆ {1,. . . ,n} with |X| ≥ t and for all s ← Share(m), we have Reconstruct ((si)i ∈ U ) = m.
In typical usage, then shares are distributed to n different users. In the above definition, the set U corresponds to a set of users. If |U| ⩾ t, we say that the set is authorized; otherwise the set is unauthorized.
Just like the encryption definition does not address the problem of key distribution, the secret-sharing definition does not address the problem of who should run the Share algorithm (if its input m is so secret that it cannot be entrusted to any single person), nor of how the shares are meant to be sent securely to the n different users. Those concerns are considered out of scope by the problem of secret-sharing (although we later discuss clever approaches to the first problem). Rather, the focus is simply on whether it is even possible to encode data in such a way that it can be recovered (only) if a threshold number of shares are present.
Security Definition
We’d like a security guarantee that says something like:
if you have an unauthorized set of shares, then you learn no information about the choice of secret message.
To translate this informal statement into a formal security definition, we follow the The Central Principle of Security DefinitionsTM and define two libraries that allow the calling program to learn a set of shares (for an unauthorized set), and that differ only in which secret is shared. If the two libraries are interchangeable, then we conclude that seeing an unauthorized set of shares leaks no information about the choice of secret message. The definition looks like this:
Definition 3.2: TSSS Security
Let Σ be a threshold secret-sharing scheme. We say that Σ is secure if ℒΣtsss-L ≡ ℒΣtsss-R, where:
In an attempt to keep the notation uncluttered, we have not written the type of the argument U , which is U ⊆ {1,. . . ,Σ.n}.
Discussion:
- Similar to the definition of one-time secrecy of encryption, we let the calling program choose the two secret messages that will be shared. As before, this models the fact that an adversary may have partial knowledge or influence over what inputs might be used in the secret-sharing scheme.
- The calling program also chooses the set U of users’ shares to obtain. The libraries make it impossible for the calling program to obtain the shares of an authorized set (returning err in that case).
- Consider a 6-out-of-10 threshold secret-sharing scheme. With the libraries above, the calling program can receive the shares of users {1,. . . ,5} (an unauthorized set) in one call to QUERY, and then receive the shares of users {6,. . . ,10} in another call. It might seem like the calling program can then combine these shares to reconstruct the secret (if the same message was shared in both calls). However, this is not the case because these two sets of shares came from two independent executions of the Share algorithm. Shares generated by one call to Share should not be expected to function with shares generated by another call, even if both calls to Share used the same secret message.
- The calling program sees shares corresponding to the same set of users in both libraries. Recall that interchangeable libraries hide their internal differences. In this case, the set of users is the same in the two libraries, so the security definition does not require shares to hide the identity of the users they belong to. Only the secret message needs to be hidden according to this definition. Indeed, if each user’s identity were appended to his/her corresponding share, it would have no effect on whether ℒΣtsss-L ≡ ℒΣtsss-R.
One could certainly write a security definition that required shares to be “anonymous” in this way, hiding the identity of their associated user. This could be accomplished by having the calling program provide two sets UL and UR, with the two libraries using different sets.2 The result would be a perfectly reasonable definition, it just wouldn’t be the most standard one used to describe secret sharing. In many applications of secret sharing, it is in fact very convenient to give different users very different-looking kinds of values as their shares.
2These two sets would have to be the same size, since we cannot disguise the number of shares being returned by the QUERY subroutine.
3.2: A Simple 2-out-of-2 Scheme
Believe it or not, we have already seen a simple secret-sharing scheme! In fact, it might even be best to think of one-time pad as the simplest secret-sharing scheme, since by itself it is not so useful for encryption.
Construction 3.3: 2-out-of-2 TSSS
Since it’s a 2-out-of-2 scheme, the only authorized set of users is {1,2}, so we write Reconstruct to expect as input both shares s1 and s2. Correctness follows easily from what we’ve already learned about the properties of XOR.
Theorem 3.4:
Construction 3.3 is a secure 2-out-of-2 threshold secret-sharing scheme.
Proof:
Let Σ denote Construction 3.3. We will show that ℒΣtsss-L ≡ ℒΣtsss-R using a hybrid proof.
As usual, the starting point is ℒΣtsss-L, shown here with the details of the secret-sharing scheme filled in (and the types of the subroutine arguments omitted to reduce clutter).
It has no effect on the library’s behavior if we duplicate the main body of the library into 3 branches of a new if-statement. The reason for doing so is that the scheme generates s1 and s2 differently. This means that our proof will eventually handle the 3 different unauthorized sets ({1}, {2}, and ∅) in fundamentally different ways.
The definition of s2 has been changed in the first if-branch. This has no effect on the library’s behavior since s2 is never actually used in this branch.
Recognizing the second branch of the if-statement as a one-time pad encryption (of mL under key s1), we factor out the generation of s2 in terms of the library ℒOTPots-L from the one-time secrecy definition. This has no effect on the library’s behavior. Importantly, the subroutine in ℒOTPots-L expects two arguments, so that is what we must pass. We choose to pass mL and mR for reasons that should become clear very soon.
We have replaced ℒOTPots-L with ℒOTPots-R. From the one-time secrecy of one-time pad (and the composition lemma), this change has no effect on the library’s behavior
A subroutine has been inlined; no effect on the library’s behavior.
The code has been simplified. Specifically, the branches of the if-statement can all be unified, with no effect on the library’s behavior. The result is ℒΣtsss-R.
We showed that ℒΣtsss-L ≡ ℒhyb-1 ≡ · · · ≡ ℒhyb-5 ≡ ℒΣ tsss-R, and so the secret-sharing scheme is secure.
We in fact proved a slightly more general statement. The only property of one-time pad we used was its one-time secrecy. Substituting one-time pad for any other one-time secret encryption scheme would still allow the same proof to go through. So we actually proved the following:
Theorem 3.5:
If Σ is an encryption scheme with one-time secrecy, then the following 2-out-of-2 threshold secret-sharing scheme S is secure:
3.3: Polynomial Interpolation
You are probably familiar with the fact that two points determine a line (in Euclidean geometry). It is also true that 3 points determine a parabola, and so on. The next secretsharing scheme we discuss is based on the following principle:
A note on terminology: If ? is a polynomial that can be written as ?(x) = Σdi=0fixi, then we say that ? is a degree–d polynomial. It would be more technically correct to say that the degree of ? is at most d since we allow the leading coefficient ?d to be zero. For convenience, we’ll stick to saying “degree-d” to mean “degree at most d.”
Theorem 3.6: Poly Interpolation
Let p be a prime, and let {(x1,y1),. . . , (xd+1,yd+1)} ⊆ (ℤp)2 be a set of points whose xi values are all distinct. Then there is a unique degree-d polynomial ? with coefficients in ℤp that satisfies yi ≡ p ?(xi) for all i.
Proof:
Let ?(x) = Σdi=0?ixi be a degree-d polynomial. Consider the problem of evaluating ? on a set of points x1,. . . ,xd+1. We can express the values of ? as a linear function in the following way:
What’s notable about this expression is that V is a special kind of matrix called a Vandermonde matrix. A Vandermonde matrix is determined by the values x1,. . . ,xd+1. Then the (i,j) entry of the matrix is xj−1i. One important property of Vandermonde matrices (that we won’t prove here) is that the determinant of a square Vandermonde matrix is:
The Vandermonde matrix V in our expression is square, having dimensions (d + 1) × (d + 1). Also, since all of the xi values are distinct, the expression for the determinant must be nonzero. That means V is invertible!
So, knowing {(x1,y1),. . . , (xd+1,yd+1)}, we can solve for the coefficients of ? in the following equation:
Note the matrix inverse in the second equation. There is a unique solution for the vector f = (f0,. . . , fd), hence a unique degree-d polynomial f fitting the points.
The proof was written as if the linear algebra was over the real numbers (or complex numbers if you prefer). Indeed, the statement is true for real and complex numbers. However, all of the logic still holds when the linear equations are over ℤp, when p is a prime. Formally, this is because ℤp is a field (working modulo p, you have addition, multiplication, and inverses of nonzero elements). Most of what you know about linear algebra “just works” when matrix operations are in a field. Replacing the “=” sign (integer equality) with “≡p” (congruence modulo p), and all the steps of the proof still hold.
3.4: Shamir Secret Sharing
Part of the challenge in designing a secret-sharing scheme is making sure that any authorized set of users can reconstruct the secret. We have just seen that any d + 1 points on a degree-d polynomial are enough to reconstruct the polynomial. So a natural approach for secret sharing is to let each user’s share be a point on a polynomial.
That’s exactly what Shamir secret sharing does. To share a secret m ∈ ℤp with threshold t, we choose a degree-(t − 1) polynomial ? that satisfies ?(0) ≡p m, with all other coefficients chosen uniformly in ℤp. The ith user receives the point (i, ?(i)%p) on the polynomial. The interpolation theorem shows that any t shares can uniquely determine the polynomial ?, and hence recover the secret ?(0) ≡p m.
Construction 3.7: Shamir SSS
Correctness follows from the interpolation theorem. To show the scheme’s security, we first show a convenient lemma about the distribution of shares in an unauthorized set:
Lemma 3.8
Let p be a prime and define ℤ+p ≝ ℤp \ {0}. Then following two libraries are interchangeable:
In other words, if we evaluate a uniformly chosen degree t − 1 polynomial on fewer than t points, the results are (jointly) uniformly distributed.
Proof:
We will prove the lemma here for the special case where the calling program always provides a set U with |U| = t − 1. Exercise 3.5 deals with the more general case.
Fix a message m ∈ ℤp, fix set U of users with |U| = t − 1, and for each i ∈ U fix a value yi ∈ ℤp. We wish to consider the probability that a call to QUERY(m,t,U) outputs ((i,yi))i ∈ U, in each of the two libraries.
In library ℒssss-rand, this event happens with probability 1/pt − 1 since QUERY chooses the t − 1 different yi values uniformly in ℤp.
In library ℒssss-real, the event happens if and only if the degree-(t − 1) polynomial ?(x) chosen by QUERY happens to pass through the set of points ? = {(i,yi) | i ∈ U } ∪ {(0,m)}. These are t points with distinct x-coordinates, so by Theorem 3.6 there is a unique degree-(t − 1) polynomial ? with coefficients in ℤp passing through these points.
The QUERY subroutine picks ? uniformly from the set of degree-(t − 1) polynomials satisfying ?(0) ≡p m, of which there are pt−1. Exactly one such polynomial causes the event in question, so the probability of the event is 1/pt − 1.
Since the two libraries assign the same probability to all outcomes, we have ℒssss-real ≡ ℒssss-rand.
Theorem 3.9:
Shamir’s secret-sharing scheme (Construction 3.7) is secure according to Definition 3.2.
Proof:
Let S denote the Shamir secret-sharing scheme. We prove that ℒStsss-L ≡ ℒStsss-R via a hybrid argument.
Our starting point is ℒStsss-L, shown here with the details of Shamir secret-sharing filled in.
Almost the entire body of the QUERY subroutine has been factored out in terms of the ℒssss-real library defined above. The only thing remaining is the “choice” of whether to share mL or mR. Restructuring the code in this way has no effect on the library’s behavior.
By Lemma 3.8, we can replace ℒssss-real with ℒssss-rand, having no effect on the library’s behavior.
The argument to QUERY’ has been changed from mL to mR. This has no effect on the library’s behavior, since QUERY’ is actually ignoring its argument in these hybrids.
Applying the same steps in reverse, we can replace ℒssss-rand with ℒssss-real, having no effect on the library’s behavior.
A subroutine has been inlined, which has no effect on the library’s behavior. The resulting library is ℒStsss-R.
We showed that ℒStsss-L ≡ ℒhyb-1 ≡ · · · ≡ ℒhyb-4 ≡ ℒStsss-R, so Shamir’s secret sharing scheme is secure.
3.5: Visual Secret Sharing
Here is a fun (but not particularly useful) variant of 2-out-of-2 secret-sharing. Visual secret sharing is a secret-sharing technique where the secret and both shares are black-and-white images. We require the same security property as traditional secret-sharing — that is, a single share (image) by itself reveals no information about the secret (image). What makes visual secret sharing different is that we require the reconstruction procedure to be done visually.
More specifically, we imagine each share to be printed on transparent sheets. When the two shares are stacked on top of each other, the secret image is revealed visually. We will discuss a simple visual secret sharing scheme that is inspired by the following observations:
Importantly, when stacking shares on top of each other in the first two cases, the result is a 2 × 2 block that is half-black, half white (let’s call it “gray”); while in the other cases the result is completely black.
So the idea is to process each pixel of the source image independently, and to encode each pixel as a 2 × 2 block of pixels in each of the shares. A white pixel should be shared in a way that the two shares stack to form a “gray” 2 × 2 block, while a black pixel is shared in a way that results in a black 2 × 2 block.
More formally:
Construction 3.10
It is not hard to see that share s1 leaks no information about the secret image m, because it consists of uniformly chosen 2 × 2 blocks. In the exercises you are asked to prove that s2 also individually leaks nothing about the secret image.
Note that whenever the source pixel is white, the two shares have identical 2×2 blocks (so that when stacked, they make a “gray” block). Whenever a source pixel is black, the two shares have opposite blocks, so stack to make a black block.
Example
Exercises:
3.1: Generalize Construction 3.3 to be an n-out-of-n secret-sharing scheme, and prove that your scheme is correct and secure.
3.2: Prove Theorem 3.5.
3.3: Fill in the details of the following alternative proof of Theorem 3.4: Starting with ℒtsss-L, apply the first step of the proof as before, to duplicate the main body into 3 branches of a new if-statement. Then apply Exercise 2.8 to the second branch of the if statement. Argue that mL can be replaced with mR and complete the proof.
3.4: Suppose T is a fixed (publicly known) invertible n × n matrix over ℤp, where p is a prime.
(a). Show that the following two libraries are interchangeable:
(b). Show that the following two libraries are interchangeable:
3.5: The text gives a proof of Lemma 3.8 for the special case where the calling program always calls QUERY with |U| = t − 1. This exercise shows one way to complete the proof. Define the following “wrapper” library:
(a). Argue that ℒsss-real ≡ ℒwrap ◊ ℒ’sss-real, where on the right-hand side ℒ’sss-real refers to ℒsss-real but with its QUERY subroutine renamed to QUERY’.
(b). Argue that ℒsss-rand ≡ ℒwrap ◊ ℒ’sss-rand, with the same interpretation as above.
(c). Argue that for any calling program ?, the combined program ? ◊ ℒwrap only calls QUERY’ with |U| = t − 1. Hence, the proof presented in the text applies when the calling program has the form ? ◊ ℒwrap.
(d). Combining the previous parts, show that ℒsss-real ≡ ℒsss-rand (i.e., the two libraries are interchangeable with respect to arbitrary calling programs).
3.6: Let S be a TSSS with ? = {0,1}ℓ and where each share is also a string of bits. Prove that if S satisfies security then every user’s share must be at least ℓ bits long.
Hint: Prove the contrapositive. Suppose the first user’s share is less than ℓ bits (and that this fact is known to everyone). Show how users 2 through t can violate security by enumerating all possibilities for the first user’s share.
3.7: n users have shared two secrets using Shamir secret sharing. User i has a share si = (i,yi) of the secret m, and a share s’i = (i,y’i) of the secret m’ . Both sets of shares use the same prime modulus p.
Suppose each user i locally computes zi = yi + y’i % p.
(a). Prove that if the shares of m and shares of m’ had the same threshold, then the resulting {(i,zi) | i ≤ n} are a valid secret-sharing of the secret m + m‘.
(b). Describe what the users get when the shares of m and m’ had different thresholds (say, t and t‘, respectively).
3.8: Suppose there are 5 people on a committee: Alice (president), Bob, Charlie, David, Eve. Suggest how they can securely share a secret so that it can only be opened by:
- Alice and any one other person
- Any three people
Describe in detail how the sharing algorithm works and how the reconstruction works (for all authorized sets of users).
3.9: Suppose there are 9 people on an important committee: Alice, Bob, Carol, David, Eve, Frank, Gina, Harold, & Irene. Alice, Bob & Carol form a subcommittee; David, Eve & Frank form another subcommittee; and Gina, Harold & Irene form another subcommittee.
Suggest how a dealer can share a secret so that it can only be opened when a majority of each subcommittee is present. Describe why a 6-out-of-9 threshold secret-sharing scheme does not suffice
3.10: Generalize the previous exercise. A monotone formula is a boolean function Φ :{0,1}n → {0,1} that when written as a formula uses only AND and OR operations (no NOTS). For a set A ⊆ {1,. . . , n}, let χA be the bitstring where whose ith bit is 1 if and only if i ∈ A.
For every monotone formula ϕ : {0,1}n → {0,1}, construct a secret-sharing scheme whose authorized sets are {A ⊆ {1,. . . , n} | ϕ(χA) = 1}. Prove that your scheme is secure.
Hint: express the formula as a tree of AND and OR gates.
3.11: Prove that share s2 in Construction 3.10 is distributed independently of the secret m.
3.12: Construct a 3-out-of-3 visual secret sharing scheme. Any two shares should together reveal nothing about the source image, but any three reveal the source image when stacked together.
3.13: Using actual transparencies or with an image editing program, reconstruct the secret shared in these two images: