# Chapter 3: Secret Sharing

### Return to Table of Contents

DNS is the system that maps human-memorable Internet domains like irs.gov to machine-readable IP addresses like 166.123.218.220. If an attacker can masquerade as the DNS system and convince your computer that irs.gov actually resides at some other IP address, it’s not hard to imagine the kind of trouble you might be facing.

To protect against these kinds of attacks, a replacement called DNSSEC is slowly being rolled out. DNSSEC uses cryptography to make it impossible to falsify a domain-name mapping. The cryptography required to authenticate DNS mappings is certainly interesting, but an even more fundamental question is, who can be trusted with the master cryptographic keys to the system? The non-profit organization in charge of these kinds of things (ICANN) has chosen the following system. The master key is split into 7 pieces and distributed on smart cards to 7 geographically diverse people, who keep them in safe-deposit boxes.

*At least five key-holding members of this fellowship would have to meet at a secure data center in the United States to reboot [DNSSEC] in case of a very unlikely system collapse.*

*“If you round up five of these guys, they can decrypt [the root key] should the West Coast fall in the water and the East Coast get hit by a nuclear bomb,” Richard Lamb, program manager for DNSSEC at ICANN, told TechNewsDaily.*^{1}

How is it possible that any 5 out of the 7 key-holders can reconstruct the master key, but (presumably) 4 out of the 7 cannot? The solution lies in a cryptographic tool called a **secret-sharing scheme**, the topic of this chapter.

**3.1: Definitions:**

### Definition 3.1: Secret-Sharing

*A t-out-of-n threshold secret-sharing scheme (TSSS) consists of the following algorithms:*

- Share:
*a randomized algorithm that takes a*?**message**m ∈*as input, and outputs a sequence s = (s*_{1},. . . ,s_{n}) of**shares**. - Reconstruct:
*a deterministic algorithm that takes a collection of t or more shares as input, and outputs a message.*

*We call *?* the message space of the scheme, and t its threshold. As usual, we refer to the parameters/components of a scheme *Σ

*as*Σ.

*t,*Σ.

*n,*Σ.?

*,*Σ.Share, Σ.Reconstruct.

*The correctness property for such a scheme is that any subset of at least t shares is enough to reconstruct the message m. That is, for all sets U* ⊆ {1,. . . ,

*n*}

*with*|

*X*| ≥

*t and for all s ←*Share(

*m*),

*we have*Reconstruct ((

*s*)

_{i}_{i ∈ U}) =

*m*.

In typical usage, then shares are distributed to *n* different users. In the above definition, the set *U* corresponds to a set of users. If |*U*| ⩾ t, we say that the set is **authorized**; otherwise the set is **unauthorized**.

Just like the encryption definition does not address the problem of key distribution, the secret-sharing definition does not address the problem of *who* should run the Share algorithm (if its input *m* is so secret that it cannot be entrusted to any single person), nor of *how* the shares are meant to be sent securely to the *n* different users. Those concerns are considered out of scope by the problem of secret-sharing (although we later discuss clever approaches to the first problem). Rather, the focus is simply on whether it is even possible to encode data in such a way that it can be recovered (only) if a threshold number of shares are present.

#### Security Definition

We’d like a security guarantee that says something like:

if you have an unauthorized set of shares, then you learn no information about the choice of secret message.

To translate this informal statement into a formal security definition, we follow the The Central Principle of Security Definitions^{TM} and define two libraries that allow the calling program to learn a set of shares (for an *unauthorized* set), and that differ only in which secret is shared. If the two libraries are interchangeable, then we conclude that seeing an unauthorized set of shares leaks no information about the choice of secret message. The definition looks like this:

**Definition 3.2: TSSS Security**

*Let *Σ* be a threshold secret-sharing scheme. We say that *Σ* is secure if *ℒ

^{Σ}

_{tsss-L}≡ ℒ

^{Σ}

_{tsss-R},

*where:*

*In an attempt to keep the notation uncluttered, we have not written the type of the argument U , which is U* ⊆ {1,. . . ,Σ.*n*}.

### Discussion:

- Similar to the definition of one-time secrecy of encryption, we let the calling program choose the two secret messages that will be shared. As before, this models the fact that an adversary may have partial knowledge or influence over what inputs might be used in the secret-sharing scheme.
- The calling program also chooses the set
*U*of users’ shares to obtain. The libraries make it impossible for the calling program to obtain the shares of an*authorized*set (returning err in that case). - Consider a 6-out-of-10 threshold secret-sharing scheme. With the libraries above, the calling program can receive the shares of users {1,. . . ,5} (an unauthorized set) in one call to QUERY, and then receive the shares of users {6,. . . ,10} in another call. It might seem like the calling program can then combine these shares to reconstruct the secret (if the same message was shared in both calls). However, this is
*not*the case because these two sets of shares came from two*independent executions*of the Share algorithm. Shares generated by one call to Share should not be expected to function with shares generated by another call, even if both calls to Share used the same secret message. - The calling program sees shares corresponding to the
*same*set of users in both libraries. Recall that interchangeable libraries hide their internal*differences*. In this case, the set of users is the same in the two libraries, so the security definition does not require shares to hide the*identity*of the users they belong to. Only the secret*message*needs to be hidden according to this definition. Indeed, if each user’s identity were appended to his/her corresponding share, it would have no effect on whether ℒ^{Σ}_{tsss-L}≡ ℒ^{Σ}_{tsss-R}.

One could certainly write a security definition that required shares to be “anonymous” in this way, hiding the identity of their associated user. This could be accomplished by having the calling program provide two sets *U _{L}* and

*U*, with the two libraries using different sets.

_{R}^{2}The result would be a perfectly reasonable definition, it just wouldn’t be the most standard one used to describe secret sharing. In many applications of secret sharing, it is in fact very convenient to give different users very different-looking kinds of values as their shares.

^{2}These two sets would have to be the same size, since we cannot disguise the number of shares being returned by the QUERY subroutine.

**3.2: A Simple 2-out-of-2 Scheme**

Believe it or not, we have already seen a simple secret-sharing scheme! In fact, it might even be best to think of one-time pad as the simplest secret-sharing scheme, since by itself it is not so useful for encryption.

### Construction 3.3: 2-out-of-2 TSSS

Since it’s a 2-out-of-2 scheme, the only authorized set of users is {1,2}, so we write Reconstruct to expect as input both shares *s _{1}* and

*s*. Correctness follows easily from what we’ve already learned about the properties of XOR.

_{2}

### Theorem 3.4:

*Construction 3.3 is a secure 2-out-of-2 threshold secret-sharing scheme.*

#### Proof:

Let Σ denote Construction 3.3. We will show that ℒ^{Σ}_{tsss-L} ≡ ℒ^{Σ}_{tsss-R} using a hybrid proof.

As usual, the starting point is ℒ^{Σ}_{tsss-L}, shown here with the details of the secret-sharing scheme filled in (and the types of the subroutine arguments omitted to reduce clutter).

It has no effect on the library’s behavior if we duplicate the main body of the library into 3 branches of a new if-statement. The reason for doing so is that the scheme generates *s*_{1} and *s*_{2} differently. This means that our proof will eventually handle the 3 different unauthorized sets ({1}, {2}, and ∅) in fundamentally different ways.

The definition of *s*_{2} has been changed in the first if-branch. This has no effect on the library’s behavior since *s*_{2} is never actually used in this branch.

Recognizing the second branch of the if-statement as a one-time pad encryption (of *m _{L}* under key

*s*

_{1}), we factor out the generation of

*s*

_{2}in terms of the library ℒ

^{OTP}

_{ots-L}from the one-time secrecy definition. This has no effect on the library’s behavior. Importantly, the subroutine in ℒ

^{OTP}

_{ots-L}expects two arguments, so that is what we must pass. We choose to pass

*m*and

_{L}*m*for reasons that should become clear very soon.

_{R}We have replaced ℒ^{OTP}_{ots-L} with ℒ^{OTP}_{ots-R}. From the one-time secrecy of one-time pad (and the composition lemma), this change has no effect on the library’s behavior

A subroutine has been inlined; no effect on the library’s behavior.

The code has been simplified. Specifically, the branches of the if-statement can all be unified, with no effect on the library’s behavior. The result is ℒ^{Σ}_{tsss-R}.

We showed that ℒ^{Σ}_{tsss-L} ≡ ℒ_{hyb-1} ≡ · · · ≡ ℒ_{hyb-5} ≡ ℒ^{Σ} _{tsss-R}, and so the secret-sharing scheme is secure.

We in fact proved a slightly more general statement. The only property of one-time pad we used was its one-time secrecy. Substituting one-time pad for any other one-time secret encryption scheme would still allow the same proof to go through. So we actually proved the following:

### Theorem 3.5:

*If *Σ* is an encryption scheme with one-time secrecy, then the following 2-out-of-2 threshold secret-sharing scheme S is secure:*

**3.3: Polynomial Interpolation**

You are probably familiar with the fact that two points determine a line (in Euclidean geometry). It is also true that 3 points determine a parabola, and so on. The next secretsharing scheme we discuss is based on the following principle:

*d*+ 1 points determine a

*unique*degree-

*d*polynomial, and this is true even working modulo a prime.

A note on terminology: If ? is a polynomial that can be written as ?(*x*) = Σ^{d}_{i=0}*f _{i}x^{i}*, then we say that ? is a

**degree**–

*d*polynomial. It would be more technically correct to say that the degree of ? is at most

*d*since we allow the leading coefficient ?

*to be zero. For convenience, we’ll stick to saying “degree-*

_{d}*d*” to mean “degree at most

*d*.”

### Theorem 3.6: Poly Interpolation

*Let p be a prime, and let* {(*x*_{1},*y*_{1}),. . . , (*x*_{d+1},*y*_{d+1})} ⊆ (ℤ_{p})^{2} *be a set of points whose x _{i} values are all distinct. Then there is a unique degree-d polynomial* ?

*with coefficients in*ℤ

_{p}that satisfies

*y*≡

_{i}_{p}?(

*x*)

_{i}*for all i.*

#### Proof:

Let ?(*x*) = Σ^{d}_{i=0}?_{i}*x*^{i} be a degree-*d* polynomial. Consider the problem of evaluating ? on a set of points *x*_{1},. . . ,*x*_{d+1}. We can express the values of ? as a linear function in the following way:

What’s notable about this expression is that *V* is a special kind of matrix called a **Vandermonde** matrix. A Vandermonde matrix is determined by the values *x*_{1},. . . ,*x*_{d+1}. Then the (*i,j*) entry of the matrix is *x*^{j−1}_{i}. One important property of Vandermonde matrices (that we won’t prove here) is that the determinant of a square Vandermonde matrix is:

The Vandermonde matrix *V* in our expression is square, having dimensions (*d* + 1) × (*d* + 1). Also, since all of the *x _{i}* values are distinct, the expression for the determinant must be nonzero. That means V is

**invertible**!

So, knowing {(*x*_{1},*y*_{1}),. . . , (*x*_{d}+1,*y*_{d+1})}, we can solve for the coefficients of ? in the following equation:

Note the matrix inverse in the second equation. There is a unique solution for the vector *f* = (*f _{0},. . . , f_{d}*), hence a unique degree-

*d*polynomial

*f*fitting the points.

The proof was written as if the linear algebra was over the real numbers (or complex numbers if you prefer). Indeed, the statement is true for real and complex numbers. However, all of the logic still holds when the linear equations are over ℤ_{p}, when *p* is a prime. Formally, this is because ℤ_{p} is a *field* (working modulo *p*, you have addition, multiplication, and *inverses* of nonzero elements). Most of what you know about linear algebra “just works” when matrix operations are in a field. Replacing the “=” sign (integer equality) with “≡*p*” (congruence modulo *p*), and all the steps of the proof still hold.

**3.4: Shamir Secret Sharing**

Part of the challenge in designing a secret-sharing scheme is making sure that *any* authorized set of users can reconstruct the secret. We have just seen that *any* *d* + 1 points on a degree-*d* polynomial are enough to reconstruct the polynomial. So a natural approach for secret sharing is to let each user’s share be a point on a polynomial.

That’s exactly what **Shamir secret sharing** does. To share a secret *m* ∈ ℤ_{p} with threshold *t*, we choose a degree-(*t* − 1) polynomial ? that satisfies ?(0) ≡* _{p} m*, with all other coefficients chosen uniformly in ℤ

_{p}. The

*i*th user receives the point (

*i, ?(i)%p*) on the polynomial. The interpolation theorem shows that any

*t*shares can uniquely determine the polynomial ?, and hence recover the secret ?(0) ≡

*.*

_{p}m

### Construction 3.7: Shamir SSS

Correctness follows from the interpolation theorem. To show the scheme’s security, we first show a convenient lemma about the distribution of shares in an unauthorized set:

### Lemma 3.8

*Let p be a prime and define *ℤ* ^{+}_{p }*≝ ℤ

_{p}\ {0}. Then following two libraries are interchangeable:

*In other words, if we evaluate a uniformly chosen degree t* − 1 *polynomial on fewer than t points, the results are (jointly) uniformly distributed.*

#### Proof:

We will prove the lemma here for the special case where the calling program always provides a set *U* with |*U*| = *t* − 1. Exercise 3.5 deals with the more general case.

Fix a message *m* ∈ ℤ_{p}, fix set *U* of users with |*U*| = *t* − 1, and for each *i* ∈ *U* fix a value *y _{i}* ∈ ℤ

_{p}. We wish to consider the probability that a call to QUERY(

*m,t,U*) outputs ((

*i,y*

_{i}))

_{i ∈ U}, in each of the two libraries.

In library ℒ_{ssss-rand}, this event happens with probability 1/*p*^{t − 1} since QUERY chooses the *t* − 1 different *y _{i}* values uniformly in ℤ

_{p}.

In library ℒ_{ssss-real}, the event happens if and only if the degree-(*t* − 1) polynomial ?(*x*) chosen by QUERY happens to pass through the set of points ? = {(*i,y _{i}*) |

*i*∈

*U*} ∪ {(0,

*m*)}. These are

*t*points with distinct

*x*-coordinates, so by Theorem 3.6 there is a

*unique*degree-(

*t*− 1) polynomial ? with coefficients in ℤ

_{p}passing through these points.

The QUERY subroutine picks ? uniformly from the set of degree-(*t* − 1) polynomials satisfying ?(0) ≡* _{p} m*, of which there are

*p*

^{t−1}. Exactly one such polynomial causes the event in question, so the probability of the event is 1/

*p*

^{t − 1}.

Since the two libraries assign the same probability to all outcomes, we have ℒ_{ssss-real} ≡ ℒ_{ssss-rand}.

### Theorem 3.9:

*Shamir’s secret-sharing scheme (Construction 3.7) is secure according to Definition 3.2.*

### Proof:

Let *S* denote the Shamir secret-sharing scheme. We prove that ℒ^{S}_{tsss-L} ≡ ℒ^{S}_{tsss-R} via a hybrid argument.

Our starting point is ℒ^{S}_{tsss-L}, shown here with the details of Shamir secret-sharing filled in.

Almost the entire body of the QUERY subroutine has been factored out in terms of the ℒ_{ssss-real} library defined above. The only thing remaining is the “choice” of whether to share *m _{L}* or

*m*. Restructuring the code in this way has no effect on the library’s behavior.

_{R}By Lemma 3.8, we can replace ℒ_{ssss-real} with ℒ_{ssss-rand}, having no effect on the library’s behavior.

The argument to QUERY’ has been changed from *m _{L}* to

*m*. This has no effect on the library’s behavior, since QUERY’ is actually ignoring its argument in these hybrids.

_{R}Applying the same steps in reverse, we can replace ℒ_{ssss-rand} with ℒ_{ssss-real}, having no effect on the library’s behavior.

A subroutine has been inlined, which has no effect on the library’s behavior. The resulting library is ℒ^{S}_{tsss-R}.

We showed that ℒ^{S}_{tsss-L} ≡ ℒ_{hyb-1} ≡ · · · ≡ ℒ_{hyb-4} ≡ ℒ^{S}_{tsss-R}, so Shamir’s secret sharing scheme is secure.

**3.5: Visual Secret Sharing**

Here is a fun (but not particularly useful) variant of 2-out-of-2 secret-sharing. **Visual secret sharing** is a secret-sharing technique where the secret and both shares are black-and-white images. We require the same security property as traditional secret-sharing — that is, a single share (image) by itself reveals no information about the secret (image). What makes visual secret sharing different is that we require the reconstruction procedure to be done visually.

More specifically, we imagine each share to be printed on transparent sheets. When the two shares are stacked on top of each other, the secret image is revealed visually. We will discuss a simple visual secret sharing scheme that is inspired by the following observations:

Importantly, when stacking shares on top of each other in the first two cases, the result is a 2 × 2 block that is half-black, half white (let’s call it “gray”); while in the other cases the result is completely black.

So the idea is to process each pixel of the source image independently, and to encode each pixel as a 2 × 2 block of pixels in each of the shares. A white pixel should be shared in a way that the two shares stack to form a “gray” 2 × 2 block, while a black pixel is shared in a way that results in a black 2 × 2 block.

More formally:

### Construction 3.10

It is not hard to see that share *s*_{1} leaks no information about the secret image *m*, because it consists of uniformly chosen 2 × 2 blocks. In the exercises you are asked to prove that *s*_{2} also individually leaks nothing about the secret image.

Note that whenever the source pixel is white, the two shares have identical 2×2 blocks (so that when stacked, they make a “gray” block). Whenever a source pixel is black, the two shares have opposite blocks, so stack to make a black block.

#### Example

**Exercises:**

3.1: Generalize Construction 3.3 to be an *n*-out-of-*n* secret-sharing scheme, and prove that your scheme is correct and secure.

3.2: Prove Theorem 3.5.

3.3: Fill in the details of the following alternative proof of Theorem 3.4: Starting with ℒ_{tsss-L}, apply the first step of the proof as before, to duplicate the main body into 3 branches of a new if-statement. Then apply Exercise 2.8 to the second branch of the if statement. Argue that *m _{L}* can be replaced with

*m*and complete the proof.

_{R}3.4: Suppose *T* is a fixed (publicly known) invertible *n* × *n* matrix over ℤ_{p}, where *p* is a prime.

(a). Show that the following two libraries are interchangeable:

(b). Show that the following two libraries are interchangeable:

3.5: The text gives a proof of Lemma 3.8 for the special case where the calling program always calls QUERY with |*U*| = *t* − 1. This exercise shows one way to complete the proof. Define the following “wrapper” library:

(a). Argue that ℒ_{sss-real} ≡ ℒ_{wrap} ◊ ℒ’_{sss-real}, where on the right-hand side ℒ’_{sss-real} refers to ℒ_{sss-real} but with its QUERY subroutine renamed to QUERY’.

(b). Argue that ℒ_{sss-rand} ≡ ℒ_{wrap} ◊ ℒ’_{sss-rand}, with the same interpretation as above.

(c). Argue that for any calling program ?, the combined program ? ◊ ℒ_{wrap} only calls QUERY’ with |*U*| = *t* − 1. Hence, the proof presented in the text applies when the calling program has the form ? ◊ ℒ_{wrap}.

(d). Combining the previous parts, show that ℒ_{sss-real} ≡ ℒ_{sss-rand} (i.e., the two libraries are interchangeable with respect to *arbitrary* calling programs).

3.6: Let *S* be a TSSS with ? = {0,1}^{ℓ} and where each share is also a string of bits. Prove that if *S* satisfies security then every user’s share must be at least *ℓ* bits long.

*Hint:* Prove the contrapositive. Suppose the first user’s share is less than *ℓ* bits (and that this fact is known to everyone). Show how users 2 through *t* can violate security by enumerating all possibilities for the first user’s share.

3.7: *n* users have shared two secrets using Shamir secret sharing. User *i* has a share *s _{i}* = (

*i,y*) of the secret

_{i}*m*, and a share

*s’*= (

_{i}*i,y’*) of the secret

_{i}*m’*. Both sets of shares use the same prime modulus

*p*.

Suppose each user *i* locally computes *z _{i}* =

*y*+

_{i}*y’*.

_{i}% p(a). Prove that if the shares of *m* and shares of *m’* had the same threshold, then the resulting {(*i,z _{i}*) |

*i ≤ n*} are a valid secret-sharing of the secret

*m*+

*m*‘.

(b). Describe what the users get when the shares of *m* and *m’* had different thresholds (say, *t* and *t*‘, respectively).

3.8: Suppose there are 5 people on a committee: Alice (president), Bob, Charlie, David, Eve. Suggest how they can securely share a secret so that it can only be opened by:

- Alice and any one other person
- Any three people

Describe in detail how the sharing algorithm works and how the reconstruction works (for all authorized sets of users).

3.9: Suppose there are 9 people on an important committee: Alice, Bob, Carol, David, Eve, Frank, Gina, Harold, & Irene. Alice, Bob & Carol form a subcommittee; David, Eve & Frank form another subcommittee; and Gina, Harold & Irene form another subcommittee.

Suggest how a dealer can share a secret so that it can only be opened when a majority of each subcommittee is present. Describe why a 6-out-of-9 threshold secret-sharing scheme does **not** suffice

3.10: Generalize the previous exercise. A **monotone formula** is a boolean function *Φ* :{0,1}^{n} → {0,1} that when written as a formula uses only AND and OR operations (no NOTS). For a set *A* ⊆ {1,. . . , *n*}, let *χ _{A}* be the bitstring where whose

*i*th bit is 1 if and only if

*i*∈

*A*.

For every monotone formula *ϕ* : {0,1}^{n} → {0,1}, construct a secret-sharing scheme whose authorized sets are {*A* ⊆ {1,. . . , *n*} | *ϕ*(*χ _{A}*) = 1}. Prove that your scheme is secure.

*Hint*: express the formula as a tree of AND and OR gates.

3.11: Prove that share *s*_{2} in Construction 3.10 is distributed independently of the secret *m*.

3.12: Construct a 3-out-of-3 visual secret sharing scheme. Any two shares should together reveal nothing about the source image, but any three reveal the source image when stacked together.

3.13: Using actual transparencies or with an image editing program, reconstruct the secret shared in these two images: