Chapter 0: Review of Concepts & Notation
Return to Table of Contents
Modular Arithmetic
We write the set of integers and the set of natural numbers as:
Theorem 0.1: Divison Theorem
For all a,n ∈ ℤ with n ≠ 0, there exist unique q,r ∈ ℤ satisfying a = qn + r and 0 ≤ r < |n|. Since q and r are unique, we use ⎣a/n⎦ to denote q and a % n to denote r. Hence:
The % symbol is often called the modulo operator. Beware that some programming languages also have a % operator in which a % n always has the same sign as a. We will instead use the convention that a % n is always always nonnegative.
Definition 0.2
For x,n ∈ ℤ, we say that n divides x (or x is a multiple of n), and write n | x, if there exists an integer k such that kn = x.
We say that a and b are congruent modulo n, and write a ≡n b, if n | (a − b). Equivalently, a ≡n b if and only if a % n = b % n.
We write ℤn ≝ {0,. . . ,n − 1} to denote the set of integers modulo n.
In other textbooks you may have seen “a ≡n b” written as “a ≡ b (mod n)”.
There is a subtle — and often confusing — distinction between the expressions “a % n = b” and “a ≡n b.” In the first expression, “a % n” refers to an integer that is always between 0 and n − 1, and the equals sign denotes equality over the integers. In the second expression, the “≡n” symbol denotes congruence modulo n, which is a weaker condition than equality over the integers. Note that a = b implies a ≡n b, but not vice-versa.
Example
99 ≡10 19 because 10 divides 99 − 19 according to the definition. But 99 ≠ 19 % 10 because the right-hand side evaluates to the integer 19 % 10 = 9, which is not the same integer as the left-hand side 99.
When adding, subtracting, and multiplying modulo n, it doesn’t affect the final result to reduce intermediate steps modulo n. That is, we have the following facts about modular arithmetic:
Division is not always possible in ℤn; we will discuss this fact later in the class.
Strings
We write {0,1}n to denote the set of n-bit binary strings, and {0,1}∗ to denote the set of all (finite-length) binary strings. When x is a string of bits, we write |x| to denote the length (in bits) of that string, and we write x̄ to denote the result of flipping every bit in x. When it’s clear from context that we’re talking about strings instead of numbers, we write 0n and 1n to denote strings of n zeroes and n ones, respectively.
When x and y are strings of the same length, we write x ⊕ y to denote the bitwise exclusive-or (XOR) of the two strings. So, for example, 0011 ⊕ 0101 = 0110. The following facts about the XOR operation are frequently useful:
As a corollary:
We use notation x‖y to denote the concatenation of two strings x and y.
Functions
Let X and Y be finite sets. A function ? : X → Y is:
Probability
Definition 0.3
A (discrete) probability distribution ? over a set X of outcomes is a function ? : X → [0,1] that satisfies the condition:
We say that ? assigns probability ?(x) to outcome x. The setX is referred to as the support of ?.
A special distribution is the uniform distribution over a finite set X, which assigns probability 1/|X| to every element of X.
Let ? be a probability distribution over X. We write Pr?[A] to denote the probability of an event A, where probabilities are according to distribution ?. Typically the distribution ? is understood from context, and we omit it from the notation. Formally, an event is a subset of the support X, but it is typical to write Pr[cond] where “cond” is the condition that defines an event A = {x ∈ X | x satisfies condition cond}. Interpreting A strictly as a set, we have Pr?[A] ≝ Σx∈A?(x).
The conditional probability of A given B is defined as Pr[A | B] ≝ Pr[A ∧ B]/ Pr[B]. When Pr[B] = 0, we let Pr[A | B] = 0 by convention, to avoid dividing by zero.
Below are some convenient facts about probabilities:
Precise Terminology
It is common and tempting to use the word “random” when one really means “uniformly at random.” We’ll try to develop the habit of being more precise about this distinction.
It is also tempting to describe an outcome as either random or uniform. For example, one might want to say that “the string x is random.” But an outcome is not random; the process that generated the outcome is random. After all, there are many ways to come up with the same string x, and not all of them are random. So randomness is a property of the process and not an inherent property of the result of the process.
It’s more precise and a better mental habit to say that an outcome is “sampled or chosen randomly,” and it’s even better to be precise about what the random process was. For example, “the string x is chosen uniformly.”
Notation in Pseudocode
When ? is a probability distribution, we write “x ← ?” to mean that the value of x is sampled according to the distribution ?. We overload the “←” notation slightly, writing “x ← X” when X is a finite set to mean that x is sampled from the uniform distribution over X.
We will often discuss algorithms that make some random choices. When describing such algorithms, we will use statements like “x ← ?” in the pseudocode. If ? is an algorithm that takes input and also makes some internal random choices, then we can think of the output of ?(x) as a distribution — possibly a different distribution for each input x. Then we write “y ← ?(x)” to mean the natural thing: run ? on input x and assign the output to y. The use of the arrow “←” rather than an assignment operator “:=” is meant to emphasize that, even when x is fixed, the output ?(x) is a random variable depending on internal random choices made by ?.
Asymptotics (Big-O)
Let ? : ℕ → ℕ be a function. We characterize the asymptotic growth of ? in the following ways: