With my qualifying exam coming up in a couple months, I figured I could document some of the things I’ll be studying. For instance, as a part of my algorithms course, I learned RSA encryption. Unfortunately, the algorithm alone is pretty complicated, but I’m also responsible for understanding the number theory behind it. Time to brush up on my modular arithmetic!
RSA Encryption Overview
Before we get into the details, I figured we could start by talking about what RSA encryption is and how it works at a high level.
RSA encryption—which comes from the names of the inventors: Rivest, Shamir, and Adelman—is a method of encryption which relies on a trapdoor one-way function to generate a pair of keys for data encryption. One key is known as the private key, and it’s kept hidden for personal use. Meanwhile, the other key is known as the public key, and it’s distributed to anyone who might use it.
In tandem, these keys are used to exchange encrypted messages between individuals. For example, if I wanted to send you a message, I would encrypt it using your public key. To read it, you would decrypt it using your private key. Naturally, once the message is encrypted with the public key, only the private key can be used to decrypt it and vice versa.
In the case of RSA, the one-way function which is used to generate the keys is derived from the difficulty of prime factorization, the ability to decompose a number into its prime factors. In other words, RSA encryption ensures that it is easy to generate a pair of keys, but it’s very hard to figure out one of the keys given the other.
Regardless, in the following sections, I’ll cover a bit about the number theory behind RSA encryption, and I’ll cover the actual RSA encryption algorithm. A lot of this content is borrowed from The Ohio State University’s CSE 6331 lecture notes, but all the analysis is strictly my own.
Number Theory Background
To understand the algorithm behind RSA encryption, it’s helpful to have a little bit of background in number theory.
Modular Arithmetic
Much of RSA encryption is built off of modular arithmetic which uses a number system comprised of integers that wrap around at some limit. For most developers, modular arithmetic is practically second nature as integers in most programming languages have limits. However, the math behind the concept is significantly more complex.
Before we get into any of that fun stuff, let’s talk about the modulo operator. In many languages, the modulo operator is the percent sign (%). Most older languages don’t have a true modulo operator. Instead, they have a remainder operator. The difference is subtle, but it is one that matters.
With a true modulo operator, the modulus defines the range of values that you can cycle through. For example, 13 % 5
would be 3 because 13 cycles around 5 twice before settling on 3—like a clock with only five ticks (1, 2, 3, 4, 5 (0)). Coincidentally, 13 divided by 5 also has a remainder of 3, so it’s no surprise that the operations get confused.
That said, things get interesting when negative numbers are introduced. For instance, 13 % -5
would be -2 whereas the remainder would still be 3. In the case of modulo, we’ve defined a new cycle which contains only negative values (-5 (0), -4, -3, -2, -1).
For my own sanity, I actually tested a few of these in Python (true mod) and in Java (remainder). Here are the results:
>>> 17 % 4
1
>>> 17 % -4
-3
>>> -17 % 4
3
>>> -17 % -4
-1
> 17 % 4
1
> 17 % -4
1
> -17 % 4
-1
> -17 % -4
-1
With modular arithmetic, we get four distinct answers—one for each cycle orientation and direction. Meanwhile, remainder only gives us two—one for each dividend.
Congruence
While modular arithmetic alone isn’t all that interesting, it has some fun properties. In particular, we can start talking about congruence. To do that, we should probably cover a new convention called divides.
In mathematics, we show that a
divides b
with a new symbol: |
. For example, 3|9
states that 3 divides 9. While it’s a simple symbol, we can use it to define congruence.
Now, a
is congruent ( ≡ ) to b mod n
if n|(a-b)
. In other words, a
and b
have the same “remainder” when divided by n
. For example, 121 ≡ 16 mod 7
since both values have a remainder of 2 when divided by 7.
With this congruence relationship, we’re able to come up with some pretty interesting properties of modular arithmetic. For example, a ≡ a mod n
since a
and a
have the same “remainder” when divided by n
. We call this the reflexive property.
In addition to the reflexive property, there is also the symmetric property which states that a ≡ b mod n
is equivalent to b ≡ a mod n
. Finally, we have the transitive property which states that if a ≡ b mod n
and b ≡ c mod n
then a ≡ c mod n
.
Together, these three properties demonstrate that congruence modulo n is an equivalence relation. We can then use this relationship to begin grouping values that are congruent into sets: [a]n = {x ∈ Z : x ≡ a mod n}. For example, modulo 2 creates two sets of numbers: evens ([0]2) and odds ([1]2). These sets are called residue classes where a residue can be thought of as another word for remainder.
Groups
Unfortunately, there’s still quite a bit of number theory to slog through before we can really dig into the encryption algorithm. For instance, it’s important to explore the concept of groups.
At a high level, a group (G
) is a set in which a binary operator (*
) can be used to combine two elements into a third element. However, the relationship between the three elements must follow four conditions: closure, associativity, identity, and inverse.
- Closure:
∀x,y ∈ G, x*y ∈ G
(in words, for all x and y in G, the result of x * y is also in G) - Associativity:
x*(y*z) = (x*y)*z
- Identity:
∃e ∈ G s.t. ∀x ∈ G, e*x = x*e = x
(in words, there exists an element e in G such that for every element x the equation holds) - Inverse:
∀x ∈ G, ∃y ∈ G s.t. x * y = y * x = e
(in words, for every x in G, there exists a y in G such that performing the binary operation between the elements produces the identity element)
One example of a group is the set of all integers and the addition operator or (Z, +). Likewise, the same can be said for the set of all rational numbers, (Q, +), and the set of all real numbers, (R, +).
Residue Class Groups
With what we know about residue classes and groups, we can start to define groups of residue classes like the one bound by addition, (Zn, +). Specifically, Zn is defined as a set containing all the residue classes modulo n (i.e. {[0], [1], [2], ..., [n - 1]}
).
First, however, we should define a few operations for residue classes. As it turns out, residue classes have a simple property: they can be added and multiplied directly. In particular, if x ∈ [a], y ∈ [b]
, then x + y ∈ [a + b]
and x ⋅ y ∈ [a ⋅ b]
.
Now, is addition enough to constitute a group? As it turns out, yes. After all, it checks all four boxes:
- Closure: integer addition already passes this criteria
- Associativity: ditto!
- Identity: like integer addition, 0 is our identity element
- Inverse: again, integer addition defines -a as the inverse
Unfortunately, (Zn, ⋅ ) is not a group since 0-1 does not exist. To make matters worse, some inverses don’t exist even when the set of integers is strictly positive. In particular, we define the inverse such that for a ∈ Zn, a-1 exists iff gcd(a, n) = 1 where gcd is the greatest common divisor. In other words, a must be relatively prime to n.
Naturally, the next step would be to define a new set of residue classes which only contains relative primes to n. We call this set Zn* and it is defined as follows: {a ∈ Zn : gcd(a, n) = 1}. An example of this set would be Z12* which contains 1, 5, 7, and 11—all the relative primes to n between 0 and 11.
Now, is this new set enough to form a group with multiplication? Once again, yes! In particular, the associativity and identity properties follow multiplication. In addition, we can compute the inverse using the Extended Euclidean Algorithm. It’s the closure property which I find the most interesting. Somehow, a ⋅ b (mod n) is always in the set.
Cardinality
Next, an interesting property of sets and groups is cardinality: the size or number of elements (|S| where S is some set). With a typical set of modulo n residue classes, we have exactly n elements. How do we go about measuring the cardinality of our set of relative prime residue classes?
As it turns out, the cardinality of our set Zn* can be measured using Euler’s Totient function. There are two ways the function is used:
- φ(pe) = (p – 1)p(e-1) for prime p
- φ(ab) = φ(a)φ(b) if gcd(a,b) = 1
Knowing this fact, we can actually compute the cardinality for any n in Zn*. For example, let’s say n is 15. Then, φ(15) = φ(5)φ(3) = 4 ⋅ 2 = 8. Now, let’s see if we can list them all: {1, 2, 4, 7, 8, 11, 13, 14}.
Interestingly, there are a few properties that come out of knowing the cardinality of Zn*. For example, Langrange’s theorem shows that for any a in our group, a to the power of the cardinality of that group is equal to the identity of that group (a ∈ G, a|G| = e). From the example above:
28 mod 15 = 1.
As a corollary to that property, a to the power of some m is equal to a to the power of m mod the cardinality of G (a ∈ G, am = am mod |G|). Again, using the example above:
257 = 257 mod 8 = 21 = 2.
On top of that, Euler’s theorem states that for any a in Zn*, a to the power of the cardinality of Zn* is 1 (a ∈ Zn*, aφ(n) = 1). Using our example from above again:
78 mod 15 = 1.
Finally, Fermat’s Littler Theorem states that if a is in Zp* where p is a prime, then a to the cardinality of Zp* is equal to a to the power of p minus 1 or 1 (a ∈ Zp*, aφ(p) = ap-1 = 1). Using a new example where p is 11:
7φ(11) = 710 = 1.
RSA Encryption Algorithm
Given our new background in number theory, the RSA Encryption algorithm should be pretty straightforward.
Step 1: Choose Large Primes
To start, the first thing we want to do is pick two very large primes (>= 2048 bits). We want to do this because prime factorization is a very difficult task. While it’s extremely easy to multiply two primes to create a composite value, it’s much more difficult to figure out which two primes created that composite value (aka a one-way function).
Unfortunately, finding large prime numbers is not a trivial task. To do so, we generally use some form of guess-and-check method. In other words, we generate some large number of our desired length and test if it’s prime.
To test if a value is prime, we can brute force divide the value by all numbers between 2 and the square root of the value. Of course, this process is slow for the types of large values we’d like to test, so it would be nice if there were a better method.
Enter: the Fermat Test. Previously, we stated that if n is prime, then for any a: an-1 = 1 (mod n). In other words, pick a random a between 1 and n-1 and solve the equation. If the result is 1, we probably have a prime. However, there are composite values which pass this test, and they’re known as Carmichael Numbers.
To improve on the Fermat Test, the Miller-Rabin Test was born. In addition to computing the Fermat Test, the Miller-Rabin Test adds an additional probabilistic test which further rules out Carmichael Numbers. In the future, I might dig into that algorithm much deeper.
At any rate, with the two prime numbers (p and q) picked, we’ll want to compute n as the product of p and q.
Step 2: Compute the Encryption Keys
From n, we need to select the first e ncryption key, e. We do this by selecting a value between 1 and the cardinality of Zn* (aka φ(n)). As an additional criteria, e must be coprime to φ(n).
As an example, consider two small prime numbers: 13 and 23. In this case, n is p ⋅ q = 13 ⋅ 23 = 299. From there, φ(n) is simple to compute: (p – 1)(q – 1) = (12)(22) = 264. In other words, we need to select an e between 1 and 264 that is also coprime to 264. For the sake of simplicity, we’ll select some small prime number that doesn’t divide into 264. How about 19?
With e, we can compute the d ecryption key, d , as follows: d = e-1 mod φ(n). In other words, ed ≡ 1 mod φ(n). To do this, we can use Euclid’s Extended Algorithm, but for simplicity let’s use this Modular Multiplicative Inverse calculator. In this case, d = 139.
Step 3: Profit
With our two constant, we can begin encrypting and decrypting messages. To do that, we need to distribute our public key. By convention, we’ll use d in combination with n (139, 299) as our private key and e in combination with n (19, 299) as our public key.
From there, if someone wants to send us an encrypted message, they’ll take our public key and encrypt their message with it. For simplicity, they might take each character in the message and convert it to its ASCII value (m). Then, they’ll apply the public key to that value as follows: c = me mod n. When we decrypt that value, we’ll use the same function except e will be replaced by d (since they’re inverses): m = cd mod n.
Since we distribute n alongside e, we have to ensure that n is sufficiently large. Otherwise, someone can easily reduce n to its prime factors and compute d. As of today, the industry standard requires an n greater than 2048 bits.
Want to Learn More?
In the process of writing this article, I realized that I wasn’t going to be tested on it, so I stopped putting so much effort into it. As a result, there may be concepts you still want to explore like primality testing. Likewise, you probably want to see more concrete examples. Unfortunately, at this time, I won’t be able to provide any additional support.
However, if there’s enough interest, I may expand this article or write an additional article clarifying anything you’d like to know. As someone who struggles with the math side of Computer Science, I’m not sure how much help I can be, but I’m always willing to try!
In the meantime, help me make this feel like less of a waste of time by subscribing either as a member or through the email list.
If books aren’t your thing, but you’re interested in topics related to RSA Encryption, then check out some the following articles:
Thanks again for your support!
The post Understanding the Number Theory Behind RSA Encryption appeared first on The Renegade Coder.