Read The Numbers Behind NUMB3RS Online
Authors: Keith Devlin
It turns out that it is relatively easy to write a computer program to find large prime numbers, say, on the order of 150 digits. It is also easy to multiply two such large primes together to produce a single (composite) number of around 300 digits or more. But factoring a number of that size into its component primes is not at all easy, and indeed, to all intents and purposes, is impossible. (More precisely, it would take the fastest computer many decades, or even centuries, to find the factors.) The public key system based on this idea is called the RSA system, after the initials of the three inventors. The success of the method led to the establishment of a commercial company specializing in data security, RSA Data Security, Inc., based in Redwood City, California.
The secret deciphering key used in the RSA method consists essentially of two large prime numbers chosen by the user. (Chosen with the aid of a computerânot taken from any published list of primes, which an enemy might have access to!) The public enciphering key is the product of these two primes. Since there is no known fast method of factoring large numbers, it is practically impossible to recover the deciphering key from the public enciphering key. Message encryption corresponds to multiplication of two large primes (an easy computational task), decryption to the opposite process of factoring (a hard computational task).
We should point out that the encryption is not actually achieved by multiplying primes, nor is decryption carried out by factoring. Rather, that is how the keys are generated. That term “corresponds to” in the above description should be read very loosely. While encryption and decryption are not merely multiplication and factoring, the RSA system is, however, arithmetical. The message is first translated into numerical form, and the encryption and decryption processes consist of fairly simple arithmetical operations performed on numbers.
Clearly, then, the security of the RSA system, and accordingly of the many international data networks that use it, relies upon the
inability
of mathematicians to find an efficient method of factoring large numbers.
As you might expect, with so much at stake, the widespread use of the RSA system has spurred a considerable amount of research into the problems of finding primes and of factoring large numbers.
The obvious way to determine whether a number N is prime is to see if any smaller number divides it. A few moments' thought shows that you need only check to see if any number below or equal to ?N divides N. If N is fairly small, say three or four digits, this is feasible by hand; with a standard desktop PC, you could handle numbers with more digits. But the task becomes impractical when N has, say, fifty digits or more. However, there are other ways to check if a number N is prime, which do not require a blind search through all possible factors up to ?N
The methods actually used to test primality are all beyond the scope of this book, but a simple example will show how you can determine that a number is prime without having to look at and eliminate all possible factors. The example comes from the work of the great French mathematician Pierre de Fermat (1601â65).
Though only an “amateur” mathematician (he was a jurist by profession), Fermat produced some of the cleverest results mathematics has ever seen to this day. One of his observations was that if
p
is a prime number, then for any number
a
less than
p
, the number
a
pâ1
â 1 is divisible by
p
. For instance, suppose we take
p
= 7 and
a
= 2. Then
Â
a
pâ1
â 1 = 2
7â1
â 1 = 2
6
â 1 = 64 â 1 = 63
Â
and indeed 63 is divisible by 7. Try it yourself for any values of
p
(prime) and
a
(less than
p
). The result is always the same.
So here is a possible way of testing if a number
n
is prime or not. Compute the number 2
nâ1
â 1. See if
n
divides it. If it does not, then
n
cannot be prime. (Because if
n
was prime, then by Fermat's observation you would have divisibility of 2
nâ1
â 1 by
n
.) But what can you conclude if you find that
n
does divide 2
nâ1
â 1? Not, unfortunately, that
n
has to be prime. (Though this is quite likely to be the case.) The trouble is, while Fermat's result tells us that
n
divides 2
nâ1
â 1 whenever
n
is prime, it does not say that there are no composite numbers with the same property. (It is like saying that all motor cars have wheels; this does not prevent other things having wheelsâbicycles, for instance.) And in fact there are nonprimes which do have the Fermat property. The smallest one is 341, which is not prime, as it is the product of 11 and 31. If you were to check (on a computer) you would find that 341 does divide 2
340
â 1. (We shall see in a moment that there is no need to calculate 2
340
in making this check.) Composite numbers that behave like primes as far as the Fermat property is concerned are called pseudoprimes. So if, when you test for primality using the Fermat result, you discover that
n
does divide 2
nâ1
â 1, then all you can conclude is that either
n
is prime or else it is pseudoprime. (In this case the odds are heavily in favor of
n
actually being prime. For though there are in fact an infinity of pseudoprimes, they occur much less frequently than the real primes. For instance there are only two such numbers under 1,000, and only 245 below one million.)
In using the above test, it is not necessary to calculate the number 2
nâ1
, a number which will be very large for even quite modest values of
n
. You only need to find out whether or not
n
divides 2
nâ1
â 1. This means that multiples of
n
may be ignored
at any stage of the calculation
. To put it another way, what has to be calculated is the remainder that
would
be left if 2
nâ1
â 1 was divided by
n
. The aim is to see whether or not this remainder is zero, but since multiples of
n
will obviously not affect the remainder, they may be ignored. Mathematicians (and computer programmers) have a standard way of denoting remainders: the remainder left when
A
is divided by
B
is written as
Â
A mod B
Â
Thus, for example, 5 mod 2 is 1, 7 mod 4 is 3, and 8 mod 4 is 0.
As an example of the Fermat test, let us apply it to test the number 61 for primality. We need to calculate the number [2
60
â 1] mod 61, which can be written equivalently as [2
60
mod 61] â 1. If this is not zero, then 61 is not a prime. If it is zero, then 61 is either a prime or a pseudoprime (and in fact is a genuine prime, as we know already). We shall try to avoid calculating the large number 2
60
. We start with the observation that 2
6
= 64, and hence 2
6
mod 61 = 3. Then, since 2
30
= (2
6
)
5
, we get
2
30
mod 61 = (2
6
)
5
mod 61 = (3)
5
mod 61 = 243 mod 61 = 60
So,
2
60
mod 61 = (2
30
)
2
mod 61 = 60
2
mod 61 = 3,600 mod 61 = 1
Thus,
2
60
mod 61 â 1 = 0
Since the final answer here is 0, the conclusion is that 61 is either prime or pseudoprime, as we anticipated.
One of the methods professionals use to find large primes starts with the Fermat test just described and modifies the approach so it cannot be “fooled” by a pseudoprime. The reason we can't describe the method in this book is that it takes considerable effort, and some sophisticated mathematics, to circumvent the pseudoprime problem.
To date, there is no method to factor a large number that is even remotely as efficient as one of the primality testing methods, despite a considerable investment of talent and effort. Research into the problem has not been without some successes, however, and on several occasions mathematicians have come up with ingenious ways to find factors in usefully short computational time. When the RSA system was first put into use, factoring a number of around 120 digits was at the limit of what could be achieved. Improvements both in algorithm design and computer technology have since brought 120-digit numbers into the vulnerable range, so cryptographers have increased the size of RSA keys to well beyond that level. At the moment, many mathematicians believe it probably is not possible to find a method that can factor (in realistic time) numbers of 300 digits or more, so that is regarded as a safe key size.
That developments in factoring do indeed pose a genuine, if potential, threat to RSA codes was illustrated in dramatic fashion in April 1994, when a sophisticated method was used to crack a challenge problem in RSA cryptography that had been posed in 1977. The origin of the problem is itself of interest. In 1977, when Rivest, Shamir, and Adleman proposed their public-key encryption system, it was described by mathematics writer Martin Gardner in the August issue of
Scientific American
, in his popular mathematics column. There, Gardner presented a short message that had been encoded using the RSA scheme, using a 129-digit key resulting from the multiplication of two large primes. The message and the key were produced by researchers at MIT, who offered, through Gardner, $100 to the first person who managed to crack the code. The composite number that was the key to the code became known as RSAâ129. At the time, it was thought it would take more than 20,000 years to factor a 129-digit number of its kind, so the MIT group thought their money was safe. Two developments that followed were to result in the solution to the MIT challenge a mere seventeen years later.
The first was the development of so-called quadratic sieve methods for factoring large numbers. A crucial feature of these methods that was to prove significant in factoring RSAâ129 was that they effectively broke up the problem into a large number of smaller factorizationsâa process that, while still challenging, was at least feasible with a fairly fast computer. The second pivotal development was the Internet. In 1993, Paul Leyland of Oxford University, Michael Graff at Iowa State University, and Derek Atkins at MIT put out a call on the Internet for individuals to volunteer theirâand their personal computers'âtime for a massive, worldwide assault on RSAâ129. The idea was to distribute the various parts of the factorization problem yielded by the quadratic sieve method, and then sit back and wait until enough of those partial results had been found to produce a factorization of RSAâ129. (The quadratic sieve method they used did not require all of the smaller subfactorizations to be solved; just enough of them.) Some 600 volunteers, spread around the world, rose to the challenge. Over the next eight months, results came in at the rate of around 30,000 a day. By April 1994, with greater than 8 million individual results to work on, a powerful supercomputer was set the formidable task of looking for a combination of the small factorizations that would yield a factor of RSAâ129. It was a mammoth computation, but in the end it was successful. RSAâ129 was factored into two primes, one having 64 digits, the other 65. And with it, the original MIT message was decrypted. It read:
The magic words are squeamish ossifrage
. (This is a typical MIT inside joke. The ossifrage is a rare vulture having a wingspan of up to ten feet, whose name means “bone breaker.”)
DIGITAL SIGNATURES
Another security issue Whitfield and Hellman addressed in their 1976 paper was: How can a receiver of an electronic document be sure that it actually came from the source it claimed to be from? In the case of written documents, we generally rely on a signature. Public key cryptosystems provide a means for creating an electronic analog of a signatureâa digital signature, as it were. The idea is straightforward: You use the public key encryption system in reverse. If Alice wants to send Bob an electronically signed document, she encrypts it using her secret decryption key. When Bob receives the document, he uses Alice's public encryption key to decrypt the message. This will result in gibberish unless the message was encrypted using Alice's decryption key. Since only Alice knows that key, if the result is a readable document, Bob can be sure that it came from Alice.
In fact, a digital signature is a more secure form of authentication than a regular signature. Someone could always copy (either by hand or electronically) your signature from one document to another, but a digital signature is tied to the document itself. The idea of digital signatures is also used to provide digital certificates, verifications provided by a particular website that it is indeed the site it purports to be.