Primer on Public Key Encryption


[A Web-only sidebar to "Homeland Insecurity," (September, 2002 Atlantic Monthly) by Charles C. Mann]

Public-key encryption, as noted in the profile of cryptographer Bruce Schneier, is complicated in detail but simple in outline. The article below is an outline of the principles of the most common variant of public-key cryptography, which is known as RSA, after the initials of its three inventors; a mathematically detailed explanation of RSA by the programmer Brian Raiter, understandable to anyone willing to spend a little time with paper and pencil, is available here.

A few terms first: cryptology, the study of codes and ciphers, is the union of cryptography (codemaking) and cryptanalysis (codebreaking). Codes and ciphers have been used for years and are easy to hide in all types of texts; they are most commonly seen in spy movies. To cryptologists, codes and ciphers are not the same thing. Codes are lists of prearranged substitutes for letters, words, or phrases: for example, "meet at the theater" for "fly to Chicago," or "stop and smell the flowers" for "pick me up at the mall." Ciphers employ mathematical procedures called algorithms to transform messages into unreadable jumbles. Most cryptographic algorithms use keys, which are mathematical values that plug into the algorithm. If the algorithm says to encipher a message by replacing each letter with its numerical equivalent (A = 1, B = 2, and so on) and then multiplying the results by some number X, X represents the key to the algorithm. If the key is 5, "attack," for example, turns into "5 100 100 5 15 55." With a key of 6, it becomes "6 120 120 6 18 66." (Nobody would actually use this cipher, though; all the resulting numbers are divisible by the key, which gives it away.) Cipher algorithms and cipher keys are like door locks and door keys. All the locks from a given company may work in the same way, but all the keys will be different.
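For readers who would rather see the toy multiplication cipher run than take it on faith, here is a minimal sketch in Python. The function name `encipher` is made up for this illustration, and the sketch assumes the message contains only the letters a through z.

```python
def encipher(message, key):
    """Toy cipher: replace each letter with its position in the alphabet
    (a = 1, b = 2, ...) and multiply the result by the key."""
    numbers = []
    for ch in message.lower():
        if "a" <= ch <= "z":          # skip anything that is not a plain letter
            position = ord(ch) - ord("a") + 1
            numbers.append(position * key)
    return numbers


print(encipher("attack", 5))  # [5, 100, 100, 5, 15, 55]
print(encipher("attack", 6))  # [6, 120, 120, 6, 18, 66]
```

The weakness mentioned above is visible in the output: every number in each list is a multiple of the key.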

Public-key cryptography is often said to be important because messages enciphered by it are "unbreakable"; that is, people can't randomly try out possible keys and break the cipher, even with powerful computers that try thousands of keys a second. (This assumes that the key has been properly chosen; even the best algorithm will be compromised if the key is something easily guessable.) In fact, though, many types of crypto algorithms are effectively unbreakable. What public-key does, and this is its significant innovation, is to simplify drastically the problem of controlling the keys.

In non-public-key crypto systems, controlling the keys is a constant source of trouble. Cryptographic textbooks usually illustrate the difficulty by referring to three mythical people named Alice, Bob, and Eve. In these examples, Alice spends her days sending secret messages to Bob; Eve, as her name indicates, tries to eavesdrop on those messages by obtaining the key. Because Eve might succeed at any time, the key must be changed frequently. In practice this cannot be easily accomplished. When Alice sends a new key to Bob, she must ensure that Eve doesn't read the message and thus learn the new key. The obvious way to prevent eavesdropping is to use the old key (the key that Alice wants to replace) to encrypt the message containing the new key (the key that Alice wants Bob to employ in the future). But Alice can't do this if there is a chance that Eve knows the old key. Alice could rely on a special backup key that she uses only to encrypt new keys, but presumably this key, too, would need to be changed. Problems multiply when Alice wants to send messages to other people. Obviously, Alice shouldn't use the key she uses to encrypt messages to Bob to communicate with other people—she doesn't want one compromised key to reveal everything. But managing the keys for a large group is an administrative horror; a hundred-user network needs 4,950 separate keys, all of which need regular changing. In the 1980s, Schneier says, U.S. Navy ships had to store so many keys to communicate with other vessels that the paper records were loaded aboard with forklifts.
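The figure of 4,950 keys follows from simple counting: every pair of users needs its own shared key, and n users form n x (n - 1) / 2 pairs. A short Python sketch, with the hypothetical helper name `pairwise_keys`, shows how quickly the total grows.

```python
from itertools import combinations


def pairwise_keys(users):
    """Number of separate symmetric keys needed so that every pair
    of users shares a key known to nobody else."""
    return users * (users - 1) // 2


# Cross-check the formula by listing every pair in a small network.
small_network = ["Alice", "Bob", "Carol", "Dave"]
assert pairwise_keys(len(small_network)) == len(list(combinations(small_network, 2)))

for users in (2, 10, 100, 1000):
    print(users, "users need", pairwise_keys(users), "keys")
```

For 100 users the loop prints 4950, matching the number in the text.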

Public-key encryption makes key management much easier. It was invented in 1976 by two Stanford mathematicians, Whitfield Diffie and Martin Hellman. Their discovery can be phrased simply: enciphering schemes should be asymmetric. For thousands of years all ciphers were symmetric: the key for encrypting a message was identical to the key for decrypting it, but used, so to speak, in reverse. To change "5 100 100 5 15 55" or "6 120 120 6 18 66" back into "attack," for instance, one simply reverses the encryption by dividing the numbers by the key, instead of multiplying them, and then replaces the numbers with their equivalent letters. Thus sender and receiver must both have the key, and must both keep it secret. The symmetry, Diffie and Hellman realized, is the origin of the key-management problem. The solution is to have an encrypting key that is different from the decrypting key: one key to encipher a message, and another, different key to decipher it. With an asymmetric cipher, Alice could send encrypted messages to Bob without providing him with a secret key. In fact, Alice could send him a secret message even if she had never before communicated with him in any way.
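The symmetry of the toy multiplication cipher is easy to see in code. The sketch below, in Python, undoes the encryption with the very same key; the function name `decipher` is invented for this illustration, and the input is assumed to be a list of numbers produced by that toy cipher.

```python
def decipher(numbers, key):
    """Reverse the toy cipher: divide each number by the key, then turn
    the resulting alphabet position (1 = a, 2 = b, ...) back into a letter."""
    letters = []
    for n in numbers:
        position = n // key
        letters.append(chr(position + ord("a") - 1))
    return "".join(letters)


print(decipher([5, 100, 100, 5, 15, 55], 5))   # attack
print(decipher([6, 120, 120, 6, 18, 66], 6))   # attack
```

Anyone who holds the key used to encrypt can decrypt, which is exactly why both parties must keep it secret.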

"If this sounds ridiculous, it should," Schneier wrote in Secrets and Lies (2001). "It sounds impossible. If you were to survey the world's cryptographers in 1975, they would all have told you it was impossible." One year later, Diffie and Hellman showed that it was possible, after all. (Later the British Secret Service revealed that it had invented these techniques before Diffie and Hellman, but kept them secret—and apparently did nothing with them.)

To be precise, Diffie and Hellman demonstrated only that public-key encryption was possible in theory. Another year passed before three MIT mathematicians, Ronald L. Rivest, Adi Shamir, and Leonard M. Adleman, figured out a way to do it in the real world. At the base of the Rivest-Shamir-Adleman, or RSA, encryption scheme is the mathematical task of factoring. Factoring a number means identifying the prime numbers which, when multiplied together, produce that number. Thus 126,356 can be factored into 2 x 2 x 31 x 1,019, where 2, 31, and 1,019 are all prime. (A given number has only one set of prime factors.) Surprisingly, mathematicians regard factoring numbers (part of the elementary-school curriculum) as a fantastically difficult task. Despite the efforts of such luminaries as Fermat, Gauss, and Fibonacci, nobody has ever discovered a consistent, usable method for factoring large numbers. Instead, mathematicians try potential factors by invoking complex rules of thumb, looking for numbers that divide evenly. For big numbers the process is horribly time-consuming, even with fast computers. The largest number yet factored is 155 digits long. It took 292 computers, most of them fast workstations, more than seven months.
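The crudest way to factor, trying one divisor after another, can be sketched in a few lines of Python. This is not one of the sophisticated methods used on record-size numbers; it is plain trial division, and the name `prime_factors` is chosen only for this example. It handles 126,356 instantly but would be hopeless on a number hundreds of digits long.

```python
def prime_factors(n):
    """Factor n by trial division: try each candidate divisor in turn
    and divide it out as many times as it goes in evenly."""
    factors = []
    divisor = 2
    while divisor * divisor <= n:
        while n % divisor == 0:
            factors.append(divisor)
            n //= divisor
        divisor += 1
    if n > 1:                 # whatever is left over is itself prime
        factors.append(n)
    return factors


print(prime_factors(126356))  # [2, 2, 31, 1019]
```

The loop runs up to the square root of the number, so each extra pair of digits multiplies the work by roughly ten.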

Note something odd. It is easy to multiply primes together. But there is no easy way to take the product and reduce it back to its original primes. In crypto jargon, this is a "trapdoor"—a function that lets you go one way easily, but not the other. Such one-way functions, of which this is perhaps the simplest example, are at the bottom of all public-key encryption. They make asymmetric ciphers possible.

To use RSA encryption, Alice first secretly chooses two prime numbers, p and q, each more than a hundred digits long. This is easier than it may sound: there is an infinite supply of prime numbers. Last year a Canadian college student found the biggest known prime: 2^13466917 - 1. It has 4,053,946 digits; typed without commas in standard 12-point type, the number would be more than ten miles long. Fortunately Alice doesn't need one nearly that big. She runs a program that randomly selects two prime numbers for her and then she multiplies them together, producing p x q, a still bigger number that is, naturally, not prime. This is Alice's "public key." (In fact, creating the key is more complicated than I suggest here, but not wildly so.)
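A program of the kind Alice runs might look, in drastically scaled-down form, like the Python sketch below. Everything here is an illustrative assumption: the names `is_prime` and `random_prime` are invented, the primes are only 16 bits long rather than hundreds of digits, and primality is checked by slow trial division, where real software relies on much faster probabilistic tests.

```python
import random


def is_prime(n):
    """Trial-division primality check; fine for tiny numbers only."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    divisor = 3
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 2
    return True


def random_prime(bits, rng):
    """Pick random odd numbers of the given bit length until one is prime."""
    while True:
        candidate = rng.getrandbits(bits) | (1 << (bits - 1)) | 1
        if is_prime(candidate):
            return candidate


rng = random.SystemRandom()   # draws on the operating system's randomness
p = random_prime(16, rng)
q = random_prime(16, rng)
while q == p:                 # the two primes must be different
    q = random_prime(16, rng)

n = p * q                     # the product Alice publishes
print("secret primes:", p, q)
print("public product:", n)
```

Each run prints different numbers, since the primes are chosen at random.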

As the name suggests, public keys are not secret; indeed, the Alices of this world often post them on the Internet or attach them to the bottom of their e-mail. When Bob wants to send Alice a secret message, he first converts the text of the message into a number. Perhaps, as before, he transforms "attack" into "5 100 100 5 15 55." Then he obtains Alice's public key—that is, p x q—by looking it up on a Web site or copying it from her e-mail. (Note here that Bob does not use his key to send Alice a message, as in regular encryption. Instead, he uses Alice's key.) Having found Alice's public key, he plugs it into a special algorithm invented by Rivest, Shamir, and Adleman to encrypt the message.

At this point the three mathematicians' cleverness becomes evident. Bob knows the product p x q, because Alice has displayed it on her Web site. But he almost certainly does not know p and q themselves, because they are its only factors, and factoring large numbers is effectively impossible. Yet the algorithm is constructed in such a way that to decipher the message the recipient must know both p and q individually. Because only Alice knows p and q, Bob can send secret messages to Alice without ever having to swap keys. Anyone else who wants to read the message will somehow have to factor p x q. How hard is that? Even if a team of demented government agents spent a trillion dollars on custom computers that do nothing but try random numbers, the Sun would likely go nova before they succeeded. (Rivest, Shamir, and Adleman patented their algorithm and to market it created a company, RSA Data Security, in 1983.)
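The article does not spell out the algorithm itself, so the Python sketch below fills in the standard textbook form of RSA as an assumption: a public exponent e accompanies the product p x q, and a private exponent d is computed from p and q. The primes 61 and 53, the exponent 17, and the message 65 are small illustrative values; real keys are enormously larger and real systems add padding that is omitted here.

```python
# Alice's secret primes (toy values) and the product she publishes.
p, q = 61, 53
n = p * q                      # 3233

# Computing the private exponent requires p and q individually.
phi = (p - 1) * (q - 1)        # 3120
e = 17                         # public exponent, published alongside n
d = pow(e, -1, phi)            # modular inverse of e; needs Python 3.8+


def encrypt(message, e, n):
    """What Bob does, using only Alice's public values."""
    return pow(message, e, n)


def decrypt(ciphertext, d, n):
    """What Alice does, using the exponent derived from her secret primes."""
    return pow(ciphertext, d, n)


message = 65
ciphertext = encrypt(message, e, n)
recovered = decrypt(ciphertext, d, n)

print("private exponent:", d)        # 2753
print("ciphertext:", ciphertext)     # 2790
print("recovered:", recovered)       # 65
```

An eavesdropper who sees only n and e would have to factor 3,233 to find d. That is trivial at this size, which is why real primes run to hundreds of digits.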

In the real world, public-key encryption is practically never used to encrypt actual messages. The reason is that it requires so much computation: even on computers, public-key is very slow. According to a widely cited estimate by Schneier, public-key crypto is about a thousand times slower than conventional cryptography. As a result, public-key cryptography is more often used as a solution to the key-management problem, rather than as direct cryptography. People employ public-key to distribute regular, symmetric keys, which are then used to encrypt and decrypt actual messages. In other words, Alice and Bob send each other their public keys. Alice generates a symmetric key that she will use only for a short time (usually called, in the trade, a session key), encrypts it with Bob's public key, and sends it to Bob, who decrypts it with his private key. Now that Alice and Bob both have the session key, they can exchange messages. When Alice wants to begin a new round of messages, she creates another session key. Systems that use both symmetric and public-key cryptography are called hybrid, and almost every available public-key system, such as PGP, is a hybrid.
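The hybrid arrangement can be sketched in Python as follows. This is a teaching toy under loudly stated assumptions, not how PGP or any real product works: Bob's key uses the tiny textbook primes 61 and 53, and the symmetric cipher is an improvised XOR against a SHA-256 keystream (the hypothetical helper `keystream_xor`), chosen only because it needs nothing beyond the standard library.

```python
import hashlib
import secrets

# Bob's toy public-key pair: (n, e) is public, d stays private.
p, q = 61, 53
n = p * q
e = 17
d = pow(e, -1, (p - 1) * (q - 1))    # needs Python 3.8+


def keystream_xor(data, session_key):
    """Toy symmetric cipher: XOR the data with a keystream derived from
    the session key. Running it twice with the same key restores the data."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        block = hashlib.sha256(f"{session_key}:{counter}".encode()).digest()
        stream.extend(block)
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


# Alice: make a short-lived session key, wrap it with Bob's public key,
# and encrypt the actual message with the fast symmetric cipher.
session_key = secrets.randbelow(n - 2) + 2
wrapped_key = pow(session_key, e, n)
ciphertext = keystream_xor(b"attack at dawn", session_key)

# Bob: unwrap the session key with his private key, then decrypt.
unwrapped_key = pow(wrapped_key, d, n)
plaintext = keystream_xor(ciphertext, unwrapped_key)

print(plaintext)   # b'attack at dawn'
```

Only the short session key passes through the slow public-key step; the message itself, however long, goes through the symmetric cipher.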

Solving the key problem, one should note, didn't make encryption easy for novices; it made encryption easier for experts. In 1999 a Carnegie Mellon doctoral student named Alma Whitten asked twelve experienced computer users to send and receive five encrypted e-mail messages apiece with PGP. One couldn't manage it at all; three accidentally sent unencrypted messages; seven created them with the wrong key; two had so much difficulty with the other tasks that they never bothered to send out the public, encrypting half of their keys; two who received properly encrypted messages tried to decrypt their decryption key, rather than the messages. Whitten called her report, cowritten with J. D. Tygar of the University of California at Berkeley, "Why Johnny Can't Encrypt."

Indeed, as mentioned in the profile, Johnny not only can't encrypt, he doesn't encrypt. Fascinating as a mathematical exercise, public-key encryption has yet to make much difference in people's lives. (The Atlantic Monthly)