Pretty Good Privacy (PGP) and Digital Signatures

If you have ever sent a confidential email in plaintext (most likely you have), have you asked yourself whether it could be tampered with or read by someone in transit? If not, you should!

Any unencrypted email is like a postcard. It can be read in transit by anyone with the required skills or access: crackers (malicious hackers), corporations, or governments.

In 1991 Phil Zimmermann, a free speech activist and anti-nuclear pacifist, developed Pretty Good Privacy (PGP), the first software available to the general public that used RSA (a public-key cryptosystem, discussed later) for email encryption and signing. After a friend posted the program on the worldwide Usenet, Zimmermann became the target of a criminal investigation by the U.S. government for illegal weapon export, because encryption tools were then classified as weapons (the investigation was eventually closed without charges). Zimmermann later founded PGP Inc., whose technology was eventually acquired by Symantec Corporation.

In 1997 PGP Inc. submitted a standardization proposal to the Internet Engineering Task Force (IETF). The standard was called OpenPGP and was first defined in 1998 in the IETF document RFC 2440. A revised version of the OpenPGP standard is described in RFC 4880, published in 2007.

Nowadays there are many OpenPGP-compliant products. The most widespread is probably GnuPG (GNU Privacy Guard, or GPG for short), which has been developed by Werner Koch since 1999. GnuPG is free, open source, and available for several platforms. GnuPG itself is a command-line tool.

PGP provides digital signatures, encryption (and decryption, obviously; nobody would use software that only encrypts!), compression, and Radix-64 conversion.

In this article, we will explain encryption and digital signatures.

So what is encryption, how does it work, and how does it benefit us?

Encryption (Confidentiality)

Encryption is the process of converting information into ciphertext, an unreadable form. A very simple example of encrypting text is:

Hello this is Knownymous and this is a ciphertext.

Uryyb guvf vf Xabjalzbhf naq guvf vf n pvcuregrkg.

If you read it carefully, you will notice that every letter of the English alphabet is replaced by the letter 13 positions after it, wrapping around from Z back to A. So 13 is the key here, and it is needed to decrypt the text. This kind of substitution is known as a Caesar cipher (yes, the method is named after Julius Caesar); the variant with a shift of 13 is commonly called ROT13.
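The shift described above can be sketched in a few lines of Python. This is only an illustration: the function name `caesar_shift` is made up for this example, and only the 26 unaccented English letters are shifted.

```python
import string

def caesar_shift(text: str, key: int) -> str:
    """Shift each English letter forward by `key` positions, wrapping after z."""
    result = []
    for ch in text:
        if ch in string.ascii_lowercase:
            result.append(chr((ord(ch) - ord("a") + key) % 26 + ord("a")))
        elif ch in string.ascii_uppercase:
            result.append(chr((ord(ch) - ord("A") + key) % 26 + ord("A")))
        else:
            result.append(ch)  # spaces and punctuation are left unchanged
    return "".join(result)

plaintext = "Hello this is Knownymous and this is a ciphertext."
ciphertext = caesar_shift(plaintext, 13)
print(ciphertext)                      # Uryyb guvf vf Xabjalzbhf naq guvf vf n pvcuregrkg.
print(caesar_shift(ciphertext, -13))   # shifting back by the same key decrypts
```

With only 25 useful keys, such a cipher can be broken by simply trying every shift, which is why it serves here as an illustration and nothing more.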

Since then, many more cryptographic techniques have been developed, such as the Diffie–Hellman key exchange (DH) and RSA.

Encryption techniques fall into two categories:

1. Symmetric-key algorithm

In a symmetric-key algorithm, the plaintext is encrypted using a key, and the same key is then used to decrypt it. It is a lock-and-key mechanism: a single key (or an exact copy of it) is used both to lock (encrypt) and to open (decrypt) the locker (ciphertext). The drawback is that both the sender (call it S) and the receiver (call it R) need the key. So if S wants to send a ciphertext, S must also get the key to R so that R can decrypt it. S can hand it over physically (not feasible over long distances) or send the key along with the message (not a good idea: would you keep a lock and its key together?).

Besides the difficulty of transferring the key, symmetric-key encryption has another problem: the user has to share a different key with each correspondent and keep track of every one of those keys for future communication with each of them.
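To make the "same key locks and opens" idea concrete, here is a toy symmetric cipher in Python. It is a sketch for illustration only and is not secure: the keystream construction (SHA-256 over the key plus a counter) and the names `keystream` and `xor_cipher` are assumptions made for this example, not how PGP encrypts data.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    """Stretch the shared key into `length` pseudo-random bytes (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def xor_cipher(key: bytes, data: bytes) -> bytes:
    """XOR the data with the keystream; applying it twice restores the input."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

shared_key = b"key known to both S and R"
message = b"meet me at noon"

ciphertext = xor_cipher(shared_key, message)       # S locks the message
recovered = xor_cipher(shared_key, ciphertext)     # R opens it with the same key
wrong = xor_cipher(b"some other key", ciphertext)  # a different key yields garbage

print(recovered == message, wrong == message)
```

The one function serves for both directions, which is exactly the property that forces S and R to share the key in advance.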

So what are the use cases of this algorithm?

Although symmetric encryption is the older method, it is faster and more efficient than asymmetric encryption (which we will come to in a while); asymmetric encryption is CPU-intensive and performs poorly on large amounts of data. Because of this speed advantage, symmetric cryptography is typically used for bulk encryption, that is, for encrypting large amounts of data such as a database. In the case of a database, the secret key might only need to be available to the database itself, so there is no key to transfer and symmetric encryption is a good option.

Symmetric ciphers are also commonly used to build cryptographic primitives other than encryption. Encrypting a message does not guarantee that the message is not changed while encrypted. Hence a message authentication code (MAC) is often added to a ciphertext to ensure that changes to the ciphertext will be noticed by the receiver. Message authentication codes can be constructed from symmetric ciphers.
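As a rough sketch of how a message authentication code detects changes, the Python standard library offers HMAC. Note the substitution: HMAC is built from a hash function rather than from a symmetric cipher, but it relies on a shared secret key in the same way. The helper names `make_tag` and `is_authentic` and the sample bytes are made up for this example.

```python
import hashlib
import hmac

def make_tag(key: bytes, data: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over the data with the shared secret key."""
    return hmac.new(key, data, hashlib.sha256).digest()

def is_authentic(key: bytes, data: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(make_tag(key, data), tag)

shared_key = b"secret shared by sender and receiver"
ciphertext = b"\x8f\x12 pretend these bytes are an encrypted message"

tag = make_tag(shared_key, ciphertext)           # the sender attaches this tag
tampered = ciphertext.replace(b"pretend", b"PRETEND")

print(is_authentic(shared_key, ciphertext, tag))  # True
print(is_authentic(shared_key, tampered, tag))    # False
```

Anyone who does not hold the shared key cannot produce a matching tag for a modified ciphertext, so the receiver notices the change.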

Symmetric-key encryption was employed extensively by Nazi Germany during World War II, in all branches of the German military, using a special encryption device: the Enigma machine.

To overcome the drawbacks of symmetric-key encryption, we have another kind of encryption:

2. Asymmetric cryptography or Public-key encryption

Public key cryptography was first discovered by James Ellis, Clifford Cocks, and Malcolm Williamson of the British GCHQ (Government Communications Headquarters) in the early 1970s, but the discovery was classified and was not made public until 1997. In 1976 researchers Whitfield Diffie, Martin Hellman, and Ralph Merkle independently made the same discovery and published it. Then in 1977 Ronald Rivest, Adi Shamir, and Leonard Adleman provided the first practical implementation of a public key cryptography algorithm by developing the RSA cipher.

It uses two keys: one for encryption (the public key) and the other for decryption (the private key).

Here is a picture to explain this:

[Figure: Symmetric-key algorithm]

Here A is the sender and B, C, and D are receivers. As you can see, A communicates with B using key 1, with C using key 2, and with D using key 3, and must also keep a record of which key belongs to which lock. As the number of receivers increases, so does the number of keys, and the harder it becomes to keep track of them.

With asymmetric keys, A needs only two keys. Let’s understand this with the figure below:
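The bookkeeping problem can be put in numbers with a small Python sketch. The function names are hypothetical; the second function extends the picture to the case where every member of a group wants to talk privately to every other member, which the figure does not show.

```python
def keys_for_one_sender(receivers: int) -> int:
    """One sender needs a separate shared key for each receiver."""
    return receivers

def keys_for_whole_group(members: int) -> int:
    """If every pair in a group needs its own key: n * (n - 1) / 2 keys in total."""
    return members * (members - 1) // 2

print(keys_for_one_sender(3))      # A talking to B, C, D: 3 keys
print(keys_for_whole_group(4))     # A, B, C, D all pairwise: 6 keys
print(keys_for_whole_group(100))   # 100 people: 4950 keys
```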

[Figure: Asymmetric key algorithm (here B, C, D are senders and A is a receiver)]

Here A generates two keys (1 and 2 here). A keeps one of the keys secret (here key 1, which is inside the locker) and distributes the other key publicly to B, C, and D (there is no need to hide it; better still, spread it) on the web, via social media or any online platform (there are public key servers for this purpose).

Now B, C, and D have A’s key 2. Each of them puts their message inside a locker, locks (encrypts) it using key 2, and sends it to A. A then uses key 1 to open it. Key 1 itself is kept inside a locker that can be opened only with a passphrase or password set by A.

Key 1 is called the private key and is used to decrypt the message; key 2 is the public key and is used for encryption.

So if you want to send an encrypted email, you need the receiver’s public key. Nobody other than the receiver can read the message as long as the private key is not compromised.
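The two-key idea can be illustrated with textbook RSA on deliberately tiny numbers. This is a sketch under strong assumptions: the primes 61 and 53 and the exponent 17 are chosen only for readability, there is no padding, and the message is a small integer rather than an email. Real keys are thousands of bits long. The function names `encrypt` and `decrypt` are made up for this example.

```python
# A generates the key pair (toy values, far too small for real use).
p, q = 61, 53
n = p * q                    # 3233, the modulus shared by both keys
phi = (p - 1) * (q - 1)      # 3120
e = 17                       # public exponent
d = pow(e, -1, phi)          # private exponent: modular inverse (Python 3.8+)

public_key = (n, e)          # key 2: published for B, C, D
private_key = (n, d)         # key 1: kept secret by A

def encrypt(message: int, public_key: tuple) -> int:
    """Anyone holding the public key can lock a message (an integer below n)."""
    n, e = public_key
    return pow(message, e, n)

def decrypt(ciphertext: int, private_key: tuple) -> int:
    """Only the holder of the private key can open it again."""
    n, d = private_key
    return pow(ciphertext, d, n)

message = 65
ciphertext = encrypt(message, public_key)
print(ciphertext != message, decrypt(ciphertext, private_key) == message)
```

Knowing `n` and `e` is not enough to decrypt; an attacker would have to recover `d`, which for realistically sized keys requires factoring `n`.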

Digital Signature (Authentication)

Public key cryptography is used not only for confidentiality (i.e. to protect the message so that it can be read only by the intended recipient), but also for authentication (i.e. to verify that the message comes from the claimed sender) and integrity (i.e. to ensure that the message has not been altered in transit). Authentication and integrity are provided by appending a digital signature to the message.

A digital signature is a string of bits generated by an algorithm that uses a hash function in conjunction with a key. A hash function takes a message of any length as input and outputs a short, fixed-length string called a digest, which is a distillate of the input message. Notable properties of a good hash function are that it should be practically impossible to derive the input from the output, and that changing just one bit of the input results in a completely different output.
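Both properties are easy to observe with Python’s standard `hashlib` module. SHA-256 is used here purely as an example hash function, and the helper name `digest` is made up for this sketch.

```python
import hashlib

def digest(message: str) -> str:
    """Return the SHA-256 digest of the message as a hexadecimal string."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest()

short = digest("hi")
long = digest("hi" * 10_000)   # a 20,000-character input
changed = digest("hj")         # one character different from "hi"

print(len(short), len(long))   # both digests have the same length: 64 hex characters
print(short)
print(changed)                 # looks unrelated to the digest of "hi"
```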

A digital signature uses a hash (message digest) algorithm and a public-key signature algorithm in the following sequence:

The sender creates a message.
The sending software generates a hash code of the message.
The sending software generates a signature from the hash code using the sender’s private key.
The binary signature is attached to the message.
The receiving software keeps a copy of the message’s signature.
The receiving software generates a new hash code for the received message and verifies it against the message’s signature using the sender’s public key. If the verification is successful, the message is accepted as authentic.
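The sequence above can be sketched with a hash function and textbook RSA. Everything here is a simplifying assumption for illustration: the key is built from two known Mersenne primes (much too small and too special for real use), the digest is reduced modulo `N` so that it fits the toy key, and no padding scheme is applied. The names `hash_code`, `sign`, and `verify` are hypothetical.

```python
import hashlib

# Toy key pair of the sender, built from two Mersenne primes.
P = 2**107 - 1
Q = 2**127 - 1
N = P * Q
E = 65537                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (Python 3.8+)

def hash_code(message: bytes) -> int:
    """SHA-256 digest as an integer, reduced to fit the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message: bytes) -> int:
    """Sender: transform the hash code with the private key."""
    return pow(hash_code(message), D, N)

def verify(message: bytes, signature: int) -> bool:
    """Receiver: undo the signature with the public key and compare hash codes."""
    return pow(signature, E, N) == hash_code(message)

message = b"Pay Bob 10 euros"
signature = sign(message)

print(verify(message, signature))                 # True
print(verify(b"Pay Bob 90 euros", signature))     # False: the message was altered
```

Only `N` and `E` are needed for `verify`, so anyone can check the signature, while producing one requires `D`.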

How digital signature works

The sender creates the hash of the plaintext message to be sent. A hashing function is an algorithm that converts a string of characters into a fixed-length, usually shorter value called a hash or message digest (SHA-1, for example, produces a 160-bit digest, written as 40 hexadecimal characters). This digest is then encrypted using the sender’s private key (remember that for confidentiality it was the receiver’s public key that was used for encryption). The encrypted digest, together with other information such as the hashing algorithm used, makes up the digital signature, which is sent along with the original plaintext message.

Hashing is usually much faster than public-key encryption, which is why we first hash the message and then encrypt only the short hash.

[Figure: Hashing]

The receiver then decrypts the signature using the sender’s public key, which yields the hash computed by the sender. The receiver also computes the hash of the message actually received. If the two hashes are the same, the digital signature is verified.

If the message is altered in transit (even by a single character), the hash value will be completely different and verification of the digital signature will fail.

A digital signature uses the private key for encryption and the public key for decryption (the opposite of the encryption case) because the private key is held only by the original sender (who also created the public key), and nobody else can obtain it (that is why it is private). A successful verification therefore shows that the message really came from the holder of that private key.

Now there’s a question: if we send the plaintext along with the signed digest, anyone can read the message, so how is confidentiality maintained?

As discussed earlier, the digital signature only protects the integrity (the message was not tampered with) and the authenticity (it comes from the claimed source) of the message. We use encryption for confidentiality. When the two are used together, the sender signs the message with their own private key and encrypts it with the receiver’s public key. The receiver first decrypts the message (using the receiver’s own private key), which gives the plaintext message. This plaintext is then checked against the digital signature for authentication, as discussed above.
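A compact sketch of signing and encrypting together, again with textbook RSA on toy keys. The assumptions are many and deliberate: both key pairs are built from small Mersenne primes, the whole message is encrypted as a single integer (so it must be very short), the signature travels unencrypted next to the ciphertext, and there is no padding. Real OpenPGP messages differ in many details. All function names are hypothetical.

```python
import hashlib

E = 65537  # public exponent used by both toy key pairs

def make_keypair(p: int, q: int):
    """Return (public_key, private_key) for textbook RSA with primes p and q."""
    n = p * q
    d = pow(E, -1, (p - 1) * (q - 1))   # modular inverse, Python 3.8+
    return (n, E), (n, d)

sender_public, sender_private = make_keypair(2**107 - 1, 2**127 - 1)
receiver_public, receiver_private = make_keypair(2**61 - 1, 2**89 - 1)

def hash_code(message: bytes, n: int) -> int:
    """SHA-256 digest as an integer, reduced to fit the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n

def send(message: bytes):
    """Sender: sign with own private key, encrypt with receiver's public key."""
    n_s, d_s = sender_private
    signature = pow(hash_code(message, n_s), d_s, n_s)
    n_r, e_r = receiver_public
    m = int.from_bytes(message, "big")
    if m >= n_r:
        raise ValueError("message too long for this toy key")
    return pow(m, e_r, n_r), signature, len(message)

def receive(ciphertext: int, signature: int, length: int):
    """Receiver: decrypt with own private key, then verify with sender's public key."""
    n_r, d_r = receiver_private
    message = pow(ciphertext, d_r, n_r).to_bytes(length, "big")
    n_s, e_s = sender_public
    authentic = pow(signature, e_s, n_s) == hash_code(message, n_s)
    return message, authentic

ciphertext, signature, length = send(b"meet at noon")
print(receive(ciphertext, signature, length))       # (b'meet at noon', True)
print(receive(ciphertext, signature + 1, length))   # (b'meet at noon', False)
```

Two different key pairs are involved: the receiver’s pair provides confidentiality, and the sender’s pair provides authenticity and integrity.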

References:

RFC 4880 – OpenPGP Message Format

Enigmail – Introduction to cryptography