Cryptography: A Primer for the Non-Technical–Part II
If you read last week’s post you should have a baseline understanding of the science of cryptography. You should know that the basics of cryptography rely on the input of a secret key along with a message, both of which are fed into a mathematical equation that applies substitution and transposition to encrypt and/or decrypt a message. If you don’t, please read last week’s post before continuing.
With an overview of cryptography covered we can move on to some of the basics, and once the basics are spelled out then we can finally cover how cryptography is applied in the real world. My hope is that by the end of this post, you will be able to understand the options and opportunities available to technical staff to help carry out mandates of information protection.
Types of Encryption
There are three basic ways in which cryptographic techniques are applied: symmetric, asymmetric and hashing.
Symmetric encryption is the easiest to explain. In a symmetric encryption configuration both the sender and the recipient have the same secret key. The sender uses the key to encrypt the message and the recipient uses the same key to decrypt it. Anyone trying to intercept the message can’t open it without the key. Symmetric encryption is a simple and efficient method for transmitting data securely, but it does have its limitations.
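To make the idea concrete, here is a minimal sketch of the same-key-at-both-ends arrangement. The cipher is a toy built for this illustration (it stretches the key with SHA-256 and XORs the result with the message) and the names `keystream`, `xor_cipher` and `shared_key` are made up for the example. Real systems use a vetted algorithm such as AES rather than anything like this.

```python
import hashlib
import secrets

def keystream(key: bytes, length: int) -> bytes:
    """Stretch the key into `length` pseudo-random bytes (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt: XOR-ing twice with the same keystream undoes itself."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

# Both the sender and the recipient must hold this same secret key.
shared_key = secrets.token_bytes(32)

ciphertext = xor_cipher(shared_key, b"Meet me at noon")  # sender encrypts
plaintext = xor_cipher(shared_key, ciphertext)           # recipient decrypts
```

The one function both encrypts and decrypts, which is exactly what "symmetric" means here: whoever holds the key can do either.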
The major drawback with symmetric encryption is key management. Firstly, how can you securely send the encryption key to the recipient? Via email? Through regular mail on a USB drive? Neither of those methods is very secure, and if you’re willing to send the key in an insecure manner why bother to encrypt your message at all?
The other key management issue with symmetric encryption is keeping track of a potentially large number of keys. If you need to send encrypted communications to one other person you only require one key between you. If a group of three are communicating securely with each other you need three keys, one for each pair of people. If there are 50 people communicating amongst themselves you need 1,225 keys (50 × 49 ÷ 2). You can see how that can get difficult to manage in a hurry.
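The growth is easy to check with a few lines of arithmetic. This sketch assumes one shared key per pair of people, which gives n × (n − 1) ÷ 2 keys for a group of n; the function name `pairwise_keys` is made up for the illustration.

```python
def pairwise_keys(people: int) -> int:
    """Number of distinct shared keys if every pair of people has its own key."""
    return people * (people - 1) // 2

for group_size in (2, 3, 50, 500):
    print(group_size, "people need", pairwise_keys(group_size), "keys")
```

Making the group ten times bigger makes the number of keys roughly a hundred times bigger, which is why the scheme gets unmanageable quickly.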
To solve the key exchange and management difficulties of employing symmetric encryption, cryptographers utilize asymmetric encryption. Asymmetric encryption is mind-blowingly cool. In an asymmetric encryption configuration the sender and the recipient have different keys. Each key can either decrypt a message sent by the key of the matching pair or encrypt a message that can only be read by the matching key. One key encrypts the message but can’t decrypt it, while the other of the pair decrypts the message but can’t encrypt it. This feature of having a pair of keys that each works one way but not the other on a particular message can be used in a number of applications.
Typically the owner of the asymmetric keys will make one key public (available to anyone) and keep the other key private. Using the public key anyone can send the key owner an encrypted message that only the owner can read. Or the owner can encrypt a message and anyone could use the public key to decrypt and read it. The second scenario can be confusing: Why would someone want to encrypt a message that anyone could read? Well, if you receive a message from me that can be decrypted with my public key, that provides some measure of assurance that the message came from me and not an impostor.
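Both directions can be shown with a toy key pair. The sketch below uses textbook RSA with two tiny primes, purely as an assumption for illustration: the numbers, the helper `apply_key` and the variable names are all made up, and keys this small offer no real protection. Real key pairs are generated by a cryptographic library.

```python
# Toy key pair from two tiny primes (illustrative values only).
p, q = 61, 53
n = p * q                          # modulus, part of both keys
e = 17                             # public exponent
d = pow(e, -1, (p - 1) * (q - 1))  # private exponent (needs Python 3.8+)

public_key = (e, n)    # published for anyone to use
private_key = (d, n)   # kept secret by the owner

def apply_key(key, number):
    """Transform a number with one key; the matching key reverses it."""
    exponent, modulus = key
    return pow(number, exponent, modulus)

message = 65  # a message encoded as a number smaller than n

# Anyone can seal a message with the public key; only the owner can open it.
sealed = apply_key(public_key, message)
opened = apply_key(private_key, sealed)

# The owner can seal with the private key; anyone can open with the public key.
stamped = apply_key(private_key, message)
checked = apply_key(public_key, stamped)
```

In both cases the second step recovers the original number, but only when the matching key of the pair is used.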
You might be asking yourself why would anyone use symmetric algorithms when the asymmetric algorithms solve the big problems associated with it. The problem with the asymmetric algorithms is that they are slow, or rather they are much more computationally intensive. As a result they are generally used for small messages only.
The last type of encryption algorithm I’m going to discuss is the one-way hash. A one-way hash is a number calculated from the contents of a message which, for all practical purposes, uniquely identifies that message. The resulting number provides a fingerprint for the message. It can be used to ensure the integrity of a file or hide the original message while retaining the ability to affirm its contents.
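As a small illustration, the sketch below fingerprints two nearly identical messages with SHA-256 from Python's standard `hashlib` module. The choice of SHA-256, the function name `fingerprint` and the sample messages are assumptions made for the example.

```python
import hashlib

def fingerprint(message: bytes) -> str:
    """Return the SHA-256 hash of a message as 64 hexadecimal characters."""
    return hashlib.sha256(message).hexdigest()

original = fingerprint(b"Pay Alice $100")
altered = fingerprint(b"Pay Alice $900")

print(original)
print(altered)
print("Same message?", original == altered)
```

Changing a single character produces a completely different fingerprint, while hashing the same message again always gives the same result.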
Applying Symmetric, Asymmetric and Hash
If you’ve read this far, and I’ve done an adequate job explaining the foundations of cryptography, you should understand the difference between symmetric, asymmetric and hash algorithms. And if you understand that much we can get to work elaborating on how it applies to the privacy professional.
The privacy professional is generally concerned with the collection, use and storage of personally identifiable information. Each of those activities can be secured with the use of cryptography. I’m going to try to break it down into that context.
Transmitting information securely
When a data processor is collecting information from a data subject it is the data processor’s responsibility to ensure the communications channel through which that information is collected is secure. Generally this means that if a company is requesting sensitive information, such as a credit card number, they should be doing it with adequate security in mind.
On the Internet today that means utilizing Transport Layer Security (TLS), the successor to Secure Sockets Layer (SSL) and still commonly referred to by that older name. TLS is pretty simple for the end-user: If they see that the URL begins with “https” and their browser is indicating the connection is secure and valid (e.g. a green bar, a lock icon, etc.) then they have some assurance that the data they transmit will be encrypted during that transmission.
The actual implementation of TLS is fairly complex, however. The protocol uses both symmetric and asymmetric algorithms. Recall that we have already discussed the problem of keeping the key exchange in a symmetric setup secret. Also recall that asymmetric algorithms resolve that problem but are really slow. Those are the reasons why we use both.
For the TLS protocol the system uses an asymmetric algorithm for the key exchange, then the much faster symmetric algorithm for the rest of the communications. It’s really very clever. The user’s browser requests a secure connection to the server. The server responds with its public key and asks the user to create a session key, encrypt it with that public key and send it back to the server. The server then decrypts the session key with its private key, and now both the server and user’s browser can use that session key.
With the session key (a.k.a. secret key) available at both ends, the server finally responds with the requested page encrypted with a symmetric algorithm and the session key. The user can respond the same way: Encrypt with the session key and send a response back to the server. In this manner communications are kept secret from prying eyes.
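The exchange described above can be sketched end to end in a few lines. Everything here is a simplified stand-in: the server's key pair is textbook RSA built from two small well-known primes, the symmetric step is a toy XOR cipher, and names such as `symmetric`, `session_key` and `sealed_key` are made up for the illustration. Real TLS libraries use far stronger algorithms and extra authentication steps.

```python
import hashlib
import secrets

# Server's toy asymmetric key pair (far too small for real use).
p, q = 2**61 - 1, 2**31 - 1
n, e = p * q, 65537                # public key: (e, n)
d = pow(e, -1, (p - 1) * (q - 1))  # private key exponent (Python 3.8+)

def symmetric(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: the same call encrypts and decrypts."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

# 1. The browser creates a session key and encrypts it with the public key.
session_key = secrets.token_bytes(8)
sealed_key = pow(int.from_bytes(session_key, "big"), e, n)

# 2. The server decrypts the session key with its private key.
server_key = pow(sealed_key, d, n).to_bytes(8, "big")

# 3. Both ends now share the key and switch to the fast symmetric cipher.
page = symmetric(server_key, b"<html>Your account</html>")  # server sends
shown = symmetric(session_key, page)                        # browser reads
```

The slow asymmetric step happens once, on a few bytes of key material. Everything after that uses the fast symmetric cipher.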
Note: I’ve simplified things a bit in the negotiation of a TLS exchange by intentionally skipping over some authentication steps because I don’t feel they are necessary for the privacy professional to understand, so keep in mind there is a bit more to it than what I have covered.
This basic scheme using an asymmetric algorithm for key creation and key exchange, then a separate symmetric algorithm for all other communication is used in other transmission forms as well. This is the basic setup for a corporate VPN and encrypted mobile communications as well as many methods for authenticating logins.
Storing information securely
It’s worth stating, even if this post isn’t about best practices, that you should only store the personally identifiable data you need. Utilize transient storage methods when possible. Generally, however, we have the need for long-term storage of personal data and therefore the ability to encrypt data when you store it has obvious benefits. There are two distinct methods I’ll cover that take advantage of encryption to protect data in storage.
The first encryption method is the most common. We can use a symmetric algorithm to encrypt the data, then store it. This ensures that anyone who doesn’t have the key but can somehow retrieve the data can’t read it. This is simple, efficient and certainly a popular use.
The next method utilizes the one-way hash. This is the preferred method for storing passwords, but should be considered for other purposes as well. As I mentioned earlier what is unique about hashing a message is that for all practical purposes it can’t be decrypted. And in that fact lies the power of the hash. If you can’t recover the original text then you have successfully employed a technical control that prohibits anyone from retrieving the original text with the data you’ve stored.
This is why it is perfect for passwords. It should also be considered for any data the user provides that they may need to affirm the contents of, but that you have no reason to ever read. If there is no reason for a company to know the contents of a message, such as a user’s password, then a hash can be considered. A hash is a fantastic means of enforcing a policy via technical controls. Regardless of the authorization level of an employee, if the data is hashed it can’t be misread or misused.
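Here is a minimal sketch of storing and checking a password this way, using only Python's standard library. It goes slightly beyond what is described above by adding a random salt and a deliberately slow hash (PBKDF2), which is common practice for passwords. The iteration count and the function names `hash_password` and `check_password` are assumptions for the example, not recommendations.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 200_000  # assumed work factor; tune to your own hardware

def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest); only these are stored, never the password."""
    salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest

def check_password(password: str, salt: bytes, stored: bytes) -> bool:
    """Re-hash the attempt with the stored salt and compare the results."""
    attempt = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return hmac.compare_digest(attempt, stored)

salt, stored = hash_password("correct horse battery staple")
print(check_password("correct horse battery staple", salt, stored))
print(check_password("wrong guess", salt, stored))
```

The user can still affirm that they know the password, but nobody reading the stored values, employee or intruder, can read the password back out of them.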
A hash can also be used to ensure the integrity of a message. If you have a hash of a given message, then you can be assured that the contents of a new message are exactly the same if the hash is the same.
Authentication
There are three ways you can authenticate a user. You can authenticate with something they know (a password), something they are (a fingerprint) or something they have. Encryption participates in managing all three techniques but it’s that last one we haven’t covered yet.
There are a number of ways cryptography can authenticate, ranging from encrypted keys stored on fobs that unlock doors to smart cards, but for the purposes of this post I’m going to only cover digital signatures. When a user digitally signs a document what they are doing is encrypting a message with their private key.
This allows anyone with the public key of the pair to open it, but since the only way to encrypt that message for that public key is with the single private key, the recipient can be assured of the message’s origin. Simple and effective as long as you can be sure the private key isn’t compromised.
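A signature can be sketched with the same kind of toy key pair. In practice the private key is usually applied to a hash of the document rather than to the document itself, and the sketch below follows that approach. The key values, the helper names `sign` and `verify`, and the sample text are all assumptions for illustration. Real signatures use much larger keys and a padding scheme supplied by a cryptographic library.

```python
import hashlib

# Signer's toy key pair (far too small for real use).
p, q = 2**61 - 1, 2**31 - 1
n, e = p * q, 65537                # public key: (e, n)
d = pow(e, -1, (p - 1) * (q - 1))  # private key exponent (Python 3.8+)

def digest_as_number(document: bytes) -> int:
    """Hash the document and squeeze the result into the key's range."""
    return int.from_bytes(hashlib.sha256(document).digest(), "big") % n

def sign(document: bytes) -> int:
    """Only the private key holder can produce this value."""
    return pow(digest_as_number(document), d, n)

def verify(document: bytes, signature: int) -> bool:
    """Anyone with the public key can check the signature."""
    return pow(signature, e, n) == digest_as_number(document)

contract = b"I agree to the terms."
signature = sign(contract)

print(verify(contract, signature))                  # genuine document
print(verify(b"I agree to NO terms.", signature))   # altered document
```

The check only succeeds for the exact document that was signed, so a valid signature vouches for both who signed it and what it said.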
And that leads to the end of the post and the most important lesson among all that I’ve discussed: Cryptography is only as good as the key management. If a secret key is compromised or improperly created the cryptography is not quite worthless but not far from it. It is important to keep that in mind, as it is often assumed that if something is encrypted it is safe. You cannot take that for granted.
Having read and understood all that I’ve covered you are now an expert in cryptography, right? Of course not. The intent of this article was to provide background and a base understanding; it is not even a basic how-to article. But please do use this primer to help in planning and to facilitate communications between departments. At the very least I hope this will help you better understand options for using cryptography to safely collect, store and use personal information.