The mantra “don’t roll your own crypto” is widely known and accepted amongst programmers, but what does it actually mean? It turns out that such a simple statement is not so simple to follow.
What many people take away from “don’t roll your own crypto” is that they shouldn’t create their own crypto algorithms. This makes sense. After all, most people wouldn’t even know where to start. So, instead of making up an algorithm when they need to encrypt data, an engineer might take on OpenSSL or BouncyCastle as a dependency and pat themselves on the back for using a well-established scheme. What they might not realize is that inventing an algorithm is only the first in a series of traps, each of which can be catastrophic for the security of the resulting system.
Algorithm selection, algorithm use, and protocol creation are all potential pitfalls that await once you’ve decided not to create your own algorithm. We’ll briefly explain examples of each of these.
There are a massive number of existing algorithms that do a wide range of things. The encryption space, for example, can first be broken down into symmetric and asymmetric. Each of those categories has a number of usable algorithms, and many of those algorithms can in turn be run in several different modes.
Let’s look at the range of choices someone might face if they want to do symmetric encryption.
AES is a good algorithm, but at this point, you’ve likely seen the Wikipedia image explaining why you shouldn’t use the ECB mode. That’s great! You could pick CBC or CTR, but those aren’t authenticated. The data you send with those modes would be confidential, but an attacker might be able to alter it in transit, or pose as you and send different messages, without the receiver noticing. If you need authentication, you could look to GCM or OCB, but OCB has a history of patents, so there are fewer vetted implementations. That leaves you with GCM, which can be problematic for reasons we’ll discuss in the next section.
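To make the missing-authentication problem concrete, here is a minimal sketch using only the Python standard library. It is not AES: the `keystream` function is a toy CTR-style stand-in built from HMAC-SHA256, and every name in it (`encrypt`, `tamper`, `seal`, `open_sealed`) is hypothetical. It shows an attacker who knows a message's contents rewriting it without the key, and then the same tampering being rejected once an HMAC tag is checked (encrypt-then-MAC).

```python
import hashlib
import hmac
import secrets


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy CTR-style keystream (HMAC-SHA256 of nonce || counter).

    An illustrative stand-in for AES-CTR, not a vetted cipher.
    """
    out = b""
    counter = 0
    while len(out) < length:
        block_input = nonce + counter.to_bytes(8, "big")
        out += hmac.new(key, block_input, hashlib.sha256).digest()
        counter += 1
    return out[:length]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Unauthenticated stream encryption; decryption is the same operation."""
    return xor(data, keystream(key, nonce, len(data)))


def tamper(ciphertext: bytes, known: bytes, desired: bytes) -> bytes:
    """What an attacker WITHOUT the key can do if they know the plaintext."""
    return xor(ciphertext, xor(known, desired))


def seal(enc_key: bytes, mac_key: bytes, nonce: bytes, data: bytes):
    """Encrypt, then compute an HMAC tag over nonce || ciphertext."""
    ciphertext = encrypt(enc_key, nonce, data)
    tag = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    return ciphertext, tag


def open_sealed(enc_key, mac_key, nonce, ciphertext, tag) -> bytes:
    """Verify the tag before decrypting; reject anything that was modified."""
    expected = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    return encrypt(enc_key, nonce, ciphertext)


enc_key, mac_key = secrets.token_bytes(32), secrets.token_bytes(32)
nonce = secrets.token_bytes(12)
original = b"PAY 0100 TO BOB"

# Without authentication, the forged ciphertext decrypts cleanly.
forged = tamper(encrypt(enc_key, nonce, original), original, b"PAY 9999 TO EVE")
print(encrypt(enc_key, nonce, forged))  # b'PAY 9999 TO EVE'

# With a tag, the same modification is detected.
ciphertext, tag = seal(enc_key, mac_key, nonce, original)
try:
    open_sealed(enc_key, mac_key, nonce,
                tamper(ciphertext, original, b"PAY 9999 TO EVE"), tag)
except ValueError as err:
    print("rejected:", err)
```

In practice you would not assemble this yourself. An authenticated mode from a vetted library does both steps for you.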
Whew.
Even worse, it’s hard to tell if you’ve made a bad choice! If you just watched a few blocks of ECB on the wire, they’d probably look random enough. If you encrypted a JPEG instead of a bitmap image, you’d likely not even be able to see the pattern, but the risk would remain. You would face a similar challenge if you picked a mode that does not provide authentication when your security requirements depend on that property. No number of test cases or amount of system observation would alert you to the mistake.
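A small sketch of why observation is unreliable here. The `toy_ecb` function below is a hypothetical stand-in, not AES-ECB: it maps each 16-byte block independently and deterministically using a truncated HMAC, which is not even invertible. It models only the one property that matters for this point, namely that equal plaintext blocks produce equal output blocks. The `repeated_blocks` helper is likewise an illustrative heuristic, not a standard tool.

```python
import hashlib
import hmac

BLOCK = 16


def toy_ecb(key: bytes, plaintext: bytes) -> bytes:
    """Model of ECB's weakness: every block is transformed on its own.

    Assumption: truncated HMAC-SHA256 stands in for a block cipher here.
    """
    if len(plaintext) % BLOCK:
        raise ValueError("plaintext must be a multiple of the block size")
    blocks = (plaintext[i:i + BLOCK] for i in range(0, len(plaintext), BLOCK))
    return b"".join(
        hmac.new(key, block, hashlib.sha256).digest()[:BLOCK] for block in blocks
    )


def repeated_blocks(data: bytes, block_size: int = BLOCK) -> int:
    """Count blocks that duplicate an earlier block in the same message."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    return len(blocks) - len(set(blocks))


key = b"k" * 32

# Highly structured input, like large flat regions in a bitmap.
structured = b"A" * 64 + b"B" * 32
print(repeated_blocks(toy_ecb(key, structured)))  # 4

# Input with no repeated blocks, like compressed image data.
varied = bytes(range(96))
print(repeated_blocks(toy_ecb(key, varied)))  # 0
```

The second result is the uncomfortable one: the check finds nothing, yet the mode is exactly as weak as before. The leak simply waits for inputs that happen to repeat.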
A number of algorithms are secure if used properly but can be disastrous if used incorrectly. AES-GCM, for example, requires a nonce (a single-use value) as an additional input. A single nonce reuse under the same key can reveal the XOR of the affected plaintexts and leak the authentication key, which lets an attacker forge messages. Nonce reuse can be observed, to some extent, but can also be a low-probability event that results from subtleties in state management or concurrent programming.
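The confidentiality half of a nonce-reuse failure can be sketched in a few lines. GCM encrypts with a counter-mode keystream, so the example below uses a toy keystream built from HMAC-SHA256 as a stand-in. It is an assumption made so the code runs on the standard library alone, and it does not model GCM's authentication tag or the loss of the authentication key.

```python
import hashlib
import hmac
import secrets


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy counter-mode keystream; stands in for the one GCM derives from AES."""
    out = b""
    counter = 0
    while len(out) < length:
        block_input = nonce + counter.to_bytes(8, "big")
        out += hmac.new(key, block_input, hashlib.sha256).digest()
        counter += 1
    return out[:length]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt(key: bytes, nonce: bytes, plaintext: bytes) -> bytes:
    return xor(plaintext, keystream(key, nonce, len(plaintext)))


key = secrets.token_bytes(32)
nonce = secrets.token_bytes(12)

p1 = b"attack at dawn!!"
p2 = b"retreat at nine!"

# The bug: two messages encrypted under the same key AND the same nonce.
c1 = encrypt(key, nonce, p1)
c2 = encrypt(key, nonce, p2)

# An eavesdropper XORs the ciphertexts and the keystream cancels out.
leaked = xor(c1, c2)
print(leaked == xor(p1, p2))  # True

# Knowing (or guessing) one plaintext now reveals the other, with no key.
print(xor(leaked, p1))  # b'retreat at nine!'
```

Nothing in this failure depends on the key being weak. The only mistake is the repeated nonce, which is why counters that reset after a restart or are shared across threads without coordination are a common root cause.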
Other algorithms have even less obvious modes of misuse. Take the example of hashing passwords for storage. SHA2 might seem like a good choice; after all, it is a good hash. Unfortunately, it is designed for speed, which is the opposite of what you want for password hashing, because a fast hash lets an attacker test guesses quickly. A better selection is Argon2 or bcrypt. Once an appropriate algorithm is selected, it’s still essential that passwords are salted before they are hashed. Otherwise, attackers can pre-compute hashes and quickly reverse large numbers of passwords if the password database is ever compromised.
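As a rough sketch of salted, deliberately slow password hashing: Argon2 and bcrypt are not in Python's standard library, so this example swaps in `hashlib.scrypt`, another memory-hard function, purely so it runs without extra dependencies. The cost parameters and the function names `hash_password` and `verify_password` are assumptions for illustration, not tuned recommendations.

```python
import hashlib
import hmac
import secrets
from typing import Tuple

# Assumed cost settings for illustration only; tune for your own hardware.
SCRYPT_PARAMS = {"n": 2**14, "r": 8, "p": 1, "dklen": 32}


def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, digest). A fresh random salt is generated per password."""
    salt = secrets.token_bytes(16)
    digest = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute with the stored salt and compare in constant time."""
    candidate = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return hmac.compare_digest(candidate, expected)


salt, digest = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, digest))  # True
print(verify_password("Tr0ub4dor&3", salt, digest))                   # False

# Same password, different salt: the stored values do not match, so a
# pre-computed table of hashes is useless against this database.
other_salt, other_digest = hash_password("correct horse battery staple")
print(digest == other_digest)  # False
```

The salt is stored next to the digest. It is not a secret, and its job is only to make every stored hash unique.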
So you’ve chosen a secure algorithm in a secure mode that fits your operational environment. Now you need an implementation. No problem! You can pull one off the shelf from a highly trusted provider such as OpenSSL, BouncyCastle, or a managed Hardware Security Module (HSM) service. You’re not done there, though. This is where things start getting hard.
Unless you are encrypting data at rest, your crypto is probably going to be deployed inside a protocol scheme that enables secure communication. When it comes to protocols, there is rarely an off-the-shelf solution. As a result, developers are often forced to implement their own protocols, even though they understand it is risky.
Most protocol designs are variants of a few well-known schemes. For example, Diffie-Hellman is the foundation for many key exchange protocols, and secure channels are often established through some variant of SSL, IPsec, or SSH. But these protocols are not one-size-fits-all. They frequently need to be modified to meet the constraints of a particular operating environment. What’s often overlooked is that even small deviations from the designs can completely invalidate a security argument. What’s more, the threat model under which the original protocol was designed may not be valid for the environment in which it is deployed.
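One way to see how a small omission can void a security argument is a bare Diffie-Hellman exchange with the authentication step left out. The sketch below is a toy under loud assumptions: the group (`P`, `G`) is far too small for real use, and the function names and the party called Mallory are hypothetical. It is not taken from any of the protocols named above.

```python
import hashlib
import secrets

# Toy parameters for illustration only. 2**127 - 1 is prime, but a group
# this small (and this choice of generator) must never be used in practice.
P = 2**127 - 1
G = 3


def keypair():
    """Pick a random private exponent and derive the public value."""
    private = secrets.randbelow(P - 3) + 2
    return private, pow(G, private, P)


def shared_key(private: int, peer_public: int) -> bytes:
    """Derive a symmetric key from the Diffie-Hellman shared secret."""
    secret = pow(peer_public, private, P)
    return hashlib.sha256(secret.to_bytes(16, "big")).digest()


# Honest run: both sides arrive at the same key.
a_priv, a_pub = keypair()
b_priv, b_pub = keypair()
print(shared_key(a_priv, b_pub) == shared_key(b_priv, a_pub))  # True

# Unauthenticated run: Mallory intercepts both public values and
# substitutes her own. Neither side can tell the difference.
m_priv, m_pub = keypair()
alice_key = shared_key(a_priv, m_pub)  # Alice believes m_pub is Bob's
bob_key = shared_key(b_priv, m_pub)    # Bob believes m_pub is Alice's

# Mallory can compute both keys, so she can read and re-encrypt everything.
print(alice_key == shared_key(m_priv, a_pub))  # True
print(bob_key == shared_key(m_priv, b_pub))    # True
```

The arithmetic is identical in both runs. What separates a sound protocol from a broken one is whether the public values are authenticated, and that is exactly the kind of step that gets trimmed when a design is adapted to a constrained environment.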
A perfect example of this comes with the Bluetooth protocol. Bluetooth uses a relatively standard set of cryptographic primitives and protocol constructs to pair and authenticate devices. Although the Bluetooth protocols were developed under the scrutiny of a standards committee, we see a host of damaging new flaws emerge every year. Recent examples of man-in-the-middle and cryptographic strength downgrade attacks are just the tip of the iceberg. The attack types aren’t new, yet they re-emerge every time a new protocol gets adopted.
For most applications, it’s easy to avoid rolling your own cryptographic algorithms. Making good decisions around algorithm choice, algorithm use, and (especially) protocol design is much more difficult.
Our advice at Galois is to always use the biggest pre-built building blocks possible that meet your needs. For primitives, you can consider using the highest-level interfaces of a library such as NaCl. For protocols, see if something such as an existing TLS implementation will meet your needs. In general, something widely used and with fewer configuration choices will be harder to misuse than something highly configurable.
Unfortunately, this answer only gets us so far. Time, speed, maintenance, and legacy demands can all get in the way of sticking to the safest path for cryptography implementations. If you absolutely must roll your own cryptographic design, our advice is to move slowly, audit extensively, and presume that your system contains security bugs.
If you’re wondering where to start, please reach out. We’d love to talk with you to help you understand where your cryptography might be exposing you to risk and what you can do about it.
Thanks to Christoff Elce for pointing out an inaccuracy about OCB, and to Reddit user /u/ScottContini for pointing out a serious mistake describing SHA2 as acceptable for password hashing.