Adolfo Ochagavía

What the heck is AEAD again?

Here’s a problem you might be familiar with: I keep forgetting what exactly AEAD means and why you would ever use it. Yes, I know the acronym stands for “Authenticated Encryption with Associated Data”, but does that really clarify anything? Not to me, so I’ve finally decided to sit down and write this blog post as a reference for my future self… and for anyone else who finds AEAD hard to retain.

Why bother at all?

Simply put, AEAD encryption is the current industry standard. That sounds like a good reason to bother, at least if you care about Understanding Your Building Blocks. You don’t have to take my word for it, though. Below are some relevant data points:

The list could be longer, but this is hopefully enough to prove AEAD is here to stay. Or as Thomas Ptacek put it in his famous Cryptographic Right Answers: “[AEAD] is the only way you want to encrypt in 2015”. (Yes, that was 10 years ago.)

Part 1 - Authenticated Encryption

Authenticating what?

When I think of authentication, the association in my mind is that of logging in to a website. In cryptography, however, authentication means proving the encrypted message is authentic, i.e. that it wasn’t altered after encryption and thus originates in its entirety from someone with access to the secret key.

Authentication is not merely a “nice to have” feature, as you might initially think. It’s often a basic[1] condition[2] for the security of the system. In some cases, for instance, lack of authentication can let an interceptor decrypt messages even without having the secret key!
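
To make the danger a bit more tangible, here is a minimal sketch of the simplest failure mode: altering a message without knowing the key. It assumes a toy cipher (a one-time pad that XORs the message with a random keystream), picked only because Python's standard library ships no AES. The helper `xor_bytes` and the message are made up for illustration. Stream-style modes such as AES-CTR also XOR a keystream into the plaintext, so they can be tampered with in the same way.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings of equal length."""
    assert len(a) == len(b)
    return bytes(x ^ y for x, y in zip(a, b))


# Sender: encrypt WITHOUT authentication (toy one-time pad, illustration only).
plaintext = b"pay 100 to bob"
keystream = secrets.token_bytes(len(plaintext))  # plays the role of the secret key
ciphertext = xor_bytes(plaintext, keystream)

# Interceptor: never sees the keystream, but guesses the message layout
# and flips exactly the bits that turn the "1" at offset 4 into a "9".
delta = bytearray(len(ciphertext))
delta[4] = ord("1") ^ ord("9")
tampered = xor_bytes(ciphertext, bytes(delta))

# Receiver: decryption "succeeds" and nothing signals the change.
received = xor_bytes(tampered, keystream)
print(received)  # b'pay 900 to bob'
```

The receiver has no way to tell that the amount was changed in transit, which is exactly the gap an authentication tag is meant to close.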

Towards sane(r) defaults

Back when I first studied cryptography[3], it was common practice to perform encryption and authentication in separate steps. You would pick an encryption scheme (e.g. AES-256 in CBC mode), an authentication scheme (e.g. HMAC-SHA256), and carefully knit them together in your code to ensure everything was properly authenticated[4].

The following pseudo-code shows the encryption and decryption process from those days:

# Sender: encrypt and generate authentication tag
(nonce, ciphertext) = encrypt(key, "hello world")
tag = hmac(key, nonce + ciphertext)
send(nonce, ciphertext, tag)

# Receiver: verify authentication tag and decrypt
(nonce, ciphertext, tag) = receive()
assert tag == hmac(key, nonce + ciphertext)
assert decrypt(key, nonce, ciphertext) == "hello world"

Quite a mouthful, isn’t it? Not as simple as merely calling encrypt and decrypt. No wonder people often messed up, like in the case of Apple’s iMessage vulnerability caused by… failing to include an authentication step altogether! By the way, even if you remember to authenticate, you still need to apply encryption and authentication in the right order, or The Cryptographic Doom Principle will come for you.
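
The receiving half of that pseudo-code glosses over two details that are easy to get wrong: the tag comparison should run in constant time, and the tag has to be checked before anything gets decrypted. Below is a minimal sketch of just the tagging and verification steps, using only Python's standard library. The ciphertext is a random stand-in for real cipher output, `mac_key` is assumed to be a key separate from the encryption key (a commonly recommended precaution), and `make_tag` / `verify_tag` are hypothetical helper names.

```python
import hashlib
import hmac
import secrets


def make_tag(mac_key: bytes, nonce: bytes, ciphertext: bytes) -> bytes:
    """Compute HMAC-SHA256 over the nonce followed by the ciphertext."""
    return hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()


def verify_tag(mac_key: bytes, nonce: bytes, ciphertext: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    expected = make_tag(mac_key, nonce, ciphertext)
    return hmac.compare_digest(expected, tag)


mac_key = secrets.token_bytes(32)     # assumption: not the same key used for encryption
nonce = secrets.token_bytes(16)       # fixed length, so nonce + ciphertext is unambiguous
ciphertext = secrets.token_bytes(11)  # stand-in for the output of a real cipher

# Sender: tag the ciphertext (encrypt first, then authenticate).
tag = make_tag(mac_key, nonce, ciphertext)

# Receiver: verify first, and only decrypt if verification passes.
untouched_ok = verify_tag(mac_key, nonce, ciphertext, tag)

# An interceptor flips a single bit of the ciphertext.
tampered = bytes([ciphertext[0] ^ 0x01]) + ciphertext[1:]
tampered_ok = verify_tag(mac_key, nonce, tampered, tag)

print(untouched_ok, tampered_ok)  # True False
```

A plain `==` on the tags would also return the right answer, but it may leak timing information about how many leading bytes matched, which is why `hmac.compare_digest` is used here.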

Fortunately, over the last decade the industry has introduced primitives that are more resistant to misuse. Have a look at the pseudo-code below (a simplified version of libsodium’s crypto_secretbox_easy and crypto_secretbox_open_easy functions):

# Sender: encrypt, including an authentication tag in the ciphertext
(nonce, ciphertext) = encrypt_auth(key, "hello world")
send(nonce, ciphertext)

# Receiver: verify message authenticity and decrypt
# (`decrypt_auth` throws an exception if verification fails)
(nonce, ciphertext) = receive()
assert decrypt_auth(key, nonce, ciphertext) == "hello world"

Nice, isn’t it? Under the hood, this API is still using separate steps for encryption and authentication, but users of the API can’t mess up anymore. For someone like me, who leans heavily on an API’s design to guide me towards writing correct code, this is way better than older APIs that let you shoot yourself in the foot.

Part 2 - Associated Data

But why?

We have adopted authenticated encryption. Isn’t that enough to keep our messages secret? What’s all the “associated data” fuss about? Why the extra complexity?

Authenticated encryption is indeed enough to keep messages secret, but it turns out that you often need to send unencrypted data together with your encrypted message. That piece of unencrypted data is what cryptographers mean by “associated data”. Let me illustrate this with an example.

Imagine you are developing a multi-user chat application. When two users engage in a conversation, they negotiate a secret key and start exchanging messages through a server. As you might expect, the server is unable to see the content of the messages, since they are encrypted. Still, when new messages get sent, the server needs access to the user id of the receiver to properly route a message to them. For that purpose, when a client sends an encrypted message, it also includes the receiver’s user id in unencrypted form. In other words, the receiver’s user id is sent as associated data of the encrypted message.

Now what happens if a man-in-the-middle intercepts the message and replaces the original receiver’s user id with a different user id? There are two possibilities:

Let’s authenticate

Similar to authenticating an encrypted array of bytes, we can use an authentication scheme (e.g. HMAC-SHA256) to authenticate an encrypted message together with its associated data. Something like:

# Sender: encrypt and send together with tagged associated data
associated_data = "an unencrypted string"
(nonce, ciphertext) = encrypt(key, "hello world")
tag = hmac(key, nonce + ciphertext + associated_data)
send(nonce, ciphertext, associated_data, tag)

# Receiver: verify encrypted and associated data, then decrypt
(nonce, ciphertext, associated_data, tag) = receive()
assert tag == hmac(key, nonce + ciphertext + associated_data)
assert decrypt(key, nonce, ciphertext) == "hello world"

Quite a mouthful again, right? In fact, this looks complex enough in my eyes that I’m not even confident it’s correct… Couldn’t cryptography libraries make our lives easier? I’d rather trust them than my own code for something like this.
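
One concrete reason for that unease: plain concatenation does not record where the ciphertext ends and the associated data begins, so two different (ciphertext, associated data) pairs can feed identical bytes into the MAC. A common remedy is to prefix each field with its length. The sketch below shows both behaviours using only Python's standard library. `frame`, `naive_tag`, and `framed_tag` are hypothetical helpers, the 8-byte big-endian length prefix is an arbitrary choice, and the byte strings are stand-ins for real cipher output.

```python
import hashlib
import hmac
import secrets
import struct


def frame(*fields: bytes) -> bytes:
    """Prefix every field with its length as an 8-byte big-endian integer."""
    return b"".join(struct.pack(">Q", len(field)) + field for field in fields)


def naive_tag(key: bytes, nonce: bytes, ciphertext: bytes, associated_data: bytes) -> bytes:
    """Tag the plain concatenation, as in the pseudo-code above."""
    return hmac.new(key, nonce + ciphertext + associated_data, hashlib.sha256).digest()


def framed_tag(key: bytes, nonce: bytes, ciphertext: bytes, associated_data: bytes) -> bytes:
    """Tag the length-prefixed fields, so field boundaries are authenticated too."""
    return hmac.new(key, frame(nonce, ciphertext, associated_data), hashlib.sha256).digest()


mac_key = secrets.token_bytes(32)
nonce = secrets.token_bytes(16)

# Two different (ciphertext, associated_data) pairs whose concatenations are identical:
# one byte has simply moved from the associated data into the ciphertext.
original = (b"\x01\x02\x03", b"user:42")
shifted = (b"\x01\x02\x03u", b"ser:42")

# Plain == is fine here because this only inspects the demo, it does not verify a message.
naive_collides = naive_tag(mac_key, nonce, *original) == naive_tag(mac_key, nonce, *shifted)
framed_collides = framed_tag(mac_key, nonce, *original) == framed_tag(mac_key, nonce, *shifted)

print(naive_collides, framed_collides)  # True False
```

Whether the ambiguity is exploitable depends on the protocol, for example on whether the ciphertext length is fixed or transmitted separately. It is still the kind of subtlety I would rather not have to reason about myself.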

AEAD to the rescue

As I mentioned above, the industry has moved to primitives that are more resistant to misuse. The same libsodium library we referred to before provides encryption functions that authenticate both the encrypted bits and the associated data. Does that sound familiar? We are finally talking about Authenticated Encryption with Associated Data!

Let’s look at it in more detail. The simplified pseudo-code below has been adapted from libsodium and illustrates AEAD usage in practice:

# Sender: encrypt, including an authentication tag in the ciphertext
# The authentication tag applies to both the encrypted bits and the unencrypted associated data.
associated_data = "an unencrypted string"
(nonce, ciphertext) = encrypt_aead(key, "hello world", associated_data)
send(nonce, ciphertext, associated_data)

# Receiver: verify message authenticity and decrypt
# (`decrypt_aead` throws an exception if verification fails for the encrypted bits or the associated data)
(nonce, ciphertext, associated_data) = receive()
assert decrypt_aead(key, nonce, ciphertext, associated_data) == "hello world"

As you can see, the API now “forces” us to authenticate the encrypted bits and the associated data, preventing a wide range of mistakes. You can still introduce bugs if you try hard enough, but the API at least guides you towards the pit of success.

Part 3 - Using AEAD across libraries

What if you can’t use libsodium? Given the popularity of AEAD, multiple AEAD ciphers have been standardized, which means you can pick the one that suits you best and use it across libraries and programming languages. You might have seen names like AES256-GCM and ChaCha20-Poly1305 out there, so now comes the obvious question: which AEAD primitive should I choose?

I’m not a cryptographer, so unless I have very special requirements, I’d follow whatever Tink’s “choose a primitive” page recommends. Bear in mind, however, that generic cryptography advice is by definition limited. There are situations[5] in which even Tink’s recommendation needs to be taken with a grain of salt. Hopefully your local cryptographer can help you out with their sage advice :)

The End

What the heck is AEAD again? I’m afraid I’ll have to go back to the beginning of this article and read it for a second time…


With special thanks to @ctz and @cpu, who reviewed an early draft of this article, suggested improvements, and verified my claims were accurate. I wouldn’t have dared publish it without their review! Any remaining mistakes are my own, obviously.


  1. This StackExchange comment puts it nicely into words: “By default, encryption should be authenticated encryption, unless you are sure that you don’t need it”. For a mere mortal like me, that means using authenticated encryption in all cases unless convinced otherwise by an expert I trust. ↩︎

  2. This article goes into some of the pitfalls and explains them in detail. At the core of the issue is the concept of malleability. ↩︎

  3. I was lucky to get my hands on the excellent, though now somewhat dated, Cryptography Engineering: Design Principles and Practical Applications. ↩︎

  4. According to this report, this is the exact encryption and authentication combination used by WhatsApp to protect messages. ↩︎

  5. At the time of this writing, Tink recommends using AES128-GCM for your everyday encryption needs. Libsodium, on the other hand, warns against AES-GCM: “despite being the most popular AEAD construction due to its use in TLS, safely using AES-GCM in a different context is tricky. (…) Unless you absolutely need AES-GCM, use AEGIS-256 (…) instead.” Read the linked page for more info. ↩︎