Jauhar's Blog


X3DH: Making Sense of Signal’s Key Exchange

2025/07/30

Extended Triple Diffie-Hellman (X3DH) is the key-exchange algorithm used by Signal for its end-to-end encryption (E2EE) messaging app. It is a complex algorithm that comprises multiple runs of DH (Diffie–Hellman) key exchanges. In this post, I’m going to try to make sense of why such a complex algorithm is needed, why a simple DH key exchange is not enough, and why we can’t just use TCP over TLS for a secure messaging app.

The main idea of the key exchange algorithm is to enable two parties, let’s call them Alice and Bob, to agree on a single value that they will use as a key to encrypt their messages. In practice, Signal uses the Double Ratchet algorithm for its encryption; however, the details of the encryption are outside the scope of this post. This post focuses more on how Alice and Bob agree on the same secret key.

In this post, I will assume that it is possible to intercept any traffic between two parties. I’m not going to talk about how to do it, and we will just assume it can be done. I will also use the term Alice, Alice’s phone, and Alice’s device interchangeably. The same thing goes for Bob.

Strategy 1: Manual Key Exchange

Let’s start with the simplest strategy. What if we just agree on the shared secret key manually? Alice will need to meet with Bob physically, make sure that it is really Bob, and agree on the same secret key. This strategy works, but quite impractical for some people. It can be difficult for Alice and Bob to meet physically if they are very far away. If their key is leaked, they need to meet again and agree on a different key, which makes it quite difficult as well.

Apart from the inconvenience, this strategy is actually very secure if done correctly. Assuming they can authenticate each other well and guarantee nobody listen to their conversation, it is the most secure strategy compared to all of other strategies we will talk about.

Strategy 2: TCP Over TLS

A better strategy would be to use the internet to establish a shared secret key. We can use TCP over TLS to agree on the key. Alice can generate a key and send it to Bob through TCP over TLS. The problem is, Bob is not always online at the moment. Waiting until Bob is online is not better because when Bob is online, Alice might be offline. To mitigate this, we can have a server that stores Alice’s message to Bob and sends it to Bob when Bob is online. This strategy is better because Alice and Bob don’t have to physically meet. If their key is compromised, they can just create a new key in the same way. This can also be done asynchronously without both parties being online at the same time. However, now it involves a server to which we will entrust the secret key to the server. The server knows their secret key, which we don’t want.

There are other problems with strategy 2, such as authentication. How does Bob make sure that the message is really from Alice? For all he knows, the message could be from the server pretending to be Alice. There are ways to solve this problem involving digital signatures, but I will talk about it in the other strategies.

In practice, it is also impractical to use TCP since most users don’t have static IP address and thus make it difficult for the service discovery process. Most internet provider puts their users behind NAT, which also makes it impossible to establish TCP connection with them.

Clearly, we need a server to support asynchronous communication and user discovery. Let’s keep the server. Now, instead of sending the key directly through the server, let’s use DH key exchange. As you know, by using DH key exchange, two parties can agree on a shared secret key without exposing the key through their channel. What this means is, Alice and Bob can agree on a shared secret key using a server as the channel, without exposing the shared secret key to the server.

Diffie-Hellman Key Exchange

To understand this post, you need to understand how DH key exchange works. From the high level, it goes like this: each party, in this case Alice and Bob, generates a public and private key pair. The public keys, as the name suggests, are publicly exposed to anyone. Alice knows Bob’s public key, and Bob knows Alice’s public key. The private keys, as the name suggests, are only known by the owner. Alice doesn’t know Bob’s private key, and Bob doesn’t know Alice’s private key. The shared secret between Alice and Bob is calculated by combining Alice’s private key with Bob’s public key or Bob’s private key with Alice’s public key.

Diffie-Hellman Key Exchange

Alice calculates the shared secret key using her own private key combined with Bob’s public key. Bob calculates the shared secret key using his own private key combined with Alice’s public key. The DH algorithm is designed such that those calculations result in the same value. It is also practically impossible to know the private key given only the public key.

Mathematically, classic DH key exchange can be calculated this way:

$$ \begin{equation} \begin{split} S_A &= \text{Alice's private key} \\ S_B &= \text{Bob's private key} \\ P_A &= \text{Alice's public key} = \alpha^{S_A}\bmod N \\ P_B &= \text{Bob's public key} = \alpha^{S_B}\bmod N \\ \\ \alpha &= \text{agreed integer} \\ N &= \text{agreed big integer} \\ \\ D &= \text{shared secret key} \\ D_A &= P_B ^ {S_A}\bmod N \\ D_B &= P_A ^ {S_B}\bmod N \\ D &= D_A = D_B = \alpha^{S_A \times S_B}\bmod N \end{split} \end{equation} $$

Only \(P_A\) and \(P_B\) are exposed through the network. Alice can calculate \(D_A\) since she has \(P_B\) and \(S_A\). Bob can calculate \(D_B\) since he has \(P_A\) and \(S_B\). Knowing \(P_A\) and \(P_B\) is not enough to calculate \(D\) since we need to know either \(S_A\) or \(S_B\) as well. It is practically impossible to construct \(S_A\) just by knowing \(\alpha^{S_A}\bmod N\). In practice, the classic DH key exchange is rarely used. Instead, the modern DH is done using elliptic curve operations. The main idea is similar.

By using DH key exchange, the shared secret key is never exposed to the server. We only expose Alice’s public key and Bob’s public key to the server. The server can’t calculate the shared secret key just by knowing these two keys.

For simplicity, let’s define some notation. Suppose \(K_A\) is Alice’s key pair containing the public key and private key used for DH key exchange, and \(K_B\) is Bob’s key pair. \(DH\) is defined as the function that calculates the shared secret key between Alice and Bob using the DH key exchange.

$$ \begin{align*} K_A &= \text{Alice's key pair} \\ K_B &= \text{Bob's key pair} \\ S &= DH(K_A, K_B) \\ &= \text{the shared secret key between Alice and Bob} \end{align*} $$

Strategy 3: Single Diffie-Hellman

Let’s continue. Now, each user will have to generate a public and private key pair and publicly share their public key. You can imagine this public key is like their phone number or ID. If Alice wants to connect to Bob, she just needs to combine his private key with Bob’s public key, which she already knows. How does Alice know Bob’s public key? Well, it should be from another channel outside our protocol. Maybe she meets Bob somewhere and exchanges public keys. Or maybe she knows from a trusted friend.

We get the authentication feature for free here. Because only Bob knows Bob’s private key, Alice can be sure only she and Bob can know the shared secret key. And thus, Alice can be sure that she is communicating with Bob. Of course, this is assuming Bob’s secret key is not compromised.

What happens when Bob’s secret key is compromised? You might think, “Well, if your key is compromised, your identity is stolen, and all bets are off, anything can happen”. While this is true, we want to make the impact of compromise as little as possible. So, what’s the impact here? First, of course, the attacker who steals Bob’s private key can pretend to be Bob in front of anyone because he can do whatever Bob can. But, additionally, he can also become anyone in front of Bob. He can pretend to be Alice in front of Bob. Because, remember, to establish a shared secret key between Alice and Bob, you need to either combine Alice’s public key with Bob’s private key or Bob’s public key with Alice’s private key. So, by knowing Bob’s private key, the attacker can calculate the combination between Alice’s public key and Bob’s private key and pretend to be Alice in front of Bob. This is called key-compromise impersonation.

To make things worse, once Bob’s private key is compromised, the attacker can calculate all the secret keys Bob has calculated for his communication. If, let’s say, Bob has been having a conversation with ten people and his private key is stolen, the attacker can now calculate the shared secret keys between Bob and these ten people. This is very bad. Imagine if the attacker is the server itself. All messages go through the server, and the server stores those messages. In a normal case, the server can’t read the messages because they are encrypted with the shared secret between the two parties. But, once Bob’s private key is compromised, the server can calculate the shared secret key between Bob and whoever is communicating with Bob and then decrypt all the messages between them.

What this means is that if Bob’s private key is compromised, all past and future communications are also compromised. The ability to protect past messages from future key compromises is called forward secrecy. Currently, we don’t have this property.

Note that, as long as the public and private keys are generated securely, it is practically impossible to crack the private key given only the public key. However, the attacker might steal our phone and get the keys stored in our phone. Even if we deleted all our message history, the attacker can still recover our messages if they know our private key. Remember, the attacker might be the server itself, which stores all of our messages.

Strategy 4: Two Diffie-Hellman

If you think about it, as long as we store our private key, the attacker can always steal our phone and get the key. To enable forward secrecy, DH key exchange typically uses an ephemeral key pair. The key is generated just before the communication, and deleted once the shared secret key is calculated. By doing this, if we delete all of our message history, the attacker can’t reconstruct them by stealing our phone. The attacker can’t recalculate the shared secret key because the private key we used has been deleted.

We can follow the same strategy by deleting our private key and creating a new one periodically. Unfortunately, this is very inconvenient. Our private key represents our identity; deleting our private key is like deleting our account. Deleting our private key means we can’t communicate with other people anymore. We need to tell all our friends that we have a new key.

What if, instead of having only one key pair, we have two? Now we have a key pair that represent our identity, which we hold forever, and another ephemeral key pair generated each time we want to communicate with someone.

$$ \begin{equation} \begin{split} IK_A &= \text{Alice's identity key pair} \\ IK_B &= \text{Bob's identity key pair} \\ EK_A &= \text{Alice's ephemeral key pair} \\ EK_B &= \text{Bob's ephemeral key pair} \end{split} \end{equation} $$

Now, the key exchange involves two DH runs. It goes like this:

Strategy 4

Function \(KDF\) in the equation above derives a new key based on its inputs. It stands for Key Derivation Function. How function \(KDF\) works is outside the scope of this post. For simplicity reason, you can think \(KDF\) as a function that concatenates all of its inputs.

There is one big problem with this approach: both parties have to be online during the key exchange process. To calculate \(DH_1\), both \(EK_A\) and \(EK_B\) are needed. Which means, when Alice wants to communicate with Bob, Alice needs Bob to generate an ephemeral key pair and send her the public key. In short, no asynchronous communication is supported by this strategy.

During key exchange, there is one initiator and one responder. The initiator is the one who wants to start the communication. Let’s say Alice wants to establish a secure connection with Bob. In that case, Alice is the initiator and Bob is the responder. Supporting asynchronous key exchange means we should assume that the responder might not be online. We can assume the initiator is always online.

To support asynchronous key exchange, we can use a key that is persistent but not necessarily long-lived. Now, the responder should have one more key pair called prekey. Like an identity key, a prekey comprises of public and a private key and can be used to run a DH key exchange. However, unlike an identity key, a prekey is short-lived. Every user has a prekey key pair, where the public key is stored in the server, and the private key is stored in the user’s device. The prekey will also be renewed periodically, let’s say every one week.

$$ \begin{equation} \begin{split} IK_A &= \text{Alice's identity key pair} \\ IK_B &= \text{Bob's identity key pair} \\ EK_A &= \text{Alice's ephemeral key pair} \\ EK_B &= \text{Bob's ephemeral key pair} \\ SPK_A &= \text{Alice's prekey key pair} \\ SPK_B &= \text{Bob's prekey key pair} \end{split} \end{equation} $$

Now, the key exchange is modified a little:

Strategy 4 Modified

Of course, the server has to store the \(EK_A\)’s public key, so that \(B\) can fetch it later once he is online.

We still have the authentication feature from the calculation of \(DH_2\), just like before. From Alice’s POV, only Bob can calculate \(DH_2\) since nobody has Alice’s private identity key and only Bob has both Alice’s public identity key and Bob’s private identity key. The server itself can’t calculate \(DH_2\) because it knows neither Alice’s nor Bob’s private identity key. The server can’t pretend to be Bob in front of Alice because Alice knows beforehand Bob’s public identity key.

The server can lie to Bob and give him a malicious ephemeral key that isn’t Alice’s. However, it won’t matter because the server can’t calculate \(DH_2\), which means it can’t calculate the shared secret key. So, Bob would think that he is communicating with Alice, but in reality, he is communicating with the server. However, the server can’t know what the messages are since the server can’t calculate the shared secret key.

We can actually make a better protocol to make Bob perfectly sure that he is communicating with Alice. The protocol is left as an exercise for the reader

What happens if the server steals all of our private keys? Remember, the server has all of our encrypted messages, and we should assume that the server is malicious. What happens if the server successfully steals both our identity and prekey keys? Just like strategy 3, the server can calculate all of our shared secret keys with our friends. The difference is, the server can only get the shared secret key for the past one week because we periodically generate a new prekey key pair every week.

This doesn’t mean you are safe. You still need to create a new identity key pair, which is almost like creating a new account. You also need to tell all of your friends about your new public identity key as well. But, unlike strategy 3, only shared secret keys that was generated in the past one week are compromised, depending on the expiration of the prekey.

What happens if the user’s phone is offline for more than one week and they can’t generate a new prekey key pair? What happens if the server refuses to take the renewed prekey? What happens if the initiator uses the responder’s old prekey? These questions are left as an exercise for the reader.

It is quite clear that if Bob’s private identity key is stolen by the server, the server can pretend to be Bob in front of anyone. To pretend as Bob in front of Alice, the server can calculate \(DH_2\) by using Bob’s private identity key combined with Alice’s public identity key. To calculate \(DH_1\), the server can just generate a new ephemeral key pair and use it with Alice’s prekey. Remember, Alice doesn’t know Bob’s ephemeral key at all. Ephemeral keys only exist the moment we want to perform the key exchange. Alice knows Bob’s ephemeral key by fetching it from the server. Normally, the server is honest. It will give Alice Bob’s ephemeral key because there is no reason for the server to lie. After all, the server can’t calculate \(DH_2\). But, if Bob’s private identity key is stolen by the server, the server can now calculate \(DH_2\), and it can pretend to be Bob by lying to Alice by giving him an ephemeral key that the server generates itself.

Additionally, we are still prone to a key-compromise impersonation attack. If Bob’s private identity key is stolen by the server, the server can also pretend to be anyone in front of Bob. Let me emphasize that stealing only the identity key itself (without also stealing the prekey) is enough. Let’s say the attacker wants to pretend to be Alice in front of Bob. Just as before, the attacker can calculate \(DH_2\) using Alice’s public identity key and Bob’s stolen private identity key. To calculate \(DH_1\), Bob uses his private prekey combined with Alice’s public ephemeral key. In turn, if you were Alice, you would use your private ephemeral key combined with Bob’s public prekey. It seems like the server can’t calculate \(DH_1\) because it knows neither Bob’s private prekey nor Alice’s private ephemeral key. However, Bob is relying on the server to give him Alice’s public ephemeral key. The server can lie to Bob. The server can generate its own ephemeral key pair and tell Bob that this is Alice’s. Bob can’t know that it is not Alice because, for all he knows, only he and Alice can calculate \(DH_2\). But, since Bob’s identity key is stolen, the server can calculate \(DH_2\) as well.

Strategy 5: Double Diffie-Hellman

Let’s fix our strategy to prevent a key-compromise impersonation attack in strategy 4.

Let’s assume that the server steals Bob’s private identity key. We want to create a protocol such that Bob can tell if the server is trying to pretend to be someone in front of him. Since the server steals Bob’s private identity key, any DH key exchange performed with this key is compromised by the server. There is nothing we can do about it.

Let’s take a look at Bob’s prekey. In strategy 4, the server can compromise \(DH_1\) because the server can lie about Alice’s ephemeral key. What if, instead of using Alice’s ephemeral key in \(DH_1\), we use Alice’s identity key, which is known by Bob? If we do this, the server can’t compromise \(DH_1\) since Bob doesn’t need the server to give him Alice’s identity key. Here is how it goes:

Strategy 5

This solves the above problem. Now, if Bob’s identity key is stolen by the server, the server can’t pretend to be Alice in front of Bob. The server can calculate \(DH_2\) because it knows Bob’s private identity key. However, it can’t calculate \(DH_1\) since it knows neither Bob’s private prekey nor Alice’s private identity key.

If you look closely, you’ll notice that now Alice’s ephemeral key is unused, which seems odd. You are correct, and we are losing something here. Remember, the reason why we have an ephemeral key in the first place is to have forward secrecy. Now that we don’t use Alice’s ephemeral key, stealing Alice’s identity key means the attacker can recalculate \(DH_1\) and \(DH_2\) and the shared secret key. Not only that, the attacker can recalculate the shared secret keys between Alice and anyone who interacted with Alice, since we only need Alice’s identity key to perform \(DH_1\) and \(DH_2\). So, in order to have forward secrecy like we have in strategy 4, we have to use Alice’s ephemeral key.

We can modify \(DH_2\) a little bit. Instead of combining Bob’s identity key with Alice’s identity key, we can combine Bob’s identity key with Alice’s ephemeral key. If the server steals Bob’s identity key, it doesn’t matter whether we run \(DH_2\) with Alice’s identity key or Alice’s ephemeral key; the server can still compromise it. So, we might as well run it with Alice’s ephemeral key to have forward secrecy. The overall protocol looks like this:

Strategy 5 Modified

We still have the authentication feature. Bob can authenticate Alice because only Alice can calculate \(DH_1\), and Bob knows Alice’s public identity key. Alice can authenticate Bob since only Bob can calculate \(DH_2\), and Alice knows Bob’s public identity key.

This strategy is also known as Double Diffie-Hellman.

Now, there is one little problem with this protocol. If both Alice’s and Bob’s identity keys are stolen by the server, the server can calculate the shared secret key. Since Alice’s identity key is stolen, the server can calculate \(DH_1\). And since Bob’s identity key is stolen, the server can calculate \(DH_2\). By knowing \(DH_1\) and \(DH_2\), the server can calculate the shared secret key.

Strategy 6: Triple Diffie-Hellman

Solving the above problem is quite simple, actually. We just need to have another DH between Bob’s prekey and Alice’s ephemeral key. The protocol goes like this:

Strategy 6

By having \(DH_3\), stealing both Alice’s and Bob’s private identity keys doesn’t break the protocol. The attacker can calculate \(DH_1\) and \(DH_2\), but not \(DH_3\).

This strategy is also known as Triple Diffie-Hellman.

Let’s think about how we can break this protocol. To break this, the attacker has to break \(DH_1\), \(DH_2\), and \(DH_3\). There are several options they can do:

  1. Steal both Bob’s identity key and prekey.
  2. Steal both Alice’s identity key and ephemeral key.
  3. Steal Bob’s prekey and Alice’s ephemeral key.

Options 2 and 3 are quite difficult since the private ephemeral key is deleted once the key exchange is performed by Alice. Remember, in this case, Alice is the initiator, so she can delete it right away, and the private ephemeral key is never stored in any persistent storage like SSD. In practice, the private ephemeral key only exists in the device memory for a very short amount of time, like in a few milliseconds. Even though the public ephemeral key is stored in the server, doing a brute-force attack to find the private ephemeral key is practically impossible.

Option 1 is more doable. Both Bob’s identity key and prekey are stored in Bob’s device persistently. This is needed because Bob needs to use it for the key exchange protocol. The attacker can steal Bob’s phone and dump its SSD to get the identity key and prekey. We have no perfect forward secrecy here. We’ve been trying to reduce the risk of this attack by making the prekey short-lived. Stealing Bob’s identity key and prekey can only compromise the shared secret keys that are generated for the past one week since Bob’s prekey is always renewed once a week.

Strategy 7: Extended Triple Diffie-Hellman (X3DH)

Let’s improve our forward secrecy a little bit more. In strategy 6, if someone steals your phone and extracts both your identity key and prekey, they will be able to reconstruct all the shared secret keys for the past one week, depending on the expiration time of your prekey. When this happens, you might know that someone is stealing your phone and can tell all of your friends that your phone is stolen and to not trust your account anymore. You can create a new account with a new identity key and prekey. By doing this, the attacker can’t pretend to be you in front of your friend. However, all of your shared secret keys for the past one week are exposed to the attacker.

We can reduce the impact a little bit by adding one more DH key exchange and a new type of key. Notice that, in order to have good forward secrecy, you should delete the private key you use for DH key exchange. Once you delete your private key, not only the attacker can’t calculate the shared secret key, but you yourself can’t even do it. This is why we have an ephemeral key and delete the private ephemeral key once \(DH_2\) and \(DH_3\) are calculated. The problem is that \(DH_2\) can be calculated using the responder’s private identity key, and \(DH_3\) can be calculated using the responder’s private prekey, which are not ephemeral. Well, let’s introduce a new key on the responder’s side that is also deleted once the responder calculates the shared secret key.

We can add one more type of key called one-time prekeys. A user can have multiple one-time prekeys. Let’s say people have ten one-time prekeys on average. Just like the prekey, the user stores the private one-time prekeys in their device and puts their public one-time prekeys on the server. We can add one more DH key exchange between the initiator’s ephemeral key and one of the responder’s one-time prekeys. Once the responder calculates the shared secret key, the responder deletes the private one-time prekey. By doing this, if the responder’s phone is stolen, the attacker can’t calculate the shared secret key because the private one-time prekey that was used to calculate that shared secret key is already deleted.

$$ \begin{equation} \begin{split} IK_A &= \text{Alice's identity key pair} \\ IK_B &= \text{Bob's identity key pair} \\ EK_A &= \text{Alice's ephemeral key pair} \\ EK_B &= \text{Bob's ephemeral key pair} \\ SPK_A &= \text{Alice's prekey key pair} \\ SPK_B &= \text{Bob's prekey key pair} \\ OPK_A^i &= \text{Alice's i-th one-time prekey} \\ OPK_B^i &= \text{Bob's i-th one-time prekey} \\ \end{split} \end{equation} $$

Overall, the protocol will look like this:

Strategy 7

Let me give you an example. Let’s say Alice wants to initiate communication with Bob. Before, Bob already had a bunch of keys. In his device, Bob has a private identity key, a private prekey, and ten private one-time prekeys. The server also stores Bob’s public prekey and ten of Bob’s public one-time prekeys. Bob is currently offline, and Alice is trying to initiate communication with Bob.

To initiate communication with Bob, Alice generates an ephemeral key pair and fetches Bob’s information from the server. Then, Alice chooses one of Bob’s one-time prekeys. From this, Alice can calculate \(DH_1\), \(DH_2\), \(DH_3\), and \(DH_4\). Since Alice knows Bob’s public identity key beforehand, Alice can verify if the server gives him the correct data (or at least the data that can’t be misused by the server to sniff their messages). After calculating \(DH_1\), \(DH_2\), \(DH_3\), and \(DH_4\), Alice can delete her private ephemeral key and calculate the shared secret key. Since Alice’s ephemeral key is deleted, it is impractical to steal the shared secret key from Alice’s device since the ephemeral key is already deleted from her device.

Once online, Bob will fetch all the connection requests from the server and receive Alice’s initialization message containing Alice’s public ephemeral key and Bob’s one-time prekey that was chosen by Alice. From this information, Bob will calculate the shared secret key using the combination of Alice’s public identity key, Alice’s public ephemeral key, Bob’s private identity key, Bob’s private prekey, and Bob’s chosen private one-time prekey. Once done, Bob will delete the chosen private one-time prekey from his device. Since the private one-time prekey used for calculating the shared secret key is deleted, it is impractical to steal the shared secret key by stealing Bob’s phone because the private one-time prekey is already deleted.

This strategy is the one known as X3DH (Extended Triple Diffie-Hellman). This is the strategy that is used by Signal for its key exchange.

There is one small problem, though. If Bob is offline when Alice is trying to connect with him and Bob’s device is stolen before Bob has the chance to delete the one-time prekey, the attacker can calculate the shared secret.

You might already have noticed that this strategy might not always be applicable. What happens if Bob only has ten one-time prekeys but twenty people are trying to communicate with him? In the Signal’s paper, when this happens, we will fallback to strategy 6 (Triple Diffie-Hellman) without the one-time prekey. This means we have to put trust in the server. The server can perform an attack on us by refusing to tell the initiator the responder’s one-time prekeys, making the key exchange always fall back to strategy 6. Remember, the responder might be offline during the key exchange, so the responder can’t really tell the initiator that the server is lying. When the initiator initiates key exchange, and the server says the responder doesn’t have any one-time prekey, the initiator has no choice but to believe the server.

Additionally, anyone can perform an attack on the responder by bombarding the responder with key exchange requests, which drains the responder’s one-time prekeys, and forcing the responder to always fallback to strategy 6. This can be mitigated by the server using some kind of rate limiter. But again, this requires us to have trust in the server.

Where To Go From Here

This post aims to explain the intuition behind why X3DH is designed that way, what it tries to protect, and why the protocol is complex. There are some details that I skipped, some ideas that were oversimplified, and some other attacks that might be done by an attacker. To name a few, I don’t really touch how the authentication works in practice, how the shared secret key is used, protection against replay attack, and the deniability property. You can check Signal’s X3DH paper to get more details.