This document will eventually become an XEP.
This document defines what issues need to be addressed for OTRv4 to work correctly over XMPP.
First an overview of how other protocols approach the problem:
From the OMEMO security audit (including some of our comments):
"Assume Alice wants to send an OMEMO encrypted message from her phone. She can
detect that Bob’s device(s) support OMEMO by requesting his device list with
PEP. If he does, she encrypts and authenticates her message using a randomly
generated key (sym_k). For every device that Alice wants to send the encrypted
message to, she fetches the entire bundle via PEP (sic: this means that a Key
Agreement is done per each device: how is the long-term secret key shared?). If
she wants to add more of her own devices in the conversation, she gets their
bundles as well from her own server. Alice creates a PreKeySignalMessage for
every device by picking a random one-time prekey from each bundle and
encrypting the randomly generated key to each device. She combines all
information in a single MessageElement: the encrypted payload (), the
plaintext iv (), the sender id (sid) and the encrypted
random key () tagged with the corresponding receiver id (rid)"
The process works like this:
1. Generate a random `sym_k`.
2. Calculate: `enc_key(32), auth_key(32), IV(16) := SHA-256(sym_k || 0x00 || "OMEMO Payload")`
3. Encrypt: `c := AES_CBC(enc_key, IV || message)`
4. Calculate: `MAC := SHA-256(auth_key || c)`
5. Concatenate: `payload := enc_key || MAC`
6. Execute the double ratchet algorithm and generate a message key `mk`.
7. Calculate: `h_enc_key(32), auth_key(32), IV(16) := SHA-256(mk || 0x00 || "OMEMO Message Key Material")`
8. Encrypt the payload: `h := AES_CBC(h_enc_key, payload)`
9. Send `h || c`.
From step 6 (the double ratchet execution) onward, the process is executed once per device.
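The key derivation and MAC steps can be sketched with the Python standard library. SHA-256 alone outputs only 32 bytes, so this sketch assumes the 80 bytes come from HKDF with SHA-256 (RFC 5869), taking `sym_k` as input key material, an empty salt, and the label as `info`. That mapping, the 32-byte length of `sym_k`, and the names `hkdf_sha256`, `derive_payload_keys` and `payload_mac` are assumptions made for illustration. The AES-CBC steps are left out because the standard library has no AES.

```python
from __future__ import annotations

import hashlib
import hmac
import os


def hkdf_sha256(ikm: bytes, info: bytes, length: int, salt: bytes = b"") -> bytes:
    """HKDF (RFC 5869) with SHA-256: extract, then expand to `length` bytes."""
    if not salt:
        salt = bytes(hashlib.sha256().digest_size)  # RFC 5869 default: all zeros
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_payload_keys(sym_k: bytes) -> tuple[bytes, bytes, bytes]:
    """Split 80 derived bytes into enc_key(32), auth_key(32) and IV(16)."""
    okm = hkdf_sha256(sym_k, b"OMEMO Payload", 80)  # assumed HKDF mapping
    return okm[:32], okm[32:64], okm[64:80]


def payload_mac(auth_key: bytes, ciphertext: bytes) -> bytes:
    """MAC := SHA-256(auth_key || c), as written in the steps above."""
    return hashlib.sha256(auth_key + ciphertext).digest()


sym_k = os.urandom(32)  # assumed key length
enc_key, auth_key, iv = derive_payload_keys(sym_k)
```

Only the payload keys are derived here; the per-device part (the double ratchet and the second derivation) would repeat the same splitting with the message key and the other label.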
This is similar to how Signal handles group messaging.
The Wire protocol allows up to 8 devices per account (7 permanent, plus 1 that can be used temporarily).
From the Wire security paper:
"To send an encrypted message the sending client needs to have a cryptographic session with every client it wants to send the message to (usually all clients of all participants of a particular conversation). It will encrypt the plain text message for every recipient and send the batch to the server. The server checks if every client of every user who is a participant of the conversation is part of the batch. If a client is missing, the server will reject the request and inform the sender of missing clients. The sender can then fetch prekeys for the missing clients and prepare the remaining messages before attempting to resend the entire batch."
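The server-side completeness check described in the quote can be sketched as a set difference. This is a minimal illustration, not Wire's implementation: the function name `missing_clients` and the shape of the `batch` and `participants` structures are hypothetical.

```python
from __future__ import annotations


def missing_clients(
    batch: dict[tuple[str, str], bytes],
    participants: dict[str, set[str]],
) -> dict[str, set[str]]:
    """Return, per user, the client IDs that have no ciphertext in the batch."""
    missing: dict[str, set[str]] = {}
    for user, clients in participants.items():
        absent = {c for c in clients if (user, c) not in batch}
        if absent:
            missing[user] = absent
    return missing


# Hypothetical conversation: Alice has two clients, Bob has one.
participants = {"alice": {"phone", "laptop"}, "bob": {"phone"}}
batch = {("alice", "phone"): b"...", ("bob", "phone"): b"..."}
print(missing_clients(batch, participants))  # {'alice': {'laptop'}}
```

A non-empty result corresponds to the server rejecting the request; the sender would then fetch prekeys for the listed clients and resend the whole batch.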
In Signal, multidevice support is achieved by sharing the private part of the identity key across devices, as described by Moxie.
Signal has also introduced the Sesame protocol for handling devices, but it is unclear whether Sesame is currently in use.
OTRv3 uses instance tags, which uniquely identify a device:
"Clients include instance tags in all OTR version 3 messages. Instance tags are 32-bit values that are intended to be persistent. If the same client is logged into the same account from multiple locations, the intention is that the client will have different instance tags at each location. As shown below, OTR version 3 messages (fragmented and unfragmented) include the source and destination instance tags. If a client receives a message that lists a destination instance tag different from its own, the client should discard the message."
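A minimal sketch of generating an instance tag and applying the discard rule from the quote. The lower bound of 0x00000100 comes from the OTRv3 specification, which reserves smaller values; the function names are hypothetical, and the sketch covers only the rule quoted here, not any special cases of the specification.

```python
import secrets

MIN_INSTANCE_TAG = 0x00000100  # OTRv3: smaller values are reserved


def generate_instance_tag() -> int:
    """Pick a random 32-bit instance tag in the valid range."""
    return MIN_INSTANCE_TAG + secrets.randbelow(2**32 - MIN_INSTANCE_TAG)


def should_discard(destination_tag: int, own_tag: int) -> bool:
    """Apply the quoted rule: discard when the destination tag is not ours."""
    return destination_tag != own_tag


own_tag = generate_instance_tag()  # intended to be stored and reused
```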
If a user had multiple OTRv3 sessions with the same buddy, the application needed to provide some way for the user to select which instance to send outgoing messages to.
In version 3 of the protocol, instance selection followed policies that the user, or the client by default, could define:
- OTRL_INSTAG_BEST: send to the most secure instance, based on conversation status (whether the conversation is in the encrypted, plaintext, or finished state), then fingerprint status (whether it is trusted), then recency.
- OTRL_INSTAG_RECENT: send to the most recent of the two meta instances below.
- OTRL_INSTAG_RECENT_RECEIVED: send to the instance from which a message was most recently received.
- OTRL_INSTAG_RECENT_SENT: send to the instance to which a message was most recently sent.
OTRL_INSTAG_BEST chooses the instance that has the best conversation status, then fingerprint status (in the event of a tie), then most recent (similarly in the event of a tie). When calculating how recently an instance has been active, OTRL_INSTAG_BEST is limited by a one second resolution.
OTRL_INSTAG_RECENT does not have this limitation, but due to inherent uncertainty in some networks, OTR's notion of the most recent may not always agree with the remote network. It is important to understand this limitation because instances do add uncertainty when dealing with networks that only deliver messages to the most recently active session for a buddy who is logged in multiple times. If you have a particular instance selected, and the IM network is simply not going to deliver to that particular instance, there isn't too much OTR can do. In this case, you may want your application to warn when a user has selected an instance that is not the most recent.
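The OTRL_INSTAG_BEST ordering can be sketched as a sort key. This is an illustration, not libotr's code: the `Instance` type and `select_best` are hypothetical, and the ranking of conversation states (encrypted, then plaintext, then finished) is an assumption that follows the order in which the text lists them.

```python
from __future__ import annotations

from dataclasses import dataclass

# Assumed ranking, following the order in which the text lists the states.
CONV_STATUS_RANK = {"encrypted": 2, "plaintext": 1, "finished": 0}


@dataclass
class Instance:
    tag: int
    conv_status: str    # "encrypted", "plaintext" or "finished"
    trusted: bool       # fingerprint status
    last_active: float  # seconds since the epoch


def select_best(instances: list[Instance]) -> Instance:
    """Rank by conversation status, then trust, then recency in whole seconds."""
    return max(
        instances,
        key=lambda i: (CONV_STATUS_RANK[i.conv_status], i.trusted, int(i.last_active)),
    )
```

Truncating `last_active` with `int()` models the one second resolution: two instances active within the same second tie on recency, and `max` then keeps whichever comes first in the list.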
OTR in its version 4 will retain all previous instance tag policies, with the same behaviour:
- OTRL_INSTAG_BEST
- OTRL_INSTAG_RECENT
- OTRL_INSTAG_RECENT_RECEIVED
- OTRL_INSTAG_RECENT_SENT
It will also add a new type of instance tag policy:
- OTRL_INSTAG_SYNCHRONIZE
If this policy is implemented, the application implementing OTRv4 has to keep track of the devices a user has. A maximum of 8 devices is allowed. Every device will keep track of its own key material (long-term and ephemeral), client profile and prekey profile. Sharing key material between devices is not recommended.
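A minimal sketch of the bookkeeping this implies, assuming devices are tracked by instance tag. The `DeviceList` class is hypothetical and only enforces the limit of 8.

```python
from __future__ import annotations

MAX_DEVICES = 8


class DeviceList:
    """Tracks the instance tags of one user's devices, capped at MAX_DEVICES."""

    def __init__(self) -> None:
        self._tags: set[int] = set()

    def add(self, instance_tag: int) -> None:
        if instance_tag in self._tags:
            return  # already known
        if len(self._tags) >= MAX_DEVICES:
            raise ValueError(f"at most {MAX_DEVICES} devices are allowed")
        self._tags.add(instance_tag)

    def remove(self, instance_tag: int) -> None:
        self._tags.discard(instance_tag)

    def __len__(self) -> int:
        return len(self._tags)
```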
The following paragraphs describe the actions to take when starting a conversation. The status of the conversation (online or offline) is always determined before any OTR message is sent.
The following procedure works for online conversations that are not started with an identity message directly:
Bob (using a mobile device), who wants to communicate with Alice, will start by sending her a query message or a whitespace tag. Upon receipt:
- If the initiation message contains the tag for v3, and Alice's receiving device only supports v3, the protocol will continue in v3.
- If the initiation message contains the tag for v4, and Alice's receiving device supports v4:
  - Alice will request from the underlying protocol (XMPP) the list of Bob's devices and the list of her own devices. These lists will only contain devices that support version 4.
  - Alice will check whether Bob is online or offline on each of those devices.
  - Alice will check whether each of her own other devices is online or offline.
  - Depending on the online or offline status, Alice will begin either an online or an offline DAKE with each of them (with each of her own other devices and with each of Bob's devices). This means that each device will have its own key material.
  - The application will send messages to each specific device according to its unique instance tag.
  - Bob will receive all messages on all of his devices, and Alice will receive all messages on all of hers. Bob will answer using the same mechanism.
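The per-device fan-out described above can be sketched as follows. The `Device` type, the `plan_dakes` function and the way presence is represented are hypothetical; in practice the device lists and presence would come from XMPP.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    owner: str
    instance_tag: int
    supports_v4: bool
    online: bool


def plan_dakes(
    own_other_devices: list[Device], peer_devices: list[Device]
) -> dict[tuple[str, int], str]:
    """Map each v4-capable device to the kind of DAKE to run with it."""
    plan: dict[tuple[str, int], str] = {}
    for device in own_other_devices + peer_devices:
        if not device.supports_v4:
            continue  # the device lists only contain version 4 devices
        key = (device.owner, device.instance_tag)
        plan[key] = "online" if device.online else "offline"
    return plan


# Hypothetical devices: Alice's laptop plus Bob's phone and desktop.
own = [Device("alice", 0x101, True, False)]
peer = [Device("bob", 0x200, True, True), Device("bob", 0x201, False, True)]
print(plan_dakes(own, peer))
```

Keying the plan by owner and instance tag keeps two users' devices apart even if their tags happen to collide.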
The following procedure works for online conversations that are started with an identity message directly:
Bob (using a mobile device), who wants to communicate with Alice, will proceed as follows:
- Bob will request from the underlying protocol (XMPP) the list of Alice's devices and the list of his own devices. These lists will only contain devices that support version 4.
- Bob will check whether Alice is online or offline on each of those devices.
- Bob will check whether each of his own other devices is online or offline.
- Depending on the online or offline status, Bob will begin either an online or an offline DAKE with each of them (with each of Alice's devices and with each of his own other devices). This means that each device will have its own key material.
- The application will send messages to each specific device according to its unique instance tag.
- Alice will receive all messages on all of her devices, and Bob will receive all messages on all of his. Alice will answer using the same mechanism.
The following procedure works for offline conversations:
Bob (using a mobile device), who wants to communicate with Alice, will proceed as follows:
- Bob will request from the underlying protocol (XMPP) the list of Alice's devices and the list of his own devices. These lists will only contain devices that support version 4.
- Bob will check whether Alice is online or offline on each of those devices.
- Bob will check whether each of his own other devices is online or offline.
- Depending on the online or offline status, Bob will begin either an online or an offline DAKE with each of them (with each of Alice's devices and with each of his own other devices). This means that each device will have its own key material.
- The application will send messages to each specific device according to its unique instance tag.
- Alice will receive all messages on all of her devices, and Bob will receive all messages on all of his. Alice will answer using the same mechanism.
To note
- If a device advertises that it only uses a certain mode (for example, the 'OTRv4-interactive-only' mode), then any received message will be handled according to that mode. For example, if a device in 'OTRv4-interactive-only' mode receives an offline message, the message should be discarded.
- Once a TLV type 1 (Disconnected) is sent, the session is destroyed, together with the conversations with every device.
- Instance tags are generated prior to the start of any conversation.
- For verification of long-term key material, Trust-on-first-use (TOFU) can be used.
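The first two notes can be sketched as follows, reading the second one as "sending TLV type 1 ends the sessions with every device". The `Conversation` class and its fields are hypothetical, and session state is reduced to an opaque string.

```python
from __future__ import annotations

INTERACTIVE_ONLY = "OTRv4-interactive-only"


class Conversation:
    """Sessions with every device of one contact, keyed by instance tag."""

    def __init__(self, mode: str) -> None:
        self.mode = mode
        self.sessions: dict[int, str] = {}  # instance tag -> opaque session state

    def accepts(self, is_offline_message: bool) -> bool:
        """An interactive-only device discards offline messages."""
        return not (self.mode == INTERACTIVE_ONLY and is_offline_message)

    def disconnect(self) -> list[int]:
        """Return the tags that must receive TLV type 1, and drop all sessions."""
        tags = sorted(self.sessions)
        self.sessions.clear()
        return tags
```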
For XMPP, OTRv4 will need:
- A dedicated Prekey Server where key material to start an offline conversation will be stored. This server will also store Client and Prekey profiles.
- The XEP-0163: Personal Eventing Protocol for discovering the devices of the other party and their own.
- The XEP-0060: Publish-Subscribe for announcing the devices one supports. This list should not contain more than 8 entries.
- Message carbons to be disallowed.
- Race conditions
- Messages arriving at Alice's devices earlier than at Bob's own devices
- Malicious devices
- Linked devices
- Adding/removing devices
- Collisions of instance tags
- Handling the TLV type 1
- Changing from main sending device
- Private Group Messaging
- OMEMO: cryptographic analysis report
- XEP-0384: OMEMO Encryption
- Wire github issue
- Key verification to secure your conversations
- Wire Security Whitepaper
- Signal multidevice
- The Sesame Algorithm: Session Management for Asynchronous Message Encryption
- Attack of the Week: Group Messaging in WhatsApp and Signal
OTR in its version 4 needs an untrusted prekey server to publish key material needed for offline conversations.
We will look at how other protocols handle this.
OMEMO uses Publish-Subscribe (XEP-0060) to publish key material. This is not usable for OTRv4 as we need specific functionality from the server.
It also suggests that a dedicated server could be used in the future: "While in the future a dedicated key server component could be used to distribute key material for session creation"
Wire uses a dedicated centralized server. It can be self-hosted, and federation is on Wire's long-term road map. The server source code is public and is worth consulting when implementing.
One major issue with their approach is that "the Wire client authenticates with a central server in order to provide user presence information (Wire does not attempt to hide metadata, other than the central server promising not to log very much information)."
Signal uses a dedicated centralized server. It can be self-hosted, but it does not seem likely to become federated. The servers are also used for contact discovery. The server source code is public.
We will use an untrusted, decentralized, dedicated server, because we need certain functionality:
- Communication with the server is encrypted with the same OTRv4 mechanism
- Servers have fingerprints used for verification
We will follow what has been defined here.
- XEP-0060: Publish-Subscribe
- XEP-0163: Personal Eventing Protocol
- Open sourcing Wire server code
- Wire
- Secure Messaging App Wire Stores Everyone You've Ever Contacted in Plain Text
- Reflections: The ecosystem is moving
OTR in its version 4 uses long-term key material that should be handled with care.
Importing or exporting long-term key material is discouraged, as each device will have its own. The only case in which it may be allowed is when a device is about to be destroyed and the user needs to set up their account on a new device. Even then, generating new key material for the new device is encouraged.
In order to use OTR in its version 4, we need a way to discover that a contact is online or offline.
Users of OTRv4 will need a subscription to the contact's presence information. The best subscription state for this might be 'Both'.
When a user is online, an online DAKE will start. When a user is offline, unavailable or invisible, an offline DAKE will start.
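A minimal sketch of deriving the DAKE type from a presence stanza. In XMPP (RFC 6121), a presence stanza without a `type` attribute signals availability and `type='unavailable'` signals the opposite; an invisible contact is assumed to look unavailable to the observer. The function name `dake_for_presence` and the example addresses are hypothetical.

```python
import xml.etree.ElementTree as ET


def dake_for_presence(stanza_xml: str) -> str:
    """Return "online" or "offline" depending on a presence stanza."""
    presence = ET.fromstring(stanza_xml)
    presence_type = presence.get("type")
    if presence_type is None:
        return "online"  # no type attribute: the contact is available
    if presence_type == "unavailable":
        return "offline"
    # subscribe, probe, error, ... carry no availability information
    raise ValueError(f"not an availability stanza: type={presence_type!r}")


print(dake_for_presence("<presence from='bob@example.org/phone'/>"))  # online
```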
- XEP-0186: Invisible Command
- Mapping the Extensible Messaging and Presence Protocol (XMPP) to Common Presence and Instant Messaging
- Extensible Messaging and Presence Protocol (XMPP): Instant Messaging and Presence
Still needs to be defined.
This will not be covered by OTR version 4.
- Processing Hints
- Explicit Message Encryption