Jingle Encrypted Transports (JET) strives to provide a modular and easily extensible way to wrap Jingle Transports in an additional end-to-end encryption layer. The focus of this specification lays on being modular. It should be possible to extend existing Jingle use scenarios with end-to-end encryption by simply adding a JET element to the negotiation.
JET uses multiple encryption layers, so it is necessary to declare a distinct denomination for the different keys involved.
|Transport Key||TK||(Symmetric) key that is used to encrypt/decrypt the bytestreams sent/received through Jingle transports. This key encrypts the data two entities want to exchange. Examples for TK can be found under "Ciphers".|
|Initialization Vector||IV||Initialization vector that is used together with TK.|
|Transport Secret||TS||Serialization of TK and TI.|
|Envelope Element||EE||Output element of an established end-to-end encryption method when encrypting TS.|
Lets assume Romeo wants to initiate an encrypted Jingle session with Juliet. Prior to the Jingle session initiation, an already existing, established and (ideally) authenticated end-to-end encryption session between Romeo and Juliet MUST exist. This session is needed to transfer the Transport Secret from Romeo to Juliet.
When this precondition is met, Romeo initially generates a transport key (TK) and associated initialization vector (IV). These will later be used by the sender to encrypt, and respectively by the recipient to decrypt data that is exchanged. This protocol defines a set of usable ciphers from which Romeo might choose. TK and IV are serialized to create the transport secret (TS).
Next Romeo uses her established encryption session with Juliet to encrypt TS. The resulting envelope element (EE) will be part of the Jingle session initiation as child of the JET <secret/> element.
When Juliet receives Romeos session request, she decrypts EE to retrieve TS, from which she can deserialize TK and IV. Now she and Romeo can go on with the session negotiation. Once the session is established, data can be encrypted and exchanged. Both parties MUST keep a copy of TS in cache until the Jingle session is ended.
Jingle File Transfer (XEP-0234)  has the disadvantage, that transmitted files are not encrypted (aside from regular TLS transport encryption), which means that intermediate nodes like XMPP/proxy server(s) have access to the transferred data. Considering that end-to-end encryption becomes more and more important to protect free speech and personal expression, this is a major flaw that needs to be addressed.
In order to initiate an encrypted file transfer, the initiator includes a JET <secret/> in the Jingle file transfer request.
In this scenario Romeo wants to send an encrypted text file over to Juliet. First, he generates a fresh AES-256 transport key and IV. TK and IV are serialized into TS which is then encrypted using Romeos end-to-end-encryption session with Juliet.
The resulting envelope element (EE) is sent as part of the security element along with the rest of the jingle stanza over to Juliet.
Juliet decrypts the envelope element (EE) using her session with Romeo to retrieve TS from which she deserializes TK and IV. Both Juliet and Romeo then carry on with the session negotiation as described in Jingle File Transfer (XEP-0234) . Before Romeo starts transmitting the file, he encrypts it using TK and IV. He then transmitts the encrypted file over to Juliet.
When Juliet received the file, she uses the TK and IV to decrypt the received file.
Juliet might want to request a file transfer from Romeo. This can be the case, when Romeo hosts the file. In order to do so, she sends generates TK and IV, creates TS from those and encrypts TS with an encryption method of her choice to get EE. TK and IV will be used by Romeo to encrypt the requested file before sending it to Juliet.
Jingle File Transfer (XEP-0234)  defines a way for parties to request ranged transfers. This can be used to resume interrupted transfers etc. In case of an interrupted transfer, the receiving party might be able to decrypt parts of the received file. When requesting a resumption of the transfer, the recipient therefore can use the index of the last successfully decrypted byte of the file as offset in the ranged transfer. Since a resumed transfer takes place in a new session, the old transport secret might no longer be available to either party. For that reason the receiver creates a new TS for the session-initiation. The sending party then encrypts and sends only the requested parts of the file.
In order to encrypt the transported bytestream, the initiator must transmit a cipher key to the responder. There are multiple options available:
The column 'serialization' describes, how the key and iv are serialized. "::" means plain concatenation of byte arrays.
To advertise its support for the Jingle Encrypted Transports, when replying to service discovery information ("disco#info") requests an entity MUST return URNs for any version, or extension of this protocol that the entity supports -- e.g., "urn:xmpp:jingle:jet:0" for this version, or "urn:xmpp:jingle:jet-stub:0" for a stub encryption method (see Namespace Versioning regarding the possibility of incrementing the version number).
In order for an application to determine whether an entity supports this protocol, where possible it SHOULD use the dynamic, presence-based profile of service discovery defined in Entity Capabilities (XEP-0115) . However, if an application has not received entity capabilities information from an entity, it SHOULD use explicit service discovery instead.
The initiator SHOULD NOT use the generated key TK as IV, but instead generate a seperate random IV.
Instead of falling back to unencrypted transfer in case something goes wrong, implementations MUST instead abort the Jingle session, informing the user.
IMPORTANT: This approach does not deal with metadata. In case of Jingle File Transfer (XEP-0234) , an attacker with access to the sent stanzas can for example still see the name of the file and other information included in the <file/> element.
The responder MUST check, whether the envelope element belongs to the initiator to prevent MitM attacks
This is only a rough draft and there is still a ton of questions left to be answered. Here is a small non-exhaustive list of things I can think of:
This document in other formats: XML PDF
This XMPP Extension Protocol is copyright © 1999 – 2019 by the XMPP Standards Foundation (XSF).
Permission is hereby granted, free of charge, to any person obtaining a copy of this specification (the "Specification"), to make use of the Specification without restriction, including without limitation the rights to implement the Specification in a software program, deploy the Specification in a network service, and copy, modify, merge, publish, translate, distribute, sublicense, or sell copies of the Specification, and to permit persons to whom the Specification is furnished to do so, subject to the condition that the foregoing copyright notice and this permission notice shall be included in all copies or substantial portions of the Specification. Unless separate permission is granted, modified works that are redistributed shall not contain misleading information regarding the authors, title, number, or publisher of the Specification, and shall not claim endorsement of the modified works by the authors, any organization or project to which the authors belong, or the XMPP Standards Foundation.
## NOTE WELL: This Specification is provided on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. ##
In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, shall the XMPP Standards Foundation or any author of this Specification be liable for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising from, out of, or in connection with the Specification or the implementation, deployment, or other use of the Specification (including but not limited to damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses), even if the XMPP Standards Foundation or such author has been advised of the possibility of such damages.
This XMPP Extension Protocol has been contributed in full conformance with the XSF's Intellectual Property Rights Policy (a copy of which can be found at <https://xmpp.org/about/xsf/ipr-policy> or obtained by writing to XMPP Standards Foundation, P.O. Box 787, Parker, CO 80134 USA).
The Extensible Messaging and Presence Protocol (XMPP) is defined in the XMPP Core (RFC 6120) and XMPP IM (RFC 6121) specifications contributed by the XMPP Standards Foundation to the Internet Standards Process, which is managed by the Internet Engineering Task Force in accordance with RFC 2026. Any protocol defined in this document has been developed outside the Internet Standards Process and is to be understood as an extension to XMPP rather than as an evolution, development, or modification of XMPP itself.
There exists a special venue for discussion related to the technology described in this document: the <email@example.com> mailing list.
The primary venue for discussion of XMPP Extension Protocols is the <firstname.lastname@example.org> discussion list.
Discussion on other xmpp.org discussion lists might also be appropriate; see <http://xmpp.org/about/discuss.shtml> for a complete list.
Errata can be sent to <email@example.com>.
The following requirements keywords as used in this document are to be interpreted as described in RFC 2119: "MUST", "SHALL", "REQUIRED"; "MUST NOT", "SHALL NOT"; "SHOULD", "RECOMMENDED"; "SHOULD NOT", "NOT RECOMMENDED"; "MAY", "OPTIONAL".
Note: Older versions of this specification might be available at http://xmpp.org/extensions/attic/