Internet-Draft | MIMI Identity | July 2023 |
Mahy | Expires 11 January 2024 | |
This document discusses concepts in instant messaging identity interoperability when using end-to-end encryption, for example with the MLS (Messaging Layer Security) Protocol. The goal is to explore the problem space in preparation for framework and requirements documents.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 11 January 2024.¶
Copyright (c) 2023 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
The IETF began standardization work on interoperable Instant Messaging in the late 1990s, but since that period, the typical feature set of these systems has expanded widely and was largely driven by the industry without much standardization or interoperability. The MIMI (More Instant Messaging Interop) problem outline [I-D.mahy-mimi-problem-outline] identifies areas where more work is needed to build interoperable IM systems.¶
The largest and most widely deployed Instant Messaging (IM) systems support end-to-end message encryption using a variant of the Double Ratchet protocol [DoubleRatchet] popularized by Signal and the companion X3DH [X3DH] key agreement protocol. Many vendors are also keen to support the Messaging Layer Security (MLS) protocol [I-D.ietf-mls-protocol] and architecture [I-D.ietf-mls-architecture]. These protocols provide confidentiality of sessions (with Double Ratchet) and groups (with MLS) once the participants in a conversation have been identified. However, most systems currently require the end user to manually verify key fingerprints or blindly trust their instant messaging service not to add and remove participants from their conversations. This problem is exacerbated when these systems federate or try to interoperate.¶
While some single vendor solutions exist, clearly an interoperable mechanism for IM identity is needed. This document builds on the roles described in [I-D.barnes-mimi-identity-arch]. First this document attempts to articulate a clear description and semantics of different identifiers used in IM systems. Next the document provides an example of how to represent those identifiers in a common way. Then the document discusses different trust approaches. Finally the document surveys various cryptographic methods of making and verifying assertions about these identifiers.¶
Arguably, as with email, the success of XMPP [RFC6120] was partially due to the ease of communicating among XMPP users in different domains with different XMPP servers, and a single standardized address format for all XMPP users.¶
The goal of this document is to explore the problem space, so that the IETF community can write a consensus requirements document and framework.¶
IM systems have a number of types of identifiers. Few (or perhaps no) systems use every type of identifier described here. Not every configuration of the same application necessarily uses the same list of identifiers.¶
example.com or im.example.com. Many proprietary IM systems operate in a single domain and have no concept of domains or federation.¶
@alice_smith could become @alice_jones or @alex_smith after change of marital status or gender transition.¶
Protocol | Identifier Address | Example |
---|---|---|
Jabber/XMPP | Bare JID | [email protected] |
SIP | Address of Record (AOR) | sip:[email protected] |
IRC | nick | @juliet |
Generic example | "unscoped handle" | @juliet |
Generic example | "scoped handle" | @[email protected] |
Email style | Mailbox address | [email protected] |
[email protected] for the "nick" @juliet).¶
Protocol | Identifier Address | Example |
---|---|---|
Jabber/XMPP | Fully-qualified JID | juliet/[email protected] |
SIP | Contact Address | sip:juliet@[2001:db8::225:96ff:fe12:3456] |
Wire | Qualified client ID | 0fd3e0dc-a2ff-4965-8873-509f0af0a75c/[email protected] |
group_id. The Wire protocol uses the term qualified conversation ID to refer to a group internally across domains. Among implementations of the Double Ratchet family of protocols a unidirectional sequence of messages from one client to another is referred to as a session, and often has an associated session identifier.¶
One user often has multiple clients (for example a mobile and a desktop client). A handle usually refers to a single user; rarely, it may redirect to multiple users. In some systems, the user identifier is a handle. In other systems the user identifier is an internal representation, for example a UUID. Handles may be changed or renamed, but internal user identifiers ideally do not change. Likewise, group conversation identifiers could be internal or external representations, whereas group names or channel names are often external friendly representations.¶
It is easy to imagine a loose hierarchy between these identifiers (domain to user to device), but hard to agree on a specific fixed structure. In some systems, the group chat or session itself has a position in the hierarchy underneath the domain, the user, or the device.¶
As described in the next section, the author proposes using URIs as a container for interoperable IM identifiers. All the examples use the im: URI scheme (defined in [RFC3862]), but any instant messaging scheme should be acceptable as long as the comparison and validation rules are clear.¶
Most if not all of the identifiers described in the previous section could be represented as URIs. While individual instant messaging protocol-specific URI schemes may not have been specified with this use of URIs in mind, the im: URI scheme should be flexible enough to represent all of or any needed subset of the previously discussed identifiers.¶
For example, the XMPP protocol can represent a domain, a handle (bare JID), or a device (fully qualified JID). Unfortunately its xmpp: URI scheme was only designed to represent handles and domains, but the im: URI scheme can represent all XMPP identifiers:¶
Likewise the IRC protocol can represent domain, handle (nick), user (account), and channel. The examples below represent a domain, a nick, a user, a local channel, and three ways to specify the projectX channel.¶
Imagine a hypothetical WXYZ IM protocol with support for all our identifiers. These could be represented unambiguously using the conventions below, or with an explicit parameter (ex: ;id-type=):¶
id type | unscoped form | domain scoped form |
---|---|---|
domain | - | example.com |
handle | @alice | @[email protected] |
user | BFuVxW5BfJc8R7Qw | [email protected] |
device | BFuVxW5BfJc8R7Qw/072b | BFuVxW5BfJc8R7Qw/[email protected] |
channel | #projectX | #[email protected] |
team | ##engineering | ##[email protected] |
channel | ##engineering/projX | ##engineering/[email protected] |
group id | $TII9t5viBrXiXc | [email protected] |
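The conventions in the table above can be read as a small set of prefix rules. The following is a minimal sketch of such a reading, assuming that the leading sigil (and the presence of a "/") is enough to tell the unscoped forms apart; the function name classify_wxyz_id is hypothetical, and WXYZ itself is the imaginary protocol from the table, not a real one.¶

```python
def classify_wxyz_id(identifier: str) -> str:
    """Guess the id type of an unscoped WXYZ identifier from its sigil.

    Assumption: the sigils shown in the table ("@", "#", "##", "$") and
    the "/" separator are sufficient to disambiguate the unscoped forms.
    """
    if identifier.startswith("##"):
        # "##team" names a team; "##team/channel" names a channel inside it.
        return "channel" if "/" in identifier else "team"
    if identifier.startswith("#"):
        return "channel"
    if identifier.startswith("@"):
        return "handle"
    if identifier.startswith("$"):
        return "group id"
    if "/" in identifier:
        # "<user>/<device>" with no sigil is a device identifier.
        return "device"
    return "user"


if __name__ == "__main__":
    for example in ["@alice", "BFuVxW5BfJc8R7Qw", "BFuVxW5BfJc8R7Qw/072b",
                    "#projectX", "##engineering", "##engineering/projX",
                    "$TII9t5viBrXiXc"]:
        print(example, "->", classify_wxyz_id(example))
```

An explicit parameter such as ;id-type= would make this kind of guessing unnecessary, at the cost of longer identifiers.¶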
Now imagine that WXYZ reserved the wxyz: URI scheme. The example below shows how almost any reasonable protocol-specific identifier scheme can be represented as an im: URI.¶
Note that if there is no domain, an im: URI, or another scheme, could use local.invalid in place of a resolvable domain name.¶
im:wxyz=%[email protected]¶
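One possible way to produce such a URI mechanically is sketched below. It assumes a convention of placing the protocol name, an "=" sign, and the percent-encoded protocol-specific identifier in the local part of the im: URI, and falling back to local.invalid when no domain is known. Both the helper name wrap_as_im_uri and this exact layout are illustrative assumptions, not rules taken from this document or from the im: URI specification.¶

```python
from typing import Optional
from urllib.parse import quote


def wrap_as_im_uri(protocol: str, identifier: str,
                   domain: Optional[str] = None) -> str:
    """Wrap a protocol-specific identifier in an im: URI (hypothetical layout).

    Every reserved character in the identifier is percent-encoded so that
    sigils such as "@", "#", or "/" cannot be confused with URI syntax.
    """
    host = domain if domain else "local.invalid"  # placeholder when no domain exists
    local_part = f"{protocol}={quote(identifier, safe='')}"
    return f"im:{local_part}@{host}"


if __name__ == "__main__":
    print(wrap_as_im_uri("wxyz", "@alice", "example.com"))
    print(wrap_as_im_uri("wxyz", "#projectX"))
```

Whatever layout is chosen, the comparison and validation rules for the resulting URIs still need to be specified for the identifiers to be interoperable.¶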
Different IM applications and different users of these applications may have different trust needs. The following subsections describe three specific trust models for example purposes. Note that the descriptions in this section use certificates in their examples, but nothing in this section should preclude using a different technology which provides similar assertions.¶
In this environment, end-user devices trust a centralized authority operating on behalf of their domain (for example, a Certificate Authority), that is trusted by all the other clients in that domain (and can be trusted by federated domains). The centralized authority could easily be associated with a traditional Identity Provider (IdP). This is a popular trust model for companies running services for their own employees and contractors. This is also popular with governments providing services to their employees and contractors or to residents or citizens for whom they provide services.¶
For example XYZ Corporation could make an assertion that "I represent XYZ Corporation and this user demonstrated she is Alice Smith of the Engineering department of XYZ Corporation."¶
In this model, a Certificate Authority (CA) run by or on behalf of the domain generates certificates for one or more of the identifier types described previously. The specifics of the assertions are very important for interoperability. Even within this centralized credential hierarchy model, there are at least three ways to make assertions about different types of IM identifiers with certificates:¶
What is important in all these examples is that other clients involved in a session or group chat can validate the relevant credentials of the other participants in the session or group chat. Clients would need to be able to configure the relevant trust roots and walk any hierarchy unambiguously.¶
When using certificates, this could include associating an Issuer URI in the issuerAltName with one of the URIs in the subjectAltName of another cert. Other mechanisms have analogous concepts.¶
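The idea of walking a hierarchy by matching an Issuer URI against the Subject URIs of another credential can be sketched as follows. This is only a structural illustration under stated assumptions: the Credential class, the walk_to_root function, and the example URIs are hypothetical, and the sketch performs no signature, expiry, or revocation checks, all of which a real validator would need.¶

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Optional, Set


@dataclass(frozen=True)
class Credential:
    subject_uris: FrozenSet[str]   # URIs from the subjectAltName
    issuer_uri: Optional[str]      # URI from the issuerAltName, if any


def walk_to_root(leaf: Credential, known: Iterable[Credential],
                 trust_roots: Set[str], max_depth: int = 8
                 ) -> Optional[List[Credential]]:
    """Return the chain from leaf up to a configured trust root, or None."""
    by_subject = {uri: cred for cred in known for uri in cred.subject_uris}
    chain = [leaf]
    current = leaf
    for _ in range(max_depth):
        if current.subject_uris & trust_roots:
            return chain                      # reached a configured trust root
        issuer = by_subject.get(current.issuer_uri) if current.issuer_uri else None
        if issuer is None or issuer in chain:
            return None                       # unknown issuer or a loop
        chain.append(issuer)
        current = issuer
    return None


if __name__ == "__main__":
    domain = Credential(frozenset({"im:example.com"}), None)
    user = Credential(frozenset({"im:alice@example.com"}), "im:example.com")
    device = Credential(frozenset({"im:alice/device1@example.com"}),
                        "im:alice@example.com")
    chain = walk_to_root(device, [domain, user, device], {"im:example.com"})
    print([sorted(c.subject_uris) for c in chain])
```

The walk is unambiguous only if each URI appears as a Subject in exactly one credential, which is one reason the specifics of the assertions matter for interoperability.¶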
Regardless of the specific implementation, this model features a strong hierarchy.¶
The advantage of this approach is that it leverages a strong hierarchy which is already in use at an organization, especially if the organization is using an Identity Provider (IdP) for most of its services. Even if the IM system is compromised, the presence of a client without the correct end-to-end identity would be detected immediately.¶
The disadvantage of this approach is that if the CA colludes with a malicious IM system or both are compromised, an attacker or malicious IM system can easily insert a rogue client which would be as trusted as a legitimate client.¶
In some communities, it may be appropriate to make assertions about IM identity by relying on a web of trust. The following specific example of this general method is used by the OMEMO community, as presented by [Schaub], and proposed in [Matrix1756]. This document does not take any position on the specifics of the proposal, but uses it to illustrate a concrete implementation of a web of trust involving IM identifiers.¶
The example uses a web of trust with cross signing as follows:¶
The advantage of this approach is that if Alice's and Bob's keys, implementations, and devices are not compromised, there is no way the infrastructure can forge a key for Alice or Bob and insert an eavesdropper or active attacker. The disadvantages of this approach are that it requires Alice's device-signing key to be available any time Alice wants to add a new device, and Alice's user-signing key to be available any time she wants to add a new user to her web of trust. This could make those operations inconvenient, unnecessarily expose either or both of those keys, or both.¶
A detailed architecture for Web of Trust key infrastructure which is not specific to Instant Messaging systems is the Mathematical Mesh [I-D.hallambaker-mesh-architecture].¶
In this trust model, a user with several services places a cross signature for all their services at a well known location on each of those services (for example a personal web site .well-known page, an IM profile, the profile page on an open source code repository, a social media About page, a picture sharing service profile page, a professional interpersonal-networking site contact page, and a dating application profile). This concept was perhaps first implemented for non-technical users by Keybase. The user of this scheme likely expects that at any given moment there is a risk that one of these services is compromised or controlled by a malicious entity, but expects the likelihood of all or most of their services being compromised simultaneously is very low.¶
The advantage of this approach is that it does not rely on anyone but the user herself. The disadvantage is that if an attacker is able to delete or forge cross signatures on a substantial number of the services, the forged assertions would look as legitimate as the authentic assertions (or more convincing).¶
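A verifier following this model might accept a key only when enough of the user's services agree on it. The sketch below assumes a simple strict-majority rule over key fingerprints already fetched from each service; the function name quorum_fingerprint, the threshold value, and the service names are hypothetical, and the verification of each cross signature is out of scope here.¶

```python
from collections import Counter
from typing import Dict, Optional


def quorum_fingerprint(published: Dict[str, Optional[str]],
                       threshold: float = 0.5) -> Optional[str]:
    """Return the fingerprint published by more than `threshold` of all
    services, or None if no fingerprint reaches that share.

    `published` maps a service name to the fingerprint found there, or
    None when the cross signature is missing or failed verification.
    """
    found = [fp for fp in published.values() if fp is not None]
    if not found:
        return None
    fingerprint, count = Counter(found).most_common(1)[0]
    return fingerprint if count > threshold * len(published) else None


if __name__ == "__main__":
    observed = {
        "website": "fp-A",
        "im-profile": "fp-A",
        "code-repo": "fp-A",
        "social": "fp-B",   # one compromised or stale service
    }
    print(quorum_fingerprint(observed))
```

The same rule also illustrates the stated disadvantage: an attacker who controls a majority of the services determines the accepted fingerprint.¶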
These different trust approaches could be combined; however, the verification rules become more complicated. Among other problems, implementers need to decide what happens if two different trust methods come to incompatible conclusions. For example, what should the application do if web of trust certificates indicate that a client or user should be trusted, but a centralized hierarchy indicates a client should not be, or vice versa?¶
X.509 certificates are a mature technology for making assertions about identifiers. The supported assertions and identifier formats used in certificates are somewhat archaic, inflexible, and pedantic, but well understood. The semantics are always that an Issuer asserts that a Subject has control of a specific public key pair. A handful of additional attributes can be added as X.509 certificate extensions, although adding new extensions is laborious and time-consuming. In practice new extensions are only added to facilitate the internals of managing the lifetime, validity, and applicability of certificates. X.509 extensions are not appropriate for arbitrary assertions or claims about the Subject.¶
The Subject field contains a Distinguished Name, whose Common Name (CN) field can contain free form text. The subjectAltName can contain multiple other identifiers for the Subject with types such as a URI, email address, DNS domain name, or Distinguished Name. The rules about which combinations of extensions are valid are defined in the Internet certificate profile described in [RFC5280]. As noted in a previous section of this document, URIs are a natural container for holding instant messaging identifiers. Implementations need to be careful to ensure that the correct semantics are applied to a URI, as they may be referring to different objects (ex: a handle versus a client identifier). There is a corresponding issuerAltName field as well.¶
Certificates are already supported in MLS as a standard credential type which can be included in MLS LeafNodes and KeyPackages. [In the X3DH key agreement protocol (used with Double Ratchet), the first message in a session between a pair of clients can contain an optional certificate, but this is not standardized.] Arguably the biggest drawback to using X.509 certificates is that administratively it can be difficult to obtain certificates for entities that can also generate certificates, specifically to issue a certificate with the standard extension basicConstraints=CA:TRUE.¶
If implementing cascading certificates, the Issuer might be expressed as a URI in the issuerAltName extension.¶
JSON Web Signature (JWS) [RFC7515] and JSON Web Tokens (JWT) [RFC7519] are toolkits for making a variety of cryptographic claims. (CBOR Web Tokens [RFC8392] are semantically equivalent to JSON Web Tokens.) JWT is an appealing option for carrying IM identifiers and assertions, as the container type is flexible and the format is easy to implement. Unfortunately the semantics for validating identifiers are not as rigorously specified as for certificates at the time of this writing, and require additional specification work.¶
The JWT Demonstrating Proof of Possession (DPoP) specification [I-D.ietf-oauth-dpop] adds the ability to make claims which involve proof of possession of a (typically private) key, and to share those claims with third parties. The owner of the key generates a proof which is used to fetch an access token which can then be verified by a third party. JWT DPoP was actually created as an improvement over Bearer tokens used for authentication, so its use as a certificate-like assertion may require substantial clarification and possibly additional profile work.¶
While there is support for token introspection, in general access tokens need online verification between resources and the token issuer.¶
While JWTs can include a list of arbitrary claims, there is no native support for multiple subjects in the same JWT. There is a proposal to address this limitation with nested JWTs [I-D.yusef-oauth-nested-jwt].¶
Verifiable Credentials (VC) is a framework for exchanging machine-readable credentials [W3C.REC-vc-data-model-20191119]. The framework is well specified and has a very flexible assertion structure, which in addition to or in place of basic names and identifiers, can optionally include arbitrary attributes (ex: security clearance, age, nationality) up to and including selective disclosure depending on the profile being used. For example, a verifiable credential could be used to assert that an IM client belongs to a Customer Support agent of Sirius Cybernetic Corp, who speaks English and Vogon, and is qualified to give support for their Ident-I-Eeze product, without revealing the name of the agent.¶
The VC specification describes both Verifiable Credentials and Verifiable Presentations. A Verifiable Credential contains assertions made by an issuer. Holders assemble credentials into a Verifiable Presentation. Verifiers can validate the Verifiable Credentials in the Verifiable Presentation. Specific credential types are defined by referencing ontologies. The example at the end of this section uses the VCard ontology [W3C.WD-vcard-rdf-20130924].¶
Most of the examples for Verifiable Credentials and many of the implementations by commercial identity providers use Decentralized Identifiers (DIDs), but there is no requirement to use DIDs or the associated esoteric cryptography in a specific VC profile. (Indeed the VC profile for COVID-19 vaccination does not use DIDs.) The most significant problem with VCs is that there is no off-the-shelf mechanism for proof of possession of a private key, and no consensus to use VCs for user authentication (as opposed to using VCs to assert identity attributes).¶
While the examples in this document are represented as JSON, including whitespace, the actual JSON encoding used for VC has no whitespace.¶
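The difference between the readable form and the whitespace-free form can be illustrated with a short sketch. The claims object below is a hypothetical placeholder, not one of this document's examples; only the two serialization calls matter.¶

```python
import json

# Hypothetical claims fragment, used only to compare the two encodings.
claims = {
    "vc": {
        "type": ["VerifiableCredential"],
        "credentialSubject": {"name": "Alice"},
    }
}

# Readable form, as a document might display it.
pretty = json.dumps(claims, indent=2)

# Whitespace-free form: no spaces after "," or ":" and no newlines.
compact = json.dumps(claims, separators=(",", ":"))

if __name__ == "__main__":
    print(pretty)
    print(compact)
```

Both strings decode to the same object, but any signature is computed over the exact bytes that were encoded, so the two forms are not interchangeable once signed.¶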
The first example shows a fragment of the claims in a JWT-based VC proof, referencing the VCard ontology.¶
In the next example, there is a Verifiable Presentation (VP) JOSE header and claims which contain two embedded VCs for the same holder. The JOSE header contains an actual Ed25519 public key. The corresponding key id could be expressed using the kid type with a urn:ietf:params:oauth:jwk-thumbprint:sha-256: prefix; the actual fingerprint value would be mJafqNxZWNAIkaDGPlNyhccFSAqnRjhyA3FJNm0f8I8.¶
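A key id of this form can be derived from a public key as sketched below, following the JWK Thumbprint procedure of RFC 7638 (hash the required JWK members in lexicographic order with no whitespace) and the base64url encoding without padding. The function name jwk_thumbprint_kid is hypothetical, and the key shown is the Ed25519 test key from RFC 8037, used here as a stand-in; it is not the key from this document's example, so it does not produce the fingerprint quoted above.¶

```python
import base64
import hashlib
import json


def jwk_thumbprint_kid(jwk: dict) -> str:
    """Compute a SHA-256 JWK Thumbprint URN for an OKP (e.g. Ed25519) key."""
    # For OKP keys the required members are "crv", "kty", and "x".
    required = {name: jwk[name] for name in ("crv", "kty", "x")}
    # Canonical form: members sorted by name, no whitespace.
    canonical = json.dumps(required, separators=(",", ":"), sort_keys=True)
    digest = hashlib.sha256(canonical.encode("utf-8")).digest()
    thumbprint = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return "urn:ietf:params:oauth:jwk-thumbprint:sha-256:" + thumbprint


if __name__ == "__main__":
    # Stand-in key: the Ed25519 public key from the RFC 8037 examples.
    example_jwk = {
        "kty": "OKP",
        "crv": "Ed25519",
        "x": "11qYAYKxCrfVS_7TyWQHOg7hcvPapiMlrwIaaPcHURo",
    }
    print(jwk_thumbprint_kid(example_jwk))
```

Optional members such as kid or use are ignored by the thumbprint, so two JWKs that describe the same public key always yield the same identifier.¶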
The first VC contains a full name and a handle-style identifier. It is created by one issuer (for example an identity provider), and uses standard claims from OpenID Connect Core. The second VC contains a client or device identifier and is created by a different issuer (the IM service).¶
Note that in the text version of this document, the jws values and verificationMethod URLs are truncated.¶
Below are other mechanisms which were not investigated due to a lack of time.¶
This document requires no action by IANA.¶
TBC. (The threat model for interoperable IM systems depends on many subtle details).¶
The author wishes to thank Richard Barnes, Tom Leavy, Joel Alwen, Marta Mularczyk, Pieter Kasselman, and Rifaat Shekh-Yusef for discussions about this topic.¶