Internet-Draft | BGP over QUIC Streams | July 2023 |
Retana, et al. | Expires 11 January 2024 | [Page] |
This document specifies the use of QUIC Streams to support multiple BGP sessions over one connection in order to achieve high resiliency.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 11 January 2024.¶
Copyright (c) 2023 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
The Border Gateway Protocol (BGP) [RFC4271] is the routing protocol used to exchange routing and reachability information among autonomous systems. BGP uses TCP as its transport protocol to provide reliable communication. BGP establishes peer relationships between routers using a TCP session on port 179.¶
The Multiprotocol Extensions for BGP-4 (MP-BGP) [RFC4760] allow BGP to carry information for multiple Network Layer protocols. However, only a single TCP connection can reach the Established state between a pair of peers [RFC4271]. As a consequence, an error related to a particular Network Layer protocol may result in the termination of the connection for all.¶
QUIC [RFC9000] is a UDP-based multiplexed and secure transport protocol that provides connection-oriented and stateful interaction between a client and server. It can provide low latency and encrypted transport with resilient connections.¶
In QUIC, application protocols exchange information using streams. Each stream is a separate unidirectional or bidirectional channel consisting of an ordered stream of bytes. Moreover, each stream has its own flow control, which limits bytes sent on a stream, together with flow control of the connection.¶
This document specifies the procedures for BGP to use QUIC as a transport protocol with a mechanism to carry Network Layer protocols (AFI/SAFI) over individual streams. The Network Layer protocols are identified using a combination of Address Family Identifier (AFI) and Subsequent Address Family Identifier (SAFI), as described in [RFC4760]. These per-AFI/SAFI streams (function channels) and the associated control mechanism (control channel) for the session are called "BGP channels". In one BGP over QUIC (BoQ) connection, one control channel and one or more function channels are used to carry routing information.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
Before two BoQ speakers start exchanging routing information, they must establish a BGP session. It is established in two phases:¶
QUIC connections are established as described in [RFC9000]. During connection establishment, a BoQ speaker SHOULD use UDP port TBD1 and MUST select the Application-Layer Protocol Negotiation (ALPN) [RFC7301] token "boq" in the TLS handshake. Support for other application-layer protocols MUST NOT be offered in the same handshake. A connection MUST be closed if the ALPN token is not as indicated or if other application-layer protocols are offered in the same handshake.¶
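
The ALPN rule above can be illustrated with a minimal sketch. It assumes the TLS stack exposes the list of offered protocol tokens as byte strings; the function name `alpn_offer_is_acceptable` is hypothetical and not part of any QUIC or TLS library.

```python
# Illustrative sketch only; the helper name is an assumption of this example.
BOQ_ALPN = b"boq"  # ALPN token named in the text (0x62 0x6f 0x71)


def alpn_offer_is_acceptable(offered_protocols):
    """Return True only if exactly one token, "boq", was offered.

    Offering any other application-layer protocol in the same handshake,
    or omitting "boq", makes the offer unacceptable.
    """
    return list(offered_protocols) == [BOQ_ALPN]
```

A connection whose offer fails this check would be closed rather than negotiated down to another protocol.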
[RFC4271] defines the operations for a single BGP session between two BGP speakers using TCP. This document defines the ability to carry BGP over multiple QUIC streams as "BGP channels".¶
On a BoQ connection, the BoQ speaker first establishes a bidirectional stream for the "BGP control channel". The control channel is used to establish a BGP peer relationship between two BoQ speakers, similar to RFC 4271. OPEN messages are exchanged on the control channel, and if the BoQ session parameters are acceptable, the peering session is established. Similar to RFC 4271, the BoQ session is terminated with a NOTIFICATION message if the parameters are unacceptable.¶
After establishing the control channel, each BoQ speaker may create function channels using unidirectional QUIC streams. These function channels are used to carry BGP routes for a specific AFI/SAFI. Only one function channel per AFI/SAFI can exist from one BoQ speaker to another (see "Channel Collision Avoidance"). Unlike [RFC4271] BGP, there is no requirement for both BoQ speakers to have a symmetric and matching set of function channels.¶
BGP channels largely use the mechanisms of the RFC 4271 Finite State Machine (FSM) for their establishment. For the control channel carried over a bidirectional QUIC stream, the FSM is identical to the RFC 4271 FSM. However, since the function channels are unidirectional, the RFC 4271 FSM procedures cannot be carried out solely using the unidirectional channel from one BoQ speaker to another. Instead, the responding BoQ speaker must carry its replies for the unidirectional streams over the control channel and address them to a specific BGP function channel.¶
After BoQ session establishment, the BoQ speakers will create the control channel. The control channel is a bidirectional QUIC stream with stream ID 0 [RFC9000]. It is created by sending a BGP OPEN message. BGP OPEN messages carry parameters such as the Autonomous System number, BGP Identifier (router-id), Hold Time, and Capabilities. These parameters are used by a BoQ speaker to decide whether a BGP session is permitted to be established.¶
The capabilities carried in this OPEN message for the control channel are the BoQ connection-specific parameters; i.e. those that apply to the entire connection. An example of this is the BGP Role Capability [RFC9234]. If a function-only capability - as categorized in Table 1 - is included in the OPEN message, it MUST be ignored.¶
The control channel uses BGP hold time procedures as specified in [RFC4271]. KEEPALIVE messages are sent periodically in the absence of other messages on the control channel. If no messages are received within the negotiated hold time on the control channel, the BoQ connection is closed with a NOTIFICATION as specified in Section 6.5 of [RFC4271]. In short, the BoQ control channel is used to establish the peering relationship and connection parameters between the two BoQ speakers, ensure connectivity over this session is verified, and further is used as the response channel for the function channels as specified in Section 4.3.¶
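
As a rough illustration of the hold time procedure on the control channel, the sketch below tracks the time since the last received message. The class and function names are hypothetical, and time is passed in explicitly so the example does not depend on a real clock. Taking the smaller of the two proposed values and treating a hold time of zero as "timer disabled" follow [RFC4271].

```python
class HoldTimer:
    """Tracks the time since the last message on one BGP channel (sketch)."""

    def __init__(self, hold_time, now):
        self.hold_time = hold_time   # negotiated hold time, in seconds
        self.last_received = now

    def on_message(self, now):
        # Any message received on the channel restarts the timer.
        self.last_received = now

    def expired(self, now):
        # A negotiated hold time of zero means the timer is not run.
        return self.hold_time > 0 and now - self.last_received >= self.hold_time


def negotiated_hold_time(local, remote):
    # RFC 4271: the smaller of the two proposed values is used.
    return min(local, remote)
```

When `expired()` becomes true for the control channel, the speaker would send the NOTIFICATION and close the BoQ connection as described above.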
It is an error to exchange BGP routing information over the control channel. This functionality is reserved for the Per-AFI/SAFI Function channels. If BGP routes are received on the control channel, the receiving BGP speaker MUST send a BGP NOTIFICATION with a Cease code on the control channel and close the QUIC connection.¶
QUIC supports connection migration. However, only the client side can move. The role of the QUIC endpoints is important. For future extensibility, a new BoQ Capability indicates the configured role of the BoQ speaker: Client, Server, or Any. It is expected that the BGP configuration and QUIC roles match. The QUIC connection can be reset if they don't. See Section 5.1 for details.
Per-AFI/SAFI Function channels are used to exchange routing information. After the control channel reaches the Established state, function channels are created as unidirectional QUIC streams and advertise routes for a single AFI/SAFI using BGP UPDATE messages. Only one function channel per AFI/SAFI can exist from one BoQ speaker to another (see Section 5.3).¶
It is an error to try to establish Per-AFI/SAFI Function channels prior to the control channel transitioning to the Established state. Per-AFI/SAFI Function channels SHOULD NOT be permitted to transition to the Established state prior to the control channel itself entering the Established state.¶
BoQ speakers asymmetrically create their function channels. While it might be the typical case for there to be a symmetric set of per-AFI/SAFI function channels, one for each speaker, this is not a requirement. For example, BGP-LS [I-D.ietf-idr-rfc7752bis] may only require that a BoQ speaker asymmetrically receive BGP-LS NLRI and may not need to send them.¶
A BoQ speaker that needs to advertise routes to its peer opens a unidirectional stream to its neighbor by sending an OPEN message indicating the particular AFI/SAFI to be used. The BoQ connection-wide parameters have previously been exchanged over the control channel. The function channel OPEN messages MUST contain identical BGP Autonomous System number and BGP Identifier as the previously established control channel. It is RECOMMENDED that the BGP Hold Time value exchanged in the function channels be significantly longer than the hold time negotiated for the control channel. It is the responsibility of the hold timer for the control channel to provide connection verification for the BoQ connection. The purpose of the function channel negotiated hold time is to provide verification of the communication between the two BoQ speakers for that AFI/SAFI.¶
The BGP Capabilities carried on the function channel SHOULD only be those that are function-specific, as categorized in Table 1. Conflicting BoQ connection-wide parameters exchanged over the function channel MAY result in the BoQ speaker sending a NOTIFICATION message and not permitting the per-AFI/SAFI function channel to become Established.¶
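
The checks on a function channel OPEN can be sketched as follows. This is a minimal example under stated assumptions: `OpenParams` and `check_function_open` are hypothetical names, and because the text does not quantify "significantly longer", the sketch only warns when the function channel hold time is not longer than the control channel hold time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OpenParams:
    asn: int        # Autonomous System number
    bgp_id: str     # BGP Identifier, e.g. "192.0.2.1"
    hold_time: int  # seconds


def check_function_open(control, function):
    """Compare a function channel OPEN against the control channel OPEN.

    Returns (errors, warnings). Errors correspond to the MUST-level rules
    in the text; warnings correspond to the RECOMMENDED hold time guidance.
    """
    errors, warnings = [], []
    if function.asn != control.asn:
        errors.append("AS number differs from control channel")
    if function.bgp_id != control.bgp_id:
        errors.append("BGP Identifier differs from control channel")
    if function.hold_time <= control.hold_time:
        warnings.append("hold time is not longer than the control channel's")
    return errors, warnings
```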
The receiving BoQ speaker replies to those messages as defined in the [RFC4271] FSM by sending its messages (OPEN/NOTIFICATION/KEEPALIVE) addressed to the sender over the control channel.¶
Once the function channel has reached the Established state, BGP UPDATE messages may be sent to the remote BoQ speaker.¶
A single function channel for an AFI/SAFI pair results in asymmetric route advertisements. Both BoQ speakers can create a function channel to implement symmetric route advertisements.¶
Each function channel is created independently to naturally support multi-channel BGP. The neighbor state machines are decoupled; in case of error, it is possible to reset only one function channel (one direction of a symmetric route exchange) using a BoQ Message Error NOTIFICATION with subcode BoQ Channel Reset (see Section 6). If one function channel is blocked for some reason, other channels can still progress and operate.
A NOTIFICATION is sent over the control channel if the entire BGP connection needs to be reset for any reason, such as a configuration change or a network outage. Existing error messages defined by [RFC4271] and other various extensions SHOULD be used.¶
If the control channel is closed, the QUIC connection MUST be terminated using a CONNECTION_CLOSE frame, and an error notification (see Section 6) should be included to indicate that the connection has been terminated by BGP. If there are other open channels, they are also closed when the connection is closed.¶
A function channel can be reset independently without impacting any other function channels or the control channel. See Section 6 for details.
A single QUIC stream provides ordered and reliable delivery. However, there is no guarantee of transmission and delivery orders across streams. Therefore, if specific data from one channel needs to be received before data from other channels, this requirement must be accomplished through BGP.¶
As defined in [RFC9000], a QUIC implementation SHOULD provide ways in which an application can indicate the relative priority of streams.¶
A BGP implementation utilizing QUIC as its transport protocol MUST support a prioritization mechanism for BGP streams. This is essential for ensuring that critical routing information can be transmitted with higher priority compared to non-routing information.¶
How to implement the supported priorities using QUIC congestion control at the connection level, stream-level flow control, and packetization is out of the scope of this document.
QUIC supports connection migration. However, only the client side can move. For a BoQ speaker to take advantage of the QUIC connection migration capability, it has to be the QUIC client.¶
For an implementation of the BoQ defined in this document, an explicit configuration is needed to identify a BoQ speaker's role: a QUIC client, a QUIC server, or any (Don’t care). The default value can be "any"; other values MUST be explicitly configured.¶
A new "BGP over QUIC" capability is defined below to signal whether the BoQ speaker is a QUIC client, a QUIC server, or any (Don’t care).
BoQ capability:
   Code: TBD2 (to be assigned by IANA)
   Length: 1 (octet)
   Value:
      0  Any
      1  Client
      2  Server
The BoQ Capability is a control-only capability (see Table 1), which means it SHOULD only be sent in the control channel. It MUST be ignored if received in the OPEN message of any function channel.¶
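
A minimal sketch of encoding and decoding this capability follows. It assumes the usual capability layout of [RFC5492] (one-octet code, one-octet length, value). Because the code point is still TBD2, the constant `PLACEHOLDER_BOQ_CAPABILITY_CODE` is a stand-in chosen only for illustration, and all names are hypothetical.

```python
from enum import IntEnum

# Stand-in for TBD2. This is NOT an assigned code point.
PLACEHOLDER_BOQ_CAPABILITY_CODE = 239


class BoqRole(IntEnum):
    ANY = 0
    CLIENT = 1
    SERVER = 2


def encode_boq_capability(role, code=PLACEHOLDER_BOQ_CAPABILITY_CODE):
    # Code (1 octet), Length (1 octet, always 1), Value (1 octet).
    return bytes([code, 1, int(role)])


def decode_boq_capability(data, code=PLACEHOLDER_BOQ_CAPABILITY_CODE):
    if len(data) != 3 or data[0] != code or data[1] != 1:
        raise ValueError("not a well-formed BoQ capability")
    return BoqRole(data[2])  # raises ValueError for undefined role values
```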
A BoQ session MUST be terminated if the BoQ speaker's role configuration and the QUIC connection role do not match. The speaker sends a NOTIFICATION on the control channel with an error code of BGP over QUIC Message Error and a subcode of BoQ Capability Mismatch, then closes the QUIC connection with a CONNECTION_CLOSE frame carrying an error code of APPLICATION_ERROR (see Section 19.19 of [RFC9000]). For example, if a BoQ speaker is configured as a client, but the QUIC connection comes up with that speaker as the QUIC server, the QUIC connection must be terminated. The "any" configuration matches both the QUIC client and QUIC server roles.
Before initiating a QUIC connection for BGP, the BoQ role configuration MUST be checked. If a BoQ speaker is configured as QUIC client, it MUST try to initiate the QUIC connection. If a BoQ speaker is configured as QUIC server, it MUST wait for a QUIC connection.¶
The following collision avoidance procedure SHOULD be followed during QUIC connection setup:¶
During the control channel setup, the BoQ capability MUST be checked to make sure the configured BoQ role matches the QUIC connection. When both BoQ peers are configured as "any", the existing session collision mechanism defined in [RFC6286] and [RFC4271] MUST be followed.
In case of a BoQ role mismatch (for example, a BoQ speaker configured as "any" accepts a QUIC connection from a BoQ speaker configured as server), an error notification, BoQ Capability Mismatch, SHOULD be sent and the connection MUST be terminated. See Section 6 for details.
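
The role matching rule can be sketched as below. The function names are hypothetical; roles are plain strings to keep the example self-contained. The sketch checks both the local configuration and the role advertised by the peer against the roles the two endpoints actually have on the QUIC connection.

```python
def role_matches(configured_role, quic_role):
    """configured_role: "any", "client" or "server"; quic_role: "client" or "server"."""
    if configured_role not in ("any", "client", "server"):
        raise ValueError("unknown configured role")
    if quic_role not in ("client", "server"):
        raise ValueError("unknown QUIC role")
    # "any" matches both QUIC roles; otherwise the roles must be equal.
    return configured_role == "any" or configured_role == quic_role


def session_roles_ok(local_configured, local_quic_role, peer_advertised):
    # The peer necessarily holds the opposite QUIC role on this connection.
    peer_quic_role = "server" if local_quic_role == "client" else "client"
    return (role_matches(local_configured, local_quic_role)
            and role_matches(peer_advertised, peer_quic_role))
```

In the mismatch example from the text, the local speaker is configured as "any" and accepted the connection (so it is the QUIC server), while the peer advertises "server" although it acted as the QUIC client; `session_roles_ok("any", "server", "server")` is therefore false.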
For existing BGP capabilities, some of them apply to the entire connection and MUST be sent in the control channel OPEN message, such as the BGP Role defined in [RFC9234]. If such capabilities are sent in an OPEN message in a function channel, they MUST be ignored.
The following table shows the category of each capability.¶
Value | Name | Ref | Control/Function |
---|---|---|---|
1 | Multiprotocol Extensions for BGP-4 | RFC2858 | F |
2 | Route Refresh Capability for BGP-4 | RFC2918 | F |
3 | Outbound Route Filtering Capability | RFC5291 | F |
5 | Extended Next Hop Encoding | RFC8950 | F |
6 | BGP Extended Message | RFC8654 | C/F |
7 | BGPsec Capability | RFC8205 | C/F |
8 | Multiple Labels Capability | RFC8277 | C - deprecated |
9 | BGP Role | RFC9234 | C |
64 | Graceful Restart Capability | RFC4724 | C/F |
65 | Support for 4-octet AS number capability | RFC6793 | C/F |
67 | Support for Dynamic Capability (capability specific) | draft-ietf-idr-dynamic-cap | C/F |
68 | Multisession BGP Capability | draft-ietf-idr-bgp-multisession | Not compatible |
69 | ADD-PATH Capability | RFC7911 | F |
70 | Enhanced Route Refresh Capability | RFC7313 | F |
71 | Long-Lived Graceful Restart (LLGR) Capability | draft-uttaro-idr-bgp-persistence | C/F |
72 | Routing Policy Distribution | draft-ietf-idr-rpd | ??? |
73 | FQDN Capability | draft-walton-bgp-hostname-capability | C |
74 | BFD Capability | draft-ietf-idr-bgp-bfd-strict-mode | C |
75 | Software Version Capability | draft-abraitis-bgp-version-capability | C/F |
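
The categorization in the table can be applied mechanically when processing a received OPEN. The sketch below is illustrative only: the names are hypothetical, capability 72 is left out because the table does not yet categorize it, and treating unlisted codes as "ignored on both channel types" is an assumption of this example rather than a rule from the text.

```python
C, F = "control", "function"

# Capability code -> channel types on which it is meaningful (from the table).
CAPABILITY_CHANNELS = {
    1: {F}, 2: {F}, 3: {F}, 5: {F},
    6: {C, F}, 7: {C, F}, 8: {C}, 9: {C},
    64: {C, F}, 65: {C, F}, 67: {C, F},
    68: set(),  # Multisession BGP: not compatible
    69: {F}, 70: {F}, 71: {C, F},
    73: {C}, 74: {C}, 75: {C, F},
}


def usable_capabilities(codes, channel):
    """Keep the codes whose category covers the channel; the rest are ignored."""
    return [c for c in codes if channel in CAPABILITY_CHANNELS.get(c, set())]
```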
A function channel for a specific Network Layer protocol MUST NOT be created if one already exists.
If a BoQ speaker receives a function channel creation request for an AFI/SAFI that already exists, the local BoQ speaker SHOULD send a NOTIFICATION with error code BoQ Message Error and subcode BoQ Channel Conflict through the control channel, and upon receiving this notification the channel initiator MUST terminate the channel.
If a BoQ speaker receives a function channel creation request for an AFI/SAFI that it doesn't support, the local BoQ speaker SHOULD send a notification using the existing subcode "Unsupported AFI/SAFI" in the OPEN Message Error NOTIFICATION message.
Unless allowed via configuration, a channel collision with an existing BGP channel in the Established state causes the closing of the newly created channel.¶
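
A receiver can enforce these rules with a small table keyed by AFI/SAFI. The following is a minimal sketch with hypothetical names; it models only the default behavior (no configuration override) and returns a string describing the outcome instead of sending messages.

```python
class FunctionChannelTable:
    """Tracks inbound function channels, one per AFI/SAFI (sketch)."""

    def __init__(self, supported):
        self._supported = set(supported)  # set of (afi, safi) tuples
        self._channels = {}               # (afi, safi) -> QUIC stream ID

    def on_open(self, afi, safi, stream_id):
        key = (afi, safi)
        if key not in self._supported:
            return "unsupported-afi-safi"  # OPEN Message Error NOTIFICATION
        if key in self._channels:
            return "channel-conflict"      # BoQ Channel Conflict subcode
        self._channels[key] = stream_id
        return "accepted"

    def on_close(self, afi, safi):
        # Forget the channel so that a new one may be created later.
        self._channels.pop((afi, safi), None)
```

For instance, with IPv4 unicast (AFI 1, SAFI 1) supported, a first OPEN on stream 2 is accepted and a second OPEN for the same AFI/SAFI on stream 6 is reported as a conflict until the first channel is closed.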
In version 1 of QUIC, BoQ messages are carried by QUIC STREAM frames. In BoQ, the control channel always uses QUIC stream 0, which is a client-initiated bidirectional stream. Function channels, which are unidirectional streams, can be client or server initiated.¶
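
The stream types mentioned above follow from the two least significant bits of the QUIC stream ID as defined in [RFC9000]: bit 0x01 identifies the initiator and bit 0x02 identifies directionality. A small sketch, with hypothetical helper names:

```python
def classify_stream(stream_id):
    """Return (initiator, direction) for a QUIC stream ID."""
    if not 0 <= stream_id < 2**62:
        raise ValueError("stream ID out of range")
    initiator = "server" if stream_id & 0x1 else "client"
    direction = "unidirectional" if stream_id & 0x2 else "bidirectional"
    return initiator, direction


def is_control_channel(stream_id):
    # The control channel always uses stream 0.
    return stream_id == 0


def may_be_function_channel(stream_id):
    # Function channels are unidirectional, initiated by either endpoint.
    return classify_stream(stream_id)[1] == "unidirectional"
```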
Some BoQ messages, although sent in the control channel, are meant for a function channel, such as the responding OPEN message or KEEPALIVE message for a function channel. These messages need to carry the corresponding function channel/stream ID information.¶
There are two types of BoQ Frames: Data and Control Data.¶
Data frames have the following format:¶
BoQ Data Frame Format {
  Type (0),
  Length (),
  Frame Payload (...)
}
Control Data Frames have the following format:¶
BoQ Control Data Frame Format {
  Type (1),
  Length (),
  Stream ID (),
  Frame Payload (...)
}
Type: One octet that identifies the frame type.
Length: A two-byte unsigned integer that gives the length in bytes of the frame payload.
Stream ID: A variable-length integer indicating the receiving stream ID of this message.
Frame Payload: BGP messages.¶
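
The two frame formats can be sketched as follows. The wire layout here is an assumption drawn from the field list above: fields appear in the listed order, Length is big-endian and covers only the payload, and Stream ID uses the variable-length integer encoding of [RFC9000]. All function names are hypothetical.

```python
DATA, CONTROL_DATA = 0, 1


def encode_varint(value):
    # RFC 9000 variable-length integer: the top two bits give the length.
    for prefix, length in ((0, 1), (1, 2), (2, 4), (3, 8)):
        bits = 8 * length - 2
        if 0 <= value < (1 << bits):
            return (value | (prefix << bits)).to_bytes(length, "big")
    raise ValueError("value does not fit in a variable-length integer")


def decode_varint(data, offset=0):
    if offset >= len(data):
        raise ValueError("truncated variable-length integer")
    length = 1 << (data[offset] >> 6)
    if offset + length > len(data):
        raise ValueError("truncated variable-length integer")
    raw = int.from_bytes(data[offset:offset + length], "big")
    return raw & ((1 << (8 * length - 2)) - 1), offset + length


def encode_frame(payload, stream_id=None):
    """Build a Data frame, or a Control Data frame if stream_id is given."""
    if len(payload) > 0xFFFF:
        raise ValueError("payload too long for a two-byte Length")
    length = len(payload).to_bytes(2, "big")
    if stream_id is None:
        return bytes([DATA]) + length + payload
    return bytes([CONTROL_DATA]) + length + encode_varint(stream_id) + payload


def decode_frame(data):
    """Return (frame_type, stream_id or None, payload)."""
    if len(data) < 3:
        raise ValueError("truncated frame header")
    frame_type = data[0]
    length = int.from_bytes(data[1:3], "big")
    offset, stream_id = 3, None
    if frame_type == CONTROL_DATA:
        stream_id, offset = decode_varint(data, offset)
    elif frame_type != DATA:
        raise ValueError("unknown frame type")
    payload = data[offset:offset + length]
    if len(payload) != length:
        raise ValueError("truncated frame payload")
    return frame_type, stream_id, payload
```

A KEEPALIVE reply for the function channel on stream 2 would, under these assumptions, be carried on the control channel as `encode_frame(keepalive, stream_id=2)`.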
The following table lists the frame type to be used when BGP messages are sent in different channels.¶
Message | Control Channel | Function Channel |
---|---|---|
OPEN | Control Data | Data |
UPDATE | / | Data |
KEEPALIVE | Control Data | Data |
NOTIFICATION | Control Data | Data |
Route-Refresh | Control Data | / |
An OPEN message sent in the control channel for the control channel creation MUST NOT contain the Multiprotocol Extensions Capability (value 1) in the Capabilities. An OPEN message sent in a function channel, and the responding OPEN message sent in the control channel for one AFI/SAFI, MUST contain only one Multiprotocol Extensions Capability (value 1) in the Capabilities.
There is no UPDATE message sent in the control channel.¶
For the KEEPALIVE and NOTIFICATION messages sent in the control channel for one function channel, the BoQ Control Data frame MUST be used, and the stream ID in the frame indicates the target AFI/SAFI.
Route-refresh messages are sent in the control channel using the BoQ Control Data frame.
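
The frame type table above can be expressed as a lookup. This is a sketch with hypothetical names; combinations marked "/" in the table are treated as errors.

```python
# (BGP message, channel the frame is sent on) -> BoQ frame type, per the table.
FRAME_TYPE = {
    ("OPEN", "control"): "Control Data",
    ("OPEN", "function"): "Data",
    ("UPDATE", "function"): "Data",
    ("KEEPALIVE", "control"): "Control Data",
    ("KEEPALIVE", "function"): "Data",
    ("NOTIFICATION", "control"): "Control Data",
    ("NOTIFICATION", "function"): "Data",
    ("ROUTE-REFRESH", "control"): "Control Data",
}


def frame_type_for(message, channel):
    try:
        return FRAME_TYPE[(message, channel)]
    except KeyError:
        raise ValueError(f"{message} is not sent on the {channel} channel") from None
```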
OPEN message error handling is defined in Section 6.2 of [RFC4271]. This document defines a new NOTIFICATION error code:
Error Code | Name |
---|---|
TBD | BoQ Message Error |

The following error subcodes are defined as well:
Subcode | Name |
---|---|
1 | BoQ Capability Mismatch |
2 | BoQ Connection Reset |
3 | BoQ Channel Reset |
4 | BoQ Channel Conflict |
BoQ Capability Mismatch is sent when a BoQ speaker's configured role doesn't match the QUIC connection, and the connection MUST be terminated after sending this notification. Details are described in Section 5.1.
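
As an illustration, a NOTIFICATION carrying one of these subcodes could be built as shown below. The message layout (16-octet marker, two-octet length, type 3, error code, subcode, data) is that of [RFC4271]. The error code itself is still TBD, so `PLACEHOLDER_BOQ_ERROR_CODE` is a stand-in used only for this example, and the names are hypothetical.

```python
import struct
from enum import IntEnum

# Stand-in for the TBD error code. This is NOT an assigned value.
PLACEHOLDER_BOQ_ERROR_CODE = 255

BGP_HEADER_LEN = 19      # marker (16) + length (2) + type (1)
NOTIFICATION_TYPE = 3    # BGP message type for NOTIFICATION


class BoqSubcode(IntEnum):
    CAPABILITY_MISMATCH = 1
    CONNECTION_RESET = 2
    CHANNEL_RESET = 3
    CHANNEL_CONFLICT = 4


def build_notification(subcode, data=b"", error_code=PLACEHOLDER_BOQ_ERROR_CODE):
    # Total length covers the header, error code, subcode, and data.
    length = BGP_HEADER_LEN + 2 + len(data)
    header = b"\xff" * 16 + struct.pack("!HB", length, NOTIFICATION_TYPE)
    return header + bytes([error_code, int(subcode)]) + data
```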
The error handling specified in this section is applicable for a BoQ speaker implementing this document.¶
Any individual BGP channel can be terminated as specified in [RFC4486].¶
TBD.¶
The decision to use BoQ instead of the TCP-based mechanism defined in [RFC4271] is an operational decision and out of the scope of this document. An implementation MUST provide a configuration mechanism to enable BoQ on a per-peer basis.¶
Connectivity problems (e.g., blocking UDP) can result in a failure to establish a QUIC connection; BGP speakers SHOULD attempt to establish a TCP-based BGP session in this case.¶
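
The fallback behavior can be sketched as below. The two connector callables are hypothetical and supplied by the caller; the sketch assumes each returns a session object or raises `OSError` (which includes timeouts) when the transport cannot be established.

```python
def establish_session(peer, connect_quic, connect_tcp):
    """Try BoQ first; fall back to TCP-based BGP if QUIC cannot be set up.

    Returns a (transport_name, session) tuple.
    """
    try:
        return "quic", connect_quic(peer)
    except OSError:
        # For example, UDP is blocked on the path to the peer.
        return "tcp", connect_tcp(peer)
```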
One of the drawbacks of a single BGP session is that control plane messages for all supported Network Layer protocols use the same connection, which may cause resource contention.¶
QUIC [RFC9000] does not provide a mechanism for exchanging prioritization information. Instead, it recommends that implementations provide ways for an application to indicate the relative priority of streams, in this case, mapped to BGP channels. An operator should prioritize BGP channels (streams) that carry critical control plane information if the functionality is available. The definition of this functionality and the determination of the importance of a BGP session are both outside the scope of this document.¶
This document replaces the transport protocol layer of BGP from TCP to QUIC. It does not modify the basic protocol specifications of BGP, and therefore does not introduce new security risks to the basic BGP protocol. The non-TCP-related considerations of [RFC4271], [RFC4272], and [RFC7454] apply to the specification in this document.¶
BoQ enhances transport-layer security for BGP sessions, refer to [RFC7454]:¶
The use of a specific UDP port number and an ALPN token protects a BoQ speaker from attempts to establish an unexpected BGP session. Additionally, all packets directed to UDP port TBD1 on the local device and sourced from an address not known or permitted to become a BGP neighbor SHOULD be discarded.
BGP multi-channel support using QUIC streams separates the control plane traffic over multiple channels, which reduces the effect of a session-based vulnerability: only a single channel is affected and not the whole connection. The result is increased resiliency.
On the other hand, a high number of BGP channels may result in higher resource utilization and the risk of depletion. Also, more channels may imply additional configuration and operational complexity.¶
IANA is requested to assign a UDP port (TBD1) from the "Service Name and Transport Protocol Port Number Registry" as follows:¶
Service Name | boq |
Port Number | TBD1 |
Transport Protocol | udp |
Description | BGP over QUIC |
Assignee | IETF |
Contact | IDR WG |
Registration Data | TBD |
Reference | this document |
Unauthorized Use Reported | [email protected] |
This document creates a new registration for the identification of BGP [RFC4271] in the "TLS Application-Layer Protocol Negotiation (ALPN) Protocol IDs" registry.¶
The "boq" string identifies BGP-4 [RFC4271] over QUIC:¶
Protocol: Multi-Channel BGP over QUIC
Identification Sequence: 0x62 0x6f 0x71 ("boq")
Specification: This document
IANA is asked to assign a new Capability code [RFC5492] for the BGP over QUIC Capability (Section 5.1) as follows:
Value | TBD2 |
Description | BoQ Capability |
Reference | [This Document] |
Change Controller | IETF |
This document defines a new NOTIFICATION error code and related subcodes related to the BoQ procedures. IANA is asked to assign a new error code from the "BGP Error (Notification) Codes" registry with the name "BGP over QUIC Message Error", referencing this document.¶
IANA is asked to create a new registry for the error subcodes as follows:¶
Under "Border Gateway Protocol (BGP) Parameters", under "BGP Error Subcodes":
   Registry: "BGP over QUIC Message Error subcodes"
   Reference: this document
   Registration Procedure(s): Values 0-127 Standards Action, values 128-255 First Come First Served
Value | Name | Reference |
---|---|---|
0 | Reserved | [this document] |
1 | BoQ Capability Mismatch | [this document] |
2 | BoQ Connection Reset | [this document] |
3 | BoQ Channel Reset | [this document] |
4 | BoQ Channel Conflict | [this document] |
5-255 | Unassigned | |
This document references the text and procedures defined in [I-D.ietf-idr-bgp-multisession], and we are grateful for their contributions.¶
The authors would like to thank xx for review and comments.¶