Draft                     Internet Multilink                  August 1993

INTERNET DRAFT
Expires: February 15th, 1994

             A Multilink Protocol for Synchronizing the
              Transmission of Multi-protocol Datagrams

                           Keith Sklower
                    Computer Science Department
                 University of California, Berkeley

1.  Status of This Memo

   This document is an Internet Draft.  Internet Drafts are working
   documents of the Internet Engineering Task Force (IETF), its Areas,
   and its Working Groups.  Note that other groups may also distribute
   working documents as Internet Drafts.

   Internet Drafts are draft documents valid for a maximum of six
   months.  Internet Drafts may be updated, replaced, or obsoleted by
   other documents at any time.  It is not appropriate to use Internet
   Drafts as reference material or to cite them other than as a
   ``working draft'' or ``work in progress.''

   Please check the 1id-abstracts.txt listing contained in the
   internet-drafts Shadow Directories on nic.ddn.mil, nnsc.nsf.net,
   nic.nordu.net, ftp.nisc.sri.com, or munnari.oz.au to learn the
   current status of any Internet Draft.

2.  Abstract

   This document proposes a method for splitting and recombining
   datagrams across multiple logical data links.  Such facilities are
   desirable to exploit the potential for increased bandwidth offered
   by multiple bearer channels in ISDN, yet to do so in such a way that
   minimizes reordering of packets.  This is accomplished by means of
   new PPP [2] options and protocols.

   This draft incorporates comments arising at the 27th IETF meeting in
   Amsterdam.

3.  Acknowledgements

   The author specifically wishes to thank Brian Lloyd of Lloyd &
   Associates, Fred Baker of ACC, Craig Fox of Network Systems, Gerry
   Meyer of Spider Systems, Dave Carr of Gandalf, Tom Coradetti of
   Digiboard, and the members of the IP over Large Public Data Networks
   and PPP extensions working groups, for much useful discussion on the
   subject.

4.  Conventions

   The following language conventions are used in the items of
   specification in this document:

      o  MUST, SHALL or MANDATORY -- the item is an absolute
         requirement of the specification.

      o  SHOULD or RECOMMENDED -- the item should generally be followed
         for all but exceptional circumstances.

      o  MAY or OPTIONAL -- the item is truly optional and may be
         followed or ignored according to the needs of the implementor.

5.  Introduction

   Basic Rate and Primary Rate ISDN both offer the possibility of
   opening multiple simultaneous channels between systems, giving users
   additional bandwidth on demand (for additional cost).  Previous
   proposals for the transmission of internet protocols over ISDN have
   stated as a goal the ability to make use of this capability (e.g.
   Leifer et al. [1]).

   There are proposals being advanced by communications providers for
   providing synchronization between multiple streams at the bit level
   (the BONDING proposals); such features are not as yet widely
   deployed, and may require additional hardware for end systems.
   Thus, it may be useful to have a purely software solution, or at
   least an interim measure.

   Furthermore, even if the ISDN service providers guarantee
   bit-synchronization between different switched circuits, there may
   not be sufficiently flexible hardware in a particular end system to
   combine the multiple bit streams in arbitrary orders for HDLC
   bit-unstuffing.

   There are other instances where bandwidth on demand can be
   exploited, such as opening additional X.25 SVCs where the window
   size is limited to two by international agreement.

   The simplest possible algorithms of alternating packets between
   channels on a space available basis (which might be called the Bank
   Teller's algorithm) may have undesirable side effects due to
   reordering of packets.

   By means of a two-byte sequencing header, and simple synchronization
   rules, one can split packets among parallel virtual circuits between
   systems in such a way that packets do not become reordered, or at
   least the likelihood of this is greatly reduced.

   The method discussed here is similar to the multilink protocol
   described in ISO 7776 [6], but offers the additional ability to
   split and recombine packets, thereby reducing latency.  Furthermore,
   there is no requirement here for acknowledged-mode operation on the
   link layer, although that is optionally permitted.

   Any method for bandwidth aggregation would require some means of
   identifying which channels are to participate in such a process.
   This could be achieved specifically in the case of ISDN by use of
   the calling party information element, but more generally by using
   PPP's authentication protocols [3].  For Frame Relay run over
   dedicated channels some sort of manual configuration could suffice.
   For X.25, the X.121 call request source address might alternatively
   be used.

   In order to use the multilink protocol in Frame Relay and X.25
   environments, one uses the encodings for PPP [7, 8], which have been
   described in a way compatible with RFCs 1294 [4] and 1356 [5].

6.  General Overview

   The Point-to-Point Protocol (PPP) provides a standard method of
   encapsulating Network Layer protocol information over (virtual)
   point-to-point links.  PPP has four main components:

      1. A method for encapsulating datagrams over serial links.

      2. A Link Control Protocol (LCP) for establishing, configuring,
         and testing the data-link connection.

      3. A family of Network Control Protocols (NCPs) for establishing
         and configuring different network-layer protocols.

      4. A family of Network-Layer Protocols (NPs) which are the
         classes of data to be transmitted themselves.

   In order to establish communications over a point-to-point link,
   each end of the PPP link must first send LCP packets to configure
   the data link during Link Establishment phase.  After the link has
   been established, PPP provides for an Authentication phase before
   proceeding to the Network-Layer Protocol phase.  Since the idea is
   to ``tie together'' multiple circuits between a fixed pair of
   systems, and since both of the current PPP authentication protocols
   permit the side effect of assigning identifiers to both systems, it
   makes sense to exploit the use of one or the other of the existing
   authentication protocols (or any future authentication protocol that
   assigns unique identifiers to communicating systems), rather than
   introducing a separate mechanism for naming multilink groups.

   We suggest that multilink operation can be modeled as a separate PPP
   Network-Layer protocol (which we'll call the Multilink Protocol, or
   MP) with its own control protocol (the Multilink Control Protocol,
   or MCP), which may be negotiated in the usual PPP manner.

   A slightly unusual convention (for PPP) might be to identify
   reconstituted packets as being transmitted over a separate logical
   link (which we'll call the bundle) which is distinct from the
   constituent physical links (which we'll call the member links).

   Thus, if the (PPP) Multilink (network-layer) Protocol were proposed
   and accepted in both directions on a physical link between two
   systems that had previously concluded negotiating all network
   protocols to be carried over the bundle (group link) via another
   physical circuit, and all traffic sent over the new (member) link
   were encapsulated with multilink headers, no additional negotiation
   would be required on the new link.

   Any changes to the state of the bundle from the default state MUST
   be made by encapsulating LCP or NCP packets inside Multilink
   Protocol packets (i.e. by placing LCP or NCP packet headers after
   the multilink headers).  However, we believe that it should almost
   never be necessary to re-negotiate the set of LCP-level defaults
   proposed for the bundle.

   Work is in progress on alternative means for memorizing the state of
   concluded negotiations and re-invoking that state.

7.  Packet Formats

   Fragments may be thought of as packets in a separate PPP protocol.
   Large packets are first encapsulated according to normal PPP
   procedures, and then are broken up into multiple frames sized
   appropriately for the multiple physical links.  A new PPP header
   consisting of the Multilink Protocol Identifier and the Multilink
   header is inserted before each section.  (Thus the first fragment of
   a multilink packet in PPP will have two headers, one for the
   fragment, followed by the header for the packet itself).

   PPP multilink fragments are encapsulated using the protocol
   identifier 0x00-0x3d, followed by a two byte header giving a
   sequence number, Beginning Fragment of Packet bit, and Ending
   Fragment of Packet bit.  Unless Address & Control and Protocol ID
   compression are in effect, individual fragments will, therefore,
   have the following format:

                  Figure 1:  Fragment Format.

               +---------------+---------------+
               | Address 0xff  | Control 0x03  |
               +---------------+---------------+
               | PID(H)  0x00  | PID(L)  0x3d  |
               +-+-+-+-+-------+---------------+
               |B|E|0|0|    sequence number    |
               +-+-+-+-+-------+---------------+
               |         fragment data         |
               |               .               |
               |               .               |
               |               .               |
               +---------------+---------------+
               |              FCS              |
               +---------------+---------------+

   The sequence field is a 12 bit number that is incremented for every
   fragment transmitted.

   The (B)eginning fragment bit is a one bit field set to 1 on the
   first fragment and set to 0 for all other fragments.

   The (E)nding fragment bit is a one bit field set to 1 on the last
   fragment and set to 0 for all other fragments.
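
   The two-byte multilink header shown in Figure 1 can be sketched in
   code.  The following is a minimal illustration, not part of the
   specification; the names pack_header, unpack_header and fragment are
   hypothetical, and the sketch assumes the B bit is the most
   significant bit of the header, followed by E and the two bits shown
   as zero, as drawn in Figure 1.

```python
import struct

B_BIT = 0x8000      # (B)eginning fragment bit, top bit of the 16-bit header
E_BIT = 0x4000      # (E)nding fragment bit
ZERO_BITS = 0x3000  # the two bits drawn as 0 in Figure 1
SEQ_MASK = 0x0FFF   # 12-bit sequence number


def pack_header(seq, begin, end):
    """Build the two-byte header in network byte order."""
    word = seq & SEQ_MASK  # the sequence number wraps at 12 bits
    if begin:
        word |= B_BIT
    if end:
        word |= E_BIT
    return struct.pack("!H", word)


def unpack_header(data):
    """Return (seq, begin, end) from the first two bytes of a fragment."""
    (word,) = struct.unpack("!H", data[:2])
    if word & ZERO_BITS:
        raise ValueError("bits shown as zero in Figure 1 are set")
    return word & SEQ_MASK, bool(word & B_BIT), bool(word & E_BIT)


def fragment(packet, frag_size, first_seq):
    """Split an already PPP-encapsulated packet into header+data fragments."""
    chunks = [packet[i:i + frag_size]
              for i in range(0, len(packet), frag_size)] or [b""]
    last = len(chunks) - 1
    return [pack_header(first_seq + i, i == 0, i == last) + chunk
            for i, chunk in enumerate(chunks)]
```

   For example, fragment(b"abcdef", 4, 0xFFF) yields two fragments whose
   headers carry sequence numbers 0xFFF and 0x000, which shows the
   12-bit wrap.  The Address, Control, Protocol ID and FCS fields are
   left to the underlying PPP framing and are not modeled here.
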

   The reserved field is 2 bits long and is not currently defined.  It
   must be set to 0.

   In this multilink protocol, a separate and single reassembly
   structure is associated with each pair of authenticated entities.
   The multilink headers are interpreted in the context of this
   structure.

   It is permissible for a fragment to have both the (B)eginning and
   (E)nding fragment bits set.

   Two systems may implement multiple multilink groups by responding to
   multiple authentication identifiers, one for each group.

8.  Trading Buffer Space Against Fragment Loss

   In a multilink procedure, where one channel may be delayed with
   respect to the other channel, fragments may not arrive in the same
   sequence they left the sender.  So, it is more difficult to
   determine that a fragment has been lost, and more difficult to
   estimate the amount of buffer space required.  In this section we
   present a default strategy for minimizing the buffer space required
   for retaining enough fragments to determine that a fragment has been
   lost.

   The idea is that the receiver should keep track of the minimum,
   taken over all channels participating in the multilink procedure, of
   the most recent sequence number received on each channel.  (Call
   this M).  We require that the senders MUST transmit fragments with
   increasing sequence numbers over any member link in a bundle.  Thus,
   every time M advances past a fragment bearing an (E)nding bit that
   hasn't otherwise been reassembled and sent, one can discard all
   fragments with sequence number less than M, since any missing
   fragments will never arrive on any link due to the increasing
   sequence number rule.

   If the last fragment transmitted over a member link before the link
   queue drains gets lost, the receiver will stall until new packets
   arrive.
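
   The M rule described above can be sketched as follows.  This is an
   illustrative model only, resting on several assumptions that are not
   in the text: the class name Reassembler and its methods are
   hypothetical, sequence numbers are assumed not to wrap during the
   run, and M is treated as undefined until every member link has
   delivered at least one fragment.  The sketch is also slightly more
   conservative than the rule as stated: once M has passed a stranded
   (E)nding fragment it discards only the fragments up to and including
   that fragment, so that a later packet still in progress below M is
   kept.

```python
class Reassembler:
    """Toy receiver for one bundle; ignores sequence-number wrap."""

    def __init__(self, links):
        self.latest = dict.fromkeys(links)  # most recent seq seen per link
        self.frags = {}                     # seq -> (begin, end, data)

    def m(self):
        """Minimum over all links of the latest seq, or None if unknown."""
        seen = list(self.latest.values())
        return None if None in seen else min(seen)

    def receive(self, link, seq, begin, end, data):
        """Accept one fragment; return (completed packets, lost seqs)."""
        self.latest[link] = seq
        self.frags[seq] = (begin, end, data)
        return self._deliver_complete(), self._discard_lost()

    def _deliver_complete(self):
        done = []
        for start in sorted(self.frags):
            # Skip fragments already consumed or not marked (B)eginning.
            if start not in self.frags or not self.frags[start][0]:
                continue
            seq, parts = start, []
            while seq in self.frags:
                _, end, data = self.frags[seq]
                parts.append(data)
                if end:
                    for s in range(start, seq + 1):
                        del self.frags[s]
                    done.append(b"".join(parts))
                    break
                seq += 1
        return done

    def _discard_lost(self):
        m = self.m()
        if m is None:
            return []
        # Any (E)nding fragment still buffered below M can never complete.
        stranded = [s for s, frag in self.frags.items() if frag[1] and s < m]
        if not stranded:
            return []
        cut = max(stranded)
        lost = sorted(s for s in self.frags if s <= cut)
        for s in lost:
            del self.frags[s]
        return lost
```

   With two links, if fragment 2 is lost and fragment 3 carries the
   (E)nding bit, fragment 3 stays buffered until both links have
   delivered a fragment numbered above 3; at that point it is reported
   as lost and its buffer space is released.
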
   To reduce the likelihood of this, the sender may transmit a null
   fragment, which increases the sequence number, on each channel when
   the queue for that physical link becomes empty.  The receiver should
   implement some kind of idle/dead-link timer to resolve such cases.

   The amount of buffering required to guarantee correct recognition of
   fragment loss depends on the relative delay between the channels
   (D[c1,c2]), the number of channels participating (say N), the data
   rate of each channel R[c], the fragment size (F) and the maximum
   permissible reassembled size (P).  When using PPP, the delay between
   channels can be determined by LCP echo request and echo reply
   packets.

   In the common case where the data rates are the same, one could
   define for each channel its slippage to be the bandwidth times the
   delay for that channel relative to the slowest, S[c] = R[c] * D[c,
   c-worst].  Given these conditions, having buffer space of

      N*(P-F) + S[1] + S[2] + ... + S[N]

   should be sufficient to ensure that you have not erroneously thrown
   away an incomplete packet before its completing fragment arrives.

9.  Multilink Control Protocol

   [Here, we shamelessly plagiarize RFC1220]

   The Multilink Control Protocol is responsible for establishing
   agreement to initiate multilink operation, and for setting a limited
   number of parameters.  It is exactly the same as the Point-to-Point
   Link Control Protocol [2], with the following exceptions:

   Data Link Layer Protocol Field

      Exactly one Multilink Control Protocol packet is encapsulated in
      the Information field of PPP Data Link Layer frames where the
      Protocol field indicates type hex 803d.

   Code field

      Only Codes 1 through 7 (Configure-Request, Configure-Ack,
      Configure-Nak, Configure-Reject, Terminate-Request, Terminate-Ack
      and Code-Reject) are used.  Other Codes should be treated as
      unrecognized and should result in Code-Rejects.

   Precedence

      MCP packets may not be exchanged until the Link Control Protocol
      has reached the network-layer protocol configuration negotiation
      phase, i.e. when link negotiation has concluded, and after any
      use of the Authentication Protocol.  Systems implementing the
      Multilink Protocol MAY omit the use of Authentication when
      trusted out-of-band identification is available, such as the
      presence of X.121 or E.164 addresses, or when the operation is
      conducted over leased lines.

      Network Control packets may also be exchanged over a member link
      concurrently with the exchange of Multilink Control Packets;
      however, such exchanges pertain ONLY to the member links and not
      the bundle.  NCP packets pertaining to the operation of the
      bundle MUST NOT be sent until the Multilink control exchange has
      concluded, and then MUST be transmitted inside multilink packets
      (i.e. with the NCP headers following the multilink headers).
      This is discussed below in the section ``Interaction with Other
      Protocols.''

   Configuration Option Types

      The Multilink Control Protocol has a separate set of
      Configuration Options.  These permit the negotiation of the
      following items:

         o  Sequenced Delivery

         o  Reset on Loss

         o  Maximum Receive Reconstructed Unit

         o  Maximum Received Completed Sequence (Prior to Loss)

9.1.  Sequenced Delivery Option

               Figure 3:  Sequenced Delivery Option

                0                   1
                0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
               |    type=1     |  length = 3   |
               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
               | On = 1 Off = 0|
               +-+-+-+-+-+-+-+-+

   This option requests the system at the other end of the link not to
   deliver packets out of sequence.  In the absence of reliable
   delivery over member links, a missing fragment would delay the
   release of any completed packet following it until the
   implementation unambiguously determined that the fragment had been
   lost, such as in the procedure described in section 8 above.
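
   The option layout in Figure 3 is a type byte, a length byte that
   counts the whole option, and one value byte.  The following sketch
   shows one way to build and parse such options.  The helper names
   encode_flag_option and decode_options are hypothetical, and the
   parser assumes that options are simply concatenated in the data part
   of a Configure packet, as they are for LCP.

```python
import struct

SEQUENCED_DELIVERY = 1  # option type from Figure 3


def encode_flag_option(opt_type, on):
    """Encode a three-byte On/Off option: type, length = 3, value."""
    return struct.pack("!BBB", opt_type, 3, 1 if on else 0)


def decode_options(data):
    """Split concatenated options into a list of (type, value bytes)."""
    options = []
    i = 0
    while i < len(data):
        if len(data) - i < 2:
            raise ValueError("truncated option header")
        opt_type, length = data[i], data[i + 1]
        # The length field covers the type and length bytes themselves.
        if length < 2 or i + length > len(data):
            raise ValueError("bad option length")
        options.append((opt_type, data[i + 2:i + length]))
        i += length
    return options
```

   A request for sequenced delivery would then carry
   encode_flag_option(SEQUENCED_DELIVERY, True), which is the three
   bytes 0x01 0x03 0x01.
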

   [Ed Note: It has been suggested that the number of outstanding
   incomplete packets, or the number of fragments, or the total number
   of bytes should be negotiable.  Certainly, one can calculate the
   minimum amount of space required based on slippage, round trip
   times, and the number of links in the group.]

   The Default value is ``Sequencing Not Required''.

9.2.  Reset on Loss

               Figure 4:  Reset on Loss Option

                0                   1
                0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
               |    type=2     |  length = 3   |
               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
               | On = 1 Off = 0|
               +-+-+-+-+-+-+-+-+

   This option requests the system at the other end of the link to,
   upon detection of the loss of a fragment, re-enter the
   multilink-control-negotiation phase.  The Default value is ``Don't
   Reset on Loss''.

9.3.  Maximum Receive Reconstructed Unit

        Figure 5:  Maximum Receive Reconstructed Unit Option

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    type=3     |  length = 4   | Maximum-Receive-Rebuilt-Unit  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   This option advises the peer that the implementation will be able to
   reconstruct a datagram consisting of the specified number of bytes.
   The default value is 8192 bytes.

   Note: this option controls the MRU of the logical (group) link, and
   could be negotiated by an LCP options exchange inside multilink
   packets.  Having a protocol option at the MCP allows it to be
   piggy-backed onto other MCP options.

9.4.  Maximum Received Completed Sequence

           Figure 6:  Maximum Received Sequence Number

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    type=4     |  length = 4   |        Sequence Number        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   This option is used after an implementation reenters MCP after
   suspecting a fragment loss.  The sequence number is that of the last
   complete datagram successfully reassembled prior to the detection of
   the loss.

10.  Synchronization

   When sequenced delivery is in effect, there are issues about
   signaling loss, and about graceful termination of a constituent link
   among several in a group if it has been determined that the
   additional bandwidth is no longer needed.  We propose a mechanism
   that can be used for both purposes.

   At any time after MCP has been successfully completed on a
   constituent link, an implementation may send an MCP configuration
   request including the ``Maximum Received Sequence'' (MRS) option.
   The implementation must not send any further multilink packets until
   both sides have sent and received configure-ack control packets.

   Upon receipt of a config-request with an MRS option, the peer
   implementation must eventually reply with either a config-request
   including the MRS option, or a config-ack.  The peer may also delay
   sending the ack for up to twice the round trip delay time.  The peer
   MUST NOT begin transmitting multilink packets over the constituent
   link until the initiating implementation sends a config-ack.  Both
   the implementation and its peer MAY, however, continue sending
   traffic on the other participating links in the group.
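
   The gating rule above can be illustrated with a small per-link state
   holder, together with an encoder for the MRS option.  This is a
   sketch under stated assumptions, not a definitive implementation:
   the names encode_mrs_option and MemberLinkSync are hypothetical, the
   option type 4 and length 4 are taken from the Maximum Received
   Sequence Number figure, and the value is assumed to occupy the full
   16-bit field in network byte order.

```python
import struct
from dataclasses import dataclass

MRS_OPTION = 4  # option type for Maximum Received Completed Sequence


def encode_mrs_option(last_complete_seq):
    """Type 4, length 4, then a 16-bit sequence number."""
    return struct.pack("!BBH", MRS_OPTION, 4, last_complete_seq & 0xFFFF)


@dataclass
class MemberLinkSync:
    """Tracks whether multilink fragments may flow on one member link."""
    ack_sent: bool = True      # we have acked the peer's config-request
    ack_received: bool = True  # the peer has acked our config-request

    def enter_mcp(self):
        """Either side has sent a config-request carrying the MRS option."""
        self.ack_sent = False
        self.ack_received = False

    def on_ack_sent(self):
        self.ack_sent = True

    def on_ack_received(self):
        self.ack_received = True

    def may_send_fragments(self):
        """Fragments resume only after acks have gone in both directions."""
        return self.ack_sent and self.ack_received
```

   Each member link would hold its own MemberLinkSync instance, so that
   re-entering MCP on one link does not stop traffic on the other links
   in the group.
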

   When permitted to resume sending packets, an implementation may
   (re-)transmit a packet with a lower sequence number than was last
   transmitted prior to re-entry into the MCP on that member link,
   provided that the window sizes and quantities M are known to both
   systems, such that the packet cannot be interpreted as sequence
   space wrap after an extensive loss.

   If the constituent link is to be closed, each peer may send
   terminate-request packets after verifying that all fragments
   transmitted prior to dropping back into control protocol mode have
   been received.

11.  Default State of the Group Link

   In order to minimize the number of packet exchanges in establishing
   a multilink group, and in order to minimize the amount of overhead
   in using the logical (group) link, when use of the Multilink
   Protocol is negotiated, the following defaults are assumed for the
   bundle (logical group link):

      o  No Async Control Character Map

      o  No Magic Number

      o  No Link Quality Monitoring

      o  Address and Control Field Compression

      o  Protocol Field Compression

      o  Maximum Receive Unit of 8192

   Thus, after the first link in a bundle concludes the MCP
   negotiation, a (logical) group link is identified between the
   authenticated identities, and is declared already to be in the PPP
   open state with the defaults listed above, but with no network level
   protocols negotiated.  If the above defaults are unsatisfactory, LCP
   config request packets could be sent embedded in multilink packets,
   and the bundle (logical group link) would fall out of the open
   state.  If the above defaults are acceptable, then no further
   exchange of LCP packets over the bundle would be required.

   So long as any member links in the bundle are active, the PPP state
   for the bundle persists as a separate entity.

12.  Interaction with Other Protocols

   In the common case, LCP, Authentication CP, and the Multilink
   Control Protocol would be run over each member link.
   The Network Protocols themselves and associated control exchanges
   would normally be run on the bundle, and given a suitable choice of
   defaults for the bundle (as stated in the previous section), no
   further LCP exchanges over the bundle should be necessary.

   In some instances it may be desirable for some Network Protocols to
   be exempted from sequencing requirements, and if the MRU sizes of
   the link did not cause fragmentation, those protocols could be
   negotiated directly over the member links.

   If there were several member links between two implementations and
   independent sequencing of two protocol sets was desired, but
   blocking of one by the other was not, one could describe two
   multilink procedures by assigning multiple authentication names to
   the same systems.  Each member link, however, would only belong to
   one bundle.

   The following table may help clarify common practice:

             Table 1.  Where Packet Types Are Carried.

        On Member Links              On the Bundle
        _________________________________________________
        LCP (R)                      NP/NCP_a (R)
        Multilink CP (R)             Compression (C)
        Authentication NCP (A)
        Link Quality NCP (H)
        LAPB (H)
        Compression (C)
        NCP/NP_b (E)
        _________________________________________________

   Notes:

   (R)  Required.  (If there are no network nor bridged protocols nor
        compression packets carried, why bother with multilink?)

   (A)  Authentication is waived for reliable out-of-band naming.

   (C)  Compression may be done on the constituent links or on the
        bundle.  Running compression over the bundle may ease memory
        requirements.  Compression usually requires reliable link
        operation.  We indicate the desire to run a common compression
        across links by placing the CCP packets after multilink
        headers.

   (H)  These procedures are helpful in the case of compression, but
        not necessarily required.

   (E)  This case is unusual, but permitted, and could be used to
        exempt traffic from sequencing requirements.

13.  References

   [1]  Leifer, D., Sheldon, S., and Gorsline, B., "A Subnetwork
        Control Protocol for ISDN Circuit-Switching", IPLPDN Working
        Group, Internet Draft (Expired), March 1991.

   [2]  Simpson, W., "The Point-to-Point Protocol (PPP) for the
        Transmission of Multi-protocol Datagrams over Point-to-Point
        Links", Network Working Group, RFC-1331, May 1992.

   [3]  Lloyd, B., and Simpson, W., "PPP Authentication Protocols",
        Network Working Group, RFC-1334.

   [4]  Bradley, T., Brown, C., and Malis, A., "Multiprotocol
        Interconnect over Frame Relay", Network Working Group,
        RFC-1294, January 1992.

   [5]  Malis, A., Robinson, D., and Ullman, R., "Multiprotocol
        Interconnect on X.25 and ISDN in the Packet Mode", Network
        Working Group, RFC-1356, August 1992.

   [6]  International Organisation for Standardization, "HDLC -
        Description of the X.25 LAPB-Compatible DTE Data Link
        Procedures", International Standard 7776, 1988.

   [7]  Simpson, W., "PPP over Frame Relay", Network Working Group,
        work in progress.

   [8]  Simpson, W., "PPP over X.25", Network Working Group, work in
        progress.

14.  Author's Address

   Keith Sklower
   Computer Science Department
   570 Evans Hall
   University of California
   Berkeley, CA 94720

   Phone:  (510) 642-9587
   E-mail: sklower@cs.Berkeley.EDU

15.  Expiration Date of this Draft

   February 15th, 1994