INTERNET DRAFT                                           Marko Kaittola
                                                                  FUNET
                                                           Jul 04, 1993
Expires: January, 1994


                     Mail based file distribution
                   Part 1: Dialog between two nodes

                            Marko Kaittola
                                FUNET

Abstract

   Mapping between X.400 and Internet mail, and X.400 routing, are
   normally done using a table-based approach. In practice the tables
   are ordinary (ASCII) files. In order to function properly the tables
   must be coordinated carefully. One major problem is the lack of
   automated procedures.

   This memo, together with its companion document, proposes one
   possible solution. This memo discusses the transactions between two
   nodes, while the companion document discusses the overall structure.

   The same solution can be used to transport binary files. This way it
   is possible to mirror an entire archive with e-mail-only
   connectivity.

Status of this memo

   This document is an Internet Draft. Internet Drafts are working
   documents of the Internet Engineering Task Force (IETF), its Areas,
   and its Working Groups. Note that other groups may also distribute
   working documents as Internet Drafts.

   Internet Drafts are draft documents valid for a maximum of six
   months. Internet Drafts may be updated, replaced, or obsoleted by
   other documents at any time. It is not appropriate to use Internet
   Drafts as reference material or to cite them other than as a
   ``working draft'' or ``work in progress.''

   Please check the 1id-abstracts.txt listing contained in the
   internet-drafts Shadow Directories on nic.ddn.mil, nnsc.nsf.net,
   nic.nordu.net, ftp.nisc.sri.com, or munnari.oz.au to learn the
   current status of any Internet Draft.

Kaittola                 Expires: January, 1994                 [Page 1]
DRAFT                 File distribution - dialog                   DRAFT

   Distribution of this memo is unlimited.

Table of contents

   Abstract
   Status of this memo
   Table of contents
   1.   Introduction
   1.1. About security
   1.2. Terminology
   2.   Notation
   3.   Parsing
   4.   Dialog
   4.1. Ihave
   4.2. Sendme
   4.3. Data
   4.4. List
   4.5. Ping
   4.6. Pong
   4.7. Contents of data
   5.   Coding and checksums
   5.1. Some background
   5.2. Calculating the checksum
   6.   Security considerations
   References
   Acknowledgements
   Author's address
   Appendix A: BNF summary - initial parsing
   Appendix B: BNF summary - message structure
   Appendix C: BNF summary - data message parsing
   Appendix D: Checksum function by Pasi Ojala

1. Introduction

   This paper defines a dialog between two nodes. [DST-PICTURE] defines
   how this dialog is used. These two documents should be read
   together.

   A third document should also be written. That document [DST-MHS]
   should explain how the other two documents are applied to the needs
   of the X.400 world.

   This paper has grown from a real need: mapping and routing
   information in the international R&D X.400 network must be
   distributed automatically. The most obvious solution would be to use
   FTP, but unfortunately this isn't always possible. Indeed, the only
   common denominator seems to be some kind of e-mail connectivity, so
   this is what the distribution must be based on.

   Naturally, other distribution techniques may (and most probably
   will) be used in addition to this. Using a directory (either DNS or
   X.500) is a natural and attractive alternative. This, however, is
   not yet realistic on a global scale.

   This proposal tries to fulfill the following requirements:

   - Files must be distributed rather quickly. In practice this means
     some minutes from one node to another. Files should reach even
     their final destination within a couple of hours.

   - Transport errors and forged files must be automatically identified
     (and appropriate actions taken).

   - Management must be simple and require very little human effort.

   - Distribution must be based on e-mail.

   This task is simplified by the fact that files are normally public
   in the sense that anyone may fetch and see them.

1.1. About security

   This memo defines a secure way for two nodes to communicate.
   The overall structure is discussed in the companion document
   [DST-PICTURE].

   Security is based on a three-phase dialog. The first phase is the
   initiation of the dialog. It is possible to fake an initiation
   message ("Ihave message"), but doing so doesn't compromise the
   security.

   In the second phase the client sends a key to the server. It is
   assumed that, although the source of an e-mail message is impossible
   to know, its target can be known. Thus the client can be sure that
   the key is sent to a real server that has been pre-configured in its
   database.

   In the third phase the server responds. The authenticity of the
   message is checked using the key defined in the second phase.

   Neither line tapping nor system security is considered. It is
   assumed that anyone who can listen to the traffic on a line or
   become a super user can break the file distribution in any way they
   like.

   Doing the key management as part of the dialog simplifies it
   greatly.

1.2. Terminology

   A client is a node that receives files and a notification about new
   files.

   A server is a node that sends files and a notification about new
   files.

   If there is a bi-directional link between any two nodes they can
   both be clients and servers to each other. The terms client and
   server are relative to a particular pair of nodes.

2. Notation

   |         choice
   []        optional
             end of line (it can be Carriage Return or Line Feed or
               anything, system specific)
   ()        group
             named rule
   *         repeated zero or more times
   *n        repeated at most n times
   m*n       repeated between m and n times
   m*        repeated at least m times
   --        start of comment
             letter (a-z), case-insensitive
             digit (0-9)
             letter (a-z), digit (0-9) or hyphen ("-"),
               case-insensitive
             white space (ie.
               space and tab)
             space (" ")
             any single ASCII character except
   <(>       open parenthesis
   <)>       close parenthesis
   <[>       open bracket
   <]>       close bracket
   anything else
             literal

   Implementations must be case insensitive with one exception: file
   (and directory) names are case sensitive.

3. Parsing

   This chapter explains how the parsing of a message is done. Although
   it is logically done in multiple phases, there is no reason why all
   of these logical phases couldn't in practice be combined into one
   run.

   In general a mail message consists of headers and a body. (And an
   envelope, too. But as an envelope belongs to MTAs and not UAs it can
   safely be ignored in this context.) Parsing starts by stripping the
   headers away.

   In the second phase trailing white space is removed.

   In the third phase comments are removed. A comment is a line that
   starts with a hash (#), or formally

   ::=
      # *
      -- A line starting with a "#" is
      -- a comment.

   In the fourth phase empty lines are removed.

   In the fifth phase the folding of lines is undone. A logical line
   may be folded into multiple physical lines. In practice this will
   occur if the original (logical) line is considered to be too long. A
   folded line is formally defined like this:

   ::=
      1* ( \ 1* 1* )1*
      -- Note that there are no
      -- insignificant spaces before
      -- the backslash. If there is a space
      -- it is significant. Spaces in front
      -- of the continuation line(s) are
      -- insignificant and will be removed
      -- while putting the physical lines
      -- together.

   For example:

      This is an \
          example on \
          how li\
      nes can be folded.

   produces

      This is an example on how lines can be folded.

   As can be seen, a line can be folded even in the middle of a word.

   Finally, when all of this is done the "real" parsing starts (in the
   sixth phase). This is defined in chapter four.
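   The first five phases can be sketched in a few lines of Python. This
   is only an illustration of the phases described above, not part of
   the memo: the function name preprocess is hypothetical, and the
   sketch assumes that the headers end at the first empty line and that
   only spaces and tabs count as white space.

```python
def preprocess(message: str) -> list[str]:
    """Turn a raw mail message into logical lines (phases 1-5)."""
    lines = message.splitlines()

    # Phase 1: strip the headers (assumed to end at the first empty line).
    try:
        start = lines.index("") + 1
    except ValueError:
        start = len(lines)          # no body at all
    body = lines[start:]

    # Phase 2: remove trailing white space.
    body = [line.rstrip(" \t") for line in body]

    # Phase 3: remove comments (lines starting with a hash).
    body = [line for line in body if not line.startswith("#")]

    # Phase 4: remove empty lines.
    body = [line for line in body if line]

    # Phase 5: undo folding. A trailing backslash continues the line;
    # leading white space of the continuation line is insignificant.
    logical: list[str] = []
    pending = None
    for line in body:
        if pending is not None:
            line = pending + line.lstrip(" \t")
            pending = None
        if line.endswith("\\"):
            pending = line[:-1]
        else:
            logical.append(line)
    if pending is not None:         # dangling backslash on the last line
        logical.append(pending)
    return logical


example = (
    "From: server@example\n"
    "\n"
    "# a comment\n"
    "This is an \\\n"
    "    example on \\\n"
    "    how li\\\n"
    "nes can be folded.\n"
)
print(preprocess(example))
```

   Running the sketch on the folding example prints a single logical
   line, which is what the sixth phase would then receive.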
   Please note that chapter four only deals with the sixth phase and
   the logical lines.

4. Dialog

   There are normally three phases in a normal
   ("ihave-sendme-data_or_command") dialog, but it can also start from
   phase 2. Phases 2 and 3 can also be replaced by using FTP. Doing the
   dialog in three phases helps to keep it secure, but it also makes
   the distribution on a large network (relatively) fast and it saves
   network bandwidth. Graphically it can be presented as in picture 1:

   SERVER                                               CLIENT

   Send a notification                  Receive a notification
        |----------------- 1: IHAVE ------------------->|

   Receive a request                            Send a request
        |<---------------- 2: SENDME -------------------|

   Send files or                              Receive files or
   command                                             command
        |------------ 3: DATA or COMMAND -------------->|

                             Picture 1

   Phase 1: The server sends a message to a client informing it about
            the availability of new files or commands. There is no
            authentication in this phase, although the client may try
            to see if the notification seems to come from an approved
            source.

   Phase 2: The client sends a request to the server. In this request
            the client specifies a key that the server must use when it
            replies. The client also stores the key in its local
            database.

   Phase 3: The server responds to the client. Usually this means
            sending the requested files, but occasionally the server
            may report that the files are not available. If the client
            is requesting a command and it is available, the server
            responds with the requested command. When the client
            receives a reply it checks the validity of that submission
            based on the key specified in phase 2.

   There are also variations of the dialog ("ping-pong" and
   "list-data"). These are presented in picture 2.
   SERVER                                               CLIENT

   Receive a ping message                  Send a ping message
        |<----------------- 1: PING --------------------|

   Send a ping reply                      Receive a ping reply
        |------------------ 2: PONG ------------------->|


   Receive a list request                  Send a list request
        |<----------------- 1: LIST --------------------|

   Send a listing                            Receive a listing
        |--------------- 2: DATA LIST ----------------->|

                             Picture 2

   Phase 1 (ping): The client sends a ping message for test purposes.

   Phase 2 (pong): The server responds with a pong message. There is no
                   authorization in this dialog (but naturally there is
                   validation).

   Phase 1 (list): The client sends a list message to receive a listing
                   of available files.

   Phase 2 (data): The server responds with a data message. This
                   message contains the requested listing or an error
                   message. As in a pong message, there is no
                   authorization but there is validation.

   Doing the distribution in one phase (just sending files) has the
   following important drawbacks:

   - If there are loops in the distribution a client will receive
     multiple copies of the same file. If the file is big this is a
     serious waste of capacity.

   - Security must be based on keys that have been set up by hand and
     that are managed by hand. This is an administrative headache.

   - It is not possible to request a missing file.

   Doing the distribution in two phases (just requesting a file and
   waiting for it) has the following important drawbacks:

   - It is a slow way to distribute information.

   - If there are loops in the distribution a client will receive
     multiple copies of the same file. If the file is big this is a
     serious waste of capacity.

   - It isn't possible to quickly send a file to replace an incorrect
     version of it.

4.1. Ihave

   The server sends an IHAVE message to inform the recipient(s) about
   the file(s) or command(s) it has received or made available.
   The following information will be present:

   - file name
   - version number
   - name of distributor

   It should be noted that although file names and directory names are
   case sensitive, not all systems support this. The following
   conventions are strongly suggested:

   - File names are written in small letters.
   - Directory names are written in capital letters.

   It should also be noted that a special directory "Cmds" with its
   subdirectories is reserved for transmitting commands. All of its
   subdirectories must be written in capital letters.

   Naturally, how files and directories are stored locally is a local
   matter.

   For example:

      # Directory/name
      IHAVE: FILE TXT COSINE-MHS/mapping-1
      # Version number is YYMMDD-hhmmss
      VERSION: 930317-121303
      FTP: nic.switch.ch:/cosine-mhs/mappings/mapping-1;\
           anonymous;user@domain
      # RFC-addr between a pair of "<>", OR-addr using /-notation.
      IAM: or

      # Command
      IHAVE: CMD Cmds/COSINE-MHS/new-file
      VERSION: 930424-020127
      IAM: /S=cosine-server/OU=cosine-mhs/O=switch/PRMD=SWITCH/\
           ADMD=ARCOM/C=CH/

   Formal definition:

   ::=                              -- IHAVE message
      ( [] )1*

   ::=
      IHAVE: * ( FILE ( TXT | BINARY ) | CMD ) *
      -- Commands are given in a special
      -- file

   ::=
      *
      -- File name is a symbolic name,
      -- although it looks like a Unix
      -- file name.

   ::=
      (|_)*14 /
      -- for example: "COSINE-MHS/"
      -- This is case-sensitive, but it
      -- is suggested that capital letters
      -- are used for a directory whenever
      -- possible. A special directory
      -- name "Cmds" (and its
      -- subdirectories) is reserved for
      -- commands. It is explicitly
      -- forbidden to store any (normal)
      -- files there.

   ::=
      (|_)*14 [ . ( | _ )1*14 ]
      -- For example: "mapping-1" or
      -- "a_long_file.extension". This is
      -- case-sensitive, but small
      -- letters are suggested for a file
      -- name whenever possible.

   ::=
      VERSION: *

   ::=
      6*6 - 6*6
      -- This must be YYMMDD-hhmmss.
      -- No timezone is specified. It is
      -- assumed to be local to the
      -- node that issues the version
      -- number.

   ::=
      FTP: * : * ; * * ; *
      -- FTP is one more option for getting
      -- a file. This line doesn't have to
      -- be present. Even if it is present
      -- the normal method (mail) must also
      -- be available.

   ::=
      ( ( . )1* ) | <[> 1*3 ( . 1*3 )3*3 <]>
      -- Look at [RFC-952] and
      -- [RFC-1123] to see the source of
      -- this definition.

   ::=
      ( | ) 1*

   ::=
      1*
      -- The name of the file to be fetched
      -- using FTP. Even a semicolon may
      -- be part of it.

   ::=                              -- Username to be given.
      1*

   ::=                              -- Password to be given.
      1*
      -- There are two reserved passwords:
      -- "user@domain" is intended for
      -- those anonymous FTP servers that
      -- require a password consisting of
      -- real username and domainname to
      -- be used. It should be replaced
      -- with actual data. "SECRET" means
      -- that password has to be known by
      -- some other means.

   ::=
      IAM: *

   ::=
      | | ( * )
      -- Also some other kind of address
      -- could be used. However, it must
      -- neither start nor end with "//"
      -- or "<>".
      -- Listing both RFC-address and OR
      -- address might make sense as the
      -- sender doesn't always know
      -- which one is the more appropriate
      -- form.

   ::=
      -- Normal, valid RFC-822 address
      -- enclosed in "<>" pair.

   ::=
      -- Normal valid OR address using
      -- "/ notation" not enclosed in
      -- "<>" pair.

4.2. Sendme

   In a SENDME message the recipient asks for one or more files or
   commands. A key is given in the request. It is possible to request
   either a specific version or the latest version.

   For example:

      SENDME: FILE COSINE-MHS/mapping-1
      # VERSION is either
      #   newest    for newest available version
      #   ihave #   I have version #; send me the newest if it
      #             isn't this (or smaller)
      #   #         send me version #
      VERSION: newest
      # Don't bother compressing before sending
      COMPRESSION: NONE
      # Data size must not be more than 60 kb in a message.
      # If there is more data then it must be split into multiple
      # messages.
      MAXSIZE: 60
      IAM:
      KEY: 1234567890abcdefghij
      SERIAL: 123

   Formal definition:

   ::=                              -- SENDME message
      ( )1*

   ::=                              -- SENDME line
      SENDME: * ( FILE | CMD ) *

   ::=
      COMPRESSION: * NONE | ( IS * ) | ( CAN * [ ; * ] )
      -- In sendme-message compression can
      -- be either NONE or it can be a
      -- list of possible compressions
      -- (CAN alternative). In data-
      -- message it can be either NONE
      -- or the name of used compression
      -- (IS-alternative). The idea is
      -- that when a message is being
      -- requested it is possible to
      -- specify all of the possible
      -- compressions. The server may
      -- pick one of them, or it may use
      -- no compression at all.

   ::=
      1*15
      -- The name of the used compression.
      -- This can either be a bilaterally
      -- used name or a name that is
      -- registered somewhere (IANA?).
      -- The idea is to compress the
      -- data so that the transport takes
      -- less time. A good candidate is
      -- GNU-zip.

   ::=
      MAXSIZE: * 1*
      -- This is the maximum size (in kb)
      -- of data one message may contain.
      -- a is counted as two bytes.
      -- If its value is zero then there
      -- is no maximum size set. A server
      -- may use a smaller limit than what
      -- is set here, but it must not
      -- send a message with a data part
      -- bigger than the requested maximum
      -- size. (The actual size of the
      -- message is more than the size of
      -- data. However, the size of the
      -- data is a dominant factor.)

   ::=                              -- KEY line
      KEY: *

   ::=                              -- signature key
      10*20

   ::=                              -- SERIAL line
      SERIAL: *

   ::=                              -- serial number
      1*10
      -- Serial number consists of
      -- up to ten digits. It is
      -- intended that whenever a
      -- new request is sent the
      -- serial number is
      -- incremented by one.
      -- However, an implementation
      -- may choose to use it
      -- differently.

4.3. Data

   The distributor sends the file(s) in a DATA message to the
   recipient.
   For example:

      DATA: FILE TXT COSINE-MHS/mapping-1
      VERSION: 930317-121303
      PATH:
      COMPRESSION: NONE
      CHECK: 123 USED
      PART: 1 of 3
      ---------- start cosine-mhs/mapping-1 ----------
      ---------- end cosine-mhs/mapping-1 ----------
      IAM:
      KEY: 1234567890abcdefghij
      SERIAL: 123
      REPLY: + Positive

   Formal definition:

   ::=                              -- DATA message
      (
      1*
      )*
      -- If this block is missing
      -- then reply at K-line must
      -- be set to negative.

   ::=                              -- DATA line
      DATA: * ( FILE ( TXT | BINARY ) | CMD | LIST [RECURSIVE] ) *

   ::=                              -- PATH line
      PATH: * ( IGNORE | )
      -- There can be more than one PATH
      -- line. Basically, the IAM-line for
      -- every node that the message has
      -- passed through should be listed.
      -- A new line will always be placed
      -- before any other PATH line.
      -- If there is more than one PATH
      -- line the last line can contain
      -- a keyword "IGNORE", which means
      -- that one or more PATH lines have
      -- been removed. This is intended
      -- for situations where the actual
      -- path a file has taken isn't so
      -- important to know.

   ::=
      PART: * 1* *of *1*
      -- This is needed for multipart
      -- messages. (Multipart messages
      -- are needed if such a big file
      -- has been requested that it can't
      -- be sent in one message.) If the
      -- message isn't multipart then
      -- PART: 1 of 1 is used.

   ::=
      ---------- start ----------
      -- Although spacing is defined to be
      -- strictly like this, when
      -- receiving a message the amount of
      -- spaces must be treated as being
      -- insignificant. The same is true
      -- for the as well.

   ::=
      -- Base64 encoded file to be
      -- transmitted; as Base64 doesn't
      -- use a hyphen there is no danger
      -- of confusion.
      -- In addition to Base64 a Hamming
      -- coding can be used to calculate
      -- checksums for each line. The
      -- coding is adjusted to notice if
      -- lines are duplicated or lost.

   ::=
      ---------- end ----------
      -- Look at the comment for the
      -- .
   ::=
      CHECK: * ( USED | NONE )
      -- If USED then Hamming coding has
      -- been used to calculate a checksum
      -- for every line of data.

   ::=
      -- Checksum is simply the number of
      -- data lines (or 33+2 byte blocks
      -- if Hamming coding is used).
      -- It is used to find out if there
      -- are any lines missing (or
      -- duplicated). If Hamming coding
      -- is used, this is used to detect
      -- blocks that are missing from the
      -- end.

   ::=                              -- REPLY line
      REPLY: * *

   ::=                              -- Status of reply
      +|-
      -- Plus or minus, positive or
      -- negative

   ::=
      (|)*
      -- A free-form explanation, can be
      -- long if line continuation is
      -- used. If it contains the string
      -- "\n" it is to be understood as
      -- an end of line.

   One of the following explanations must be present at :

   - Positive. Requested file(s) are sent.

   - Validation failure. A file may or may not exist locally, but the
     recipient is not served.

   - File doesn't exist. The requested file is not available.

   - Too new version. File exists, but the version being requested is
     too new (doesn't exist).

   - Version not available. A newer version exists locally, but the
     requested version doesn't.

   - Incorrect request. Something else went wrong.

   If the requested files are sent the reply status is set to positive,
   otherwise it is set to negative. The answer will contain the
   information specified above (for logging purposes) and possibly some
   locally defined explanation. A separate message is required for
   every request that fails.

4.4. List

   The client requests a listing of files by sending a LIST message. An
   example follows.

      # Directory
      LIST: COSINE-MHS/ TMP/listing
      # Data compression (if used)
      COMPRESSION: NONE
      # Max size of data in reply
      MAXSIZE: 60
      # RFC-addr between a pair of "<>", OR-addr using /-notation.
      IAM:
      # Key
      KEY: 1234567890abcdefghij
      # Serial
      SERIAL: 123

   The reply to a LIST message is not an IHAVE message.
   The reply is generally not intended to be used to request files,
   although it could be used in that way, too.

   Formal definition:

   ::=                              -- LIST message

   ::=
      LIST: 1* [1* RECURSIVE]
      -- Request either a list of files in
      -- a directory or a recursive
      -- listing. The result is sent back
      -- in .

4.5. Ping

   A ping message is used to test connectivity to the server. The
   server is expected to reply with a PONG message.

   For example:

      PING
      IAM:
      KEY: 1234567890absdefghij
      SERIAL: 123

   Formal definition:

   ::=

   ::=
      PING

4.6. Pong

   A pong message is a reply to a ping message.

   For example:

      PONG
      IAM:
      KEY: 1234567890absdefghij
      SERIAL: 123
      GREETING: Greetings from MHS coordination server!

   Formal definition:

   ::=

   ::=
      PONG

   ::=
      GREETING: * *
      -- Greeting is an informal text
      -- that can be presented to the
      -- human that uses the ping client
      -- to test the server.
      -- Line continuation is to be used
      -- if all of the text can't be fit
      -- into one line. Use "\n" to
      -- represent a line break.

4.7. Contents of data

   Normally a data message carries a (set of) file(s) that can be
   either text files or binary files. These are encoded using Base64 as
   defined in [RFC-1341] (pages 17-19). Commands and listings are also
   carried as normal files.

   All these cases are distinguished by which of the keywords "FILE",
   "CMD" and "LIST" is used. The keyword "FILE" is followed by "TXT" or
   "BINARY" to describe the type of the file.

   A command file could look like this:

      SOURCE: (MHS server)
      APPROVED-BY: (MHS team)
      CREATE: cosine-mhs/mapping-notes-1
      CREATE: cosine-mhs/mapping-notes-2
      DELETE: cosine-mhs/mapping-notes

   Formally this can be defined like this:

   ::=
      1*
      1*

   ::=
      SOURCE: * * <(> <)>
      -- This is the source of this
      -- command file.
      -- It includes the
      -- e-mail address of the source as
      -- well as a human readable name for
      -- it.

   ::=
      1*
      -- This is the human readable name
      -- associated with an e-mail
      -- address.

   ::=
      APPROVED-BY: * * <(> <)>
      -- The syntax is very much like in
      -- . However, they have
      -- different semantic meanings.
      -- A tells who
      -- originally created the command.
      -- A new is created every
      -- time a message is checked by
      -- intermediate nodes. Checking (by
      -- human staff) is mandatory when
      -- a command for creating or
      -- deleting a file is sent. It is
      -- strongly suggested that checking
      -- will be carried out whenever
      -- commands are executed. Transit
      -- nodes can do checking but they
      -- don't have to.

   ::=
      CREATE | ( DELETE [ 1 RECURSIVE ] ) : ( FILE * ) | ( DIR * )
      -- CREATE is used to create either
      -- a new file or a new directory.
      -- DELETE is used to delete a file
      -- or a directory. Unless RECURSIVE
      -- is specified a directory to be
      -- deleted must be empty.

   When sending a listing in a file it looks either like this (normal
   listing):

      [cosine-mhs/mapings]
      [FILE] mapping-1
      [FILE] mapping-2
      [FILE] mapping-gate
      [DIR] old
      [FILE] info
      [DIR] tools

   Or it looks like this (recursive listing):

      [cosine-mhs/mapings]
      [FILE] mapping-1
      [FILE] mapping-2
      [FILE] mapping-gate
      [DIR] old
         [FILE] mapping-1
         [FILE] mapping-2
         [FILE] mapping-gate
      [RID]
      [FILE] info
      [DIR] tools
         [FILE] generate
         [DIR] data
            [FILE] countries
         [RID]
         [FILE] check
      [RID]

   As can be seen, the level of indentation is used to visually
   indicate which files belong to which directory. The formal
   definition is like this:

   ::=
      *

   ::=
      ( ) * ( <[> ( FILE | DIR ) <]> * ) | ( <[> RID <]> )
      -- Spaces at the beginning of lines
      -- are used to denote the directory
      -- structure. They are not
      -- interpreted while parsing a
      -- message but used to help a human
      -- reader understand the structure.
      -- A directory is started with a
      -- [DIR] and closed with a [RID].
      -- A sophisticated user agent could
      -- hide [RID] lines from a human, as
      -- they are only intended to ease
      -- the parsing. [RID]-lines are not
      -- needed for a non-recursive message.

5. Coding and checksums

   When data is transmitted it is always encoded, either using normal
   Base64 as defined in [RFC-1341], or using modified Base64.

   When using the normal Base64 a line length of 76 is to be used when
   sending a message. However, when receiving a message an arbitrary
   (and possibly varying) line length is to be accepted.

   The modified Base64 is actually a mixture of Base64 and a Hamming
   code. The Hamming coding is used for error detection. It could be
   used for error correction as well, but it is expected that errors
   are rare but severe, so when an error is found the data is requested
   to be sent again.

5.1. Some background

   The use of normal Base64 alone is defined well enough in [RFC-1341].

   The checksum code used is a (91,88) base-9 Hamming code. This means
   that three base-9 symbols are generated for every 88 base-9 symbols.
   One base-9 symbol can fully represent three bits, thus 33 bytes
   (88 * 3 = 264 bits) are processed to generate the checksum. Coding
   those 33 bytes to Base64 symbols makes 44 symbols. The checksum
   (3 base-9 symbols) is represented with 2 Base64 symbols, thus
   encoding 33 bytes of original data totals up to a block of 46 Base64
   symbols.

   If the checksum is used a line length of 46 is to be used when
   encoding data. As usual, any line length is to be accepted when
   decoding data.

   All arithmetic is performed in remainder class 9. An analysis of
   remainder class 9 is not provided here, but for example

      2 * 7 = (14 modulo 9) = 5
      8 + 1 = (9 modulo 9) = 0

   The input stream is processed 33 bytes at a time. If there are less
   than 33 bytes left to process the block is zero-padded for the
   checksum calculation. This padding is internal to the checksum
   calculation.

5.2. Calculating the checksum

   When calculating the checksum the data is handled in three-byte (or
   24-bit) groups, like this:

      Bytes    000000001111111122222222
      Base64   000000111111222222333333
      Base-9   777666555444333222111000

   Three bytes can be split into four Base64 symbols or eight three-bit
   groups that are interpreted as base-9 symbols. 33 bytes generate 11
   3-byte groups, 44 Base64 symbols and 88 base-9 symbols.

   The checksum calculation is a simple matrix multiplication. The
   88-element base-9 vector is multiplied by an 88x3 matrix and the
   result is a 3-element vector. Because all calculations are done in
   remainder class 9, only values from zero to eight are present in the
   matrix and the results will also be from zero to eight. The reversed
   order of the base-9 symbols is selected to help to develop as
   efficient an implementation as possible.

   To generate a checksum vector c one multiplies data vector d with a
   generator matrix G this way:

      c = d * G

   The checksum vector c is then converted into two Base64 symbols as
   follows:

      Checksum 000011112222   (byte positions)
      Base64   000000111111

   Four bits are needed to represent one base-9 symbol. Three base-9
   symbols total up to 12 bits, which can be represented in two Base64
   symbols.
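   The bit layouts above can be sketched in Python as follows. This is
   a hedged illustration, not the implementation referred to in
   Appendix D. The function names are hypothetical, and the order in
   which the eleven 3-byte groups are concatenated into the 88-element
   vector is an assumption (group by group, symbol 0 of each group
   first). The generator matrix is passed in as a parameter.

```python
B64_ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def block_to_symbols(block: bytes) -> list[int]:
    """Split a block of up to 33 bytes into 88 three-bit symbols."""
    if len(block) > 33:
        raise ValueError("a block holds at most 33 bytes")
    padded = block.ljust(33, b"\0")       # zero-padding is internal only
    symbols: list[int] = []
    for start in range(0, 33, 3):
        value = int.from_bytes(padded[start:start + 3], "big")
        # Symbol 0 is the lowest three bits, symbol 7 the highest
        # (the reversed order shown in the Base-9 row above).
        symbols.extend((value >> (3 * i)) & 0b111 for i in range(8))
    return symbols


def checksum_vector(symbols, generator):
    """Compute c = d * G in remainder class 9 (G has 88 rows of 3)."""
    if len(symbols) != 88 or len(generator) != 88:
        raise ValueError("expected 88 symbols and 88 matrix rows")
    return tuple(
        sum(s * row[k] for s, row in zip(symbols, generator)) % 9
        for k in range(3)
    )


def pack_checksum(c) -> str:
    """Pack three base-9 symbols, four bits each, into two Base64 symbols."""
    c0, c1, c2 = c
    if not all(0 <= s <= 8 for s in (c0, c1, c2)):
        raise ValueError("base-9 symbols must be in the range 0..8")
    bits = (c0 << 8) | (c1 << 4) | c2     # 12 bits: 0000 1111 2222
    return B64_ALPHABET[bits >> 6] + B64_ALPHABET[bits & 0x3F]


# A toy matrix used only to exercise the code; it is NOT the memo's G.
TOY_MATRIX = [(1, 0, 0)] * 88
print(pack_checksum(checksum_vector(block_to_symbols(b"abc"), TOY_MATRIX)))
```

   Keeping the matrix as a parameter lets the same functions be used
   with the memo's generator matrix once it has been typed in as 88
   rows of three values.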
   The generator matrix G is:

      Rows 1-22    Rows 23-44    Rows 45-66    Rows 67-88

        0 1 1        1 1 4         1 3 8         1 6 3
        0 1 2        1 1 5         1 4 0         1 6 4
        0 1 3        1 1 6         1 4 1         1 6 5
        0 1 4        1 1 7         1 4 2         1 6 6
        0 1 5        1 1 8         1 4 3         1 6 7
        0 1 6        1 2 0         1 4 4         1 6 8
        0 1 7        1 2 1         1 4 5         1 7 0
        0 1 8        1 2 2         1 4 6         1 7 1
        0 3 1        1 2 3         1 4 7         1 7 2
        0 3 2        1 2 4         1 4 8         1 7 3
        1 0 1        1 2 5         1 5 0         1 7 4
        1 0 2        1 2 6         1 5 1         1 7 5
        1 0 3        1 2 7         1 5 2         1 7 6
        1 0 4        1 2 8         1 5 3         1 7 7
        1 0 5        1 3 0         1 5 4         1 7 8
        1 0 6        1 3 1         1 5 5         1 8 0
        1 0 7        1 3 2         1 5 6         1 8 1
        1 0 8        1 3 3         1 5 7         1 8 2
        1 1 0        1 3 4         1 5 8         1 8 3
        1 1 1        1 3 5         1 6 0         1 8 4
        1 1 2        1 3 6         1 6 1         1 8 5
        1 1 3        1 3 7         1 6 2         1 8 6

   However, calculating the checksum in this way is only the first
   stage: at a second stage the checksum of the previous line (or a
   zero-vector if the first line is being processed) is added to the
   checksum vector. This addition is done in remainder class 9 as a
   vector operation, so there won't be any problems with overflow.

   The checksum value that is present in the CHECK line is the number
   of 33-byte blocks, or (if no Hamming coding is used) the number of
   Base64 lines.

6. Security considerations

   Security is based on keys that are sent with the request and copied
   in a reply. This gives protection against forged messages. It
   doesn't work if an Ethernet is tapped or some relaying machine is
   cracked. However, this is considered to be such an extreme situation
   that such a cracker could in any case cause a great deal of trouble.
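   The key check described above could be sketched on the client side
   roughly as follows. This is an assumption-laden illustration: the
   class PendingRequests and its methods are hypothetical names, keys
   are assumed to consist of letters and digits (as in the examples),
   and the comparison is case-insensitive because only file and
   directory names are case sensitive.

```python
import secrets
import string

# Assumed key alphabet; the examples in this memo use letters and digits.
KEY_ALPHABET = string.ascii_lowercase + string.digits


class PendingRequests:
    """Remembers the key sent with each request, indexed by serial number."""

    def __init__(self) -> None:
        self._keys: dict[int, str] = {}
        self._serial = 0

    def new_request(self) -> tuple[int, str]:
        """Allocate the next serial number and a fresh 20-character key."""
        self._serial += 1
        key = "".join(secrets.choice(KEY_ALPHABET) for _ in range(20))
        self._keys[self._serial] = key
        return self._serial, key

    def reply_is_valid(self, serial: int, key: str) -> bool:
        """Accept a reply only if it copies the key stored for that serial."""
        expected = self._keys.get(serial)
        if expected is None or not key.isascii():
            return False
        return secrets.compare_digest(expected, key.lower())


pending = PendingRequests()
serial, key = pending.new_request()
print(pending.reply_is_valid(serial, key))
```

   The stored key is deliberately kept after a successful check, since
   a multipart reply (PART: 1 of 3 and so on) would presumably repeat
   the same key in every part.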
References

   [RFC-952]        RFC 952 (DOD Internet Host Table Specification)

   [RFC-1123]       RFC 1123 (Requirements for Internet Hosts --
                    Application and Support)

   [RFC-1341]       RFC 1341 (MIME (Multipurpose Internet Mail
                    Extensions))

   [DST-STRUCTURE]  Mail based file distribution - Part 2: Over-all
                    structure

   [DST-MHS]        One more companion document to be written
                    (informational)

Acknowledgements

   Various people have contributed to this document. It is impossible
   to list everyone here. However, I'd like to give special thanks to
   the following:

   Urs Eppenberger, Allan Cargille, Harald Tveit Alvestrand, Paul Andre
   Pays and Jim Romaquera from RARE WG1 / RARE WG-MSG / COSINE MHS
   managers / IETF X.400 ops have contributed and kept me going.

   Pasi Ojala wrote the first implementation while I wrote this paper.
   He also suggested many improvements. He developed the approach used
   in Hamming coding.

   Keijo Ruohonen helped with the Hamming coding.

Author's address

   Marko Kaittola
   FUNET
   c/o Tampere University of Technology
   Software Systems Laboratory
   P.O. Box 553
   SF-33101 Tampere
   Finland

   E-mail: Marko.Kaittola@funet.fi
           G=Marko; S=Kaittola; O=funet; A=fumail; C=fi;

Appendix A: BNF summary - initial parsing

   ::=
      # *
      -- A line starting with a "#" is
      -- a comment.

   ::=
      1* ( \ 1* 1* )1*
      -- Note that there are no
      -- insignificant spaces before
      -- the backslash. If there is a space
      -- it is significant. Spaces in front
      -- of the continuation line(s) are
      -- insignificant and will be removed
      -- while putting the physical lines
      -- together.

Appendix B: BNF summary - message structure

   ::=
      | | ( * )
      -- Also some other kind of address
      -- could be used. However, it must
      -- neither start nor end with "//"
      -- or "<>".
        -- Listing both an RFC address and an OR address might make
        -- sense, as the sender doesn't always know which form is
        -- more appropriate.

::= CHECK: * ( USED | NONE )
        -- If USED then Hamming coding has been used to calculate a
        -- checksum for every line of data.

::=
        -- Checksum is simply the number of data lines (or 33+2
        -- byte blocks if Hamming coding is used). It is used to
        -- find out if any lines are missing (or duplicated). If
        -- Hamming coding is used, it serves to detect blocks that
        -- are missing from the end.

::= COMPRESSION: * NONE | ( IS * ) | ( CAN * [ ; * ] )
        -- In a sendme-message compression can be either NONE or a
        -- list of possible compressions (CAN alternative). In a
        -- data-message it can be either NONE or the name of the
        -- compression used (IS alternative). The idea is that
        -- when a message is being requested it is possible to
        -- specify all of the possible compressions. The server
        -- may pick one of them, or it may use no compression at
        -- all.

::= 1*15
        -- The name of the compression used. This can be either a
        -- bilaterally used name or a name that is registered
        -- somewhere (IANA?). The idea is to compress the data so
        -- that the transport takes less time. A good candidate is
        -- GNU-zip.

::= DATA: * ( FILE ( TXT | BINARY ) | CMD | LIST [RECURSIVE] ) *
        -- DATA line

::= ( 1* )*
        -- DATA message
        -- If the parenthesized block is missing then reply at
        -- K-line must be set to negative.

::= (|_)*14 /
        -- For example: "COSINE-MHS/"
        -- This is case-sensitive, but it is suggested that capital
        -- letters are used for a directory whenever possible. A
        -- special directory name "Cmds" (and its subdirectories)
        -- is reserved for commands. It is explicitly forbidden to
        -- store any (normal) files there.

::= SERIAL: *
        -- SERIAL line

::= ---------- end ----------
        -- Look at the comment for the start line.

::= (|)*
        -- A free-form explanation, can be long if line
        -- continuation is used. If it contains the string "\n" it
        -- is to be understood as an end of line.

::=
        -- Base64 encoded file to be transmitted. As Base64 doesn't
        -- use a hyphen there is no danger of confusion.
        -- In addition to Base64, a Hamming coding can be used to
        -- calculate checksums for each line. The coding is
        -- adjusted to notice if lines are duplicated or lost.

::= *
        -- File name is a symbolic name, although it looks like a
        -- Unix file name.

::= (|_)*14 [ . ( | _ )1*14 ]
        -- For example: "mapping-1" or "a_long_file.extension".
        -- This is case-sensitive, but small letters are suggested
        -- for a file name whenever possible.

::= 1*
        -- The name of the file to be fetched using FTP. Even a
        -- semicolon may be part of it.

::= FTP: * : * ; * * ; *
        -- FTP is one more option for getting a file. This line
        -- doesn't have to be present. Even if it is present, the
        -- normal method (mail) must also be available.

::= GREETING: * *
        -- Greeting is an informal text that can be presented to
        -- the human who uses the ping client to test the server.
        -- Line continuation is to be used if all of the text does
        -- not fit into one line. Use "\n" to represent a line
        -- break.

::= ( ( . )1* ) | <[> 1*3 ( . 1*3 )3*3 <]>
        -- Look at [RFC-952] and [RFC-1123] to see the source of
        -- this definition.

::= ( | ) 1*

::= IHAVE: * ( FILE ( TXT | BINARY ) | CMD ) *
        -- Commands are given in a special file

::= ( [] )1*
        -- IHAVE message

::= IAM: *

::= KEY: *
        -- KEY line

::= 10*20
        -- signature key

::= LIST: 1* [1* RECURSIVE]
        -- Request either a list of files in a directory or a
        -- recursive listing. The result is sent back in a DATA
        -- message.

::=
        -- LIST message

::= MAXSIZE: * 1*
        -- This is the maximum size (in kb) of data one message may
        -- contain. A line break is counted as two bytes. If its
        -- value is zero then there is no maximum size set. A
        -- server may use a smaller limit than what is set here,
        -- but it must not send a message with a data part bigger
        -- than the requested maximum size. (The actual size of
        -- the message is more than the size of the data. However,
        -- the size of the data is the dominant factor.)

::=
        -- Normal valid OR address using "/ notation" not enclosed
        -- in "<>" pair.

::= PATH: * ( IGNORE | )
        -- PATH line
        -- There can be more than one PATH line. Basically, the
        -- IAM-line for every node that the message has passed
        -- through should be listed. A new line will always be
        -- placed before any other PATH line. If there is more
        -- than one PATH line the last line can contain a keyword
        -- "IGNORE", which means that one or more PATH lines have
        -- been removed. This is intended for situations where the
        -- actual path a file has taken isn't so important to
        -- know.

::= PART: * 1* *of *1*
        -- This is needed for multipart messages. (Multipart
        -- messages are needed if such a big file has been
        -- requested that it can't be sent in one message.) If the
        -- message isn't multipart then PART: 1 of 1 is used.

::= 1*
        -- Password to be given. There are two reserved passwords:
        -- "user@domain" is intended for those anonymous FTP
        -- servers that require a password consisting of real
        -- username and domainname to be used. It should be
        -- replaced with actual data. "SECRET" means that the
        -- password has to be known by some other means.

::= PING

::=

::= PONG

::=

::= REPLY: * *
        -- REPLY line

::= +|-
        -- Status of reply
        -- Plus or minus, positive or negative

::=
        -- Normal, valid RFC-822 address enclosed in "<>" pair.

::= SENDME: * ( FILE | CMD ) *
        -- SENDME line

::= ( )1*
        -- SENDME message

::= 1*10
        -- serial number
        -- Serial number consists of up to ten digits. It is
        -- intended that whenever a new request is sent the serial
        -- number is incremented by one. However, an
        -- implementation may choose to use it differently.

::= ---------- start ----------
        -- Although spacing is defined to be strictly like this,
        -- when receiving a message the amount of spaces must be
        -- treated as being insignificant. The same is true for
        -- the end line as well.

::= 1*
        -- Username to be given.

::= VERSION: *

::= 6*6 - 6*6
        -- This must be YYMMDD-hhmmss. No timezone is specified. It
        -- is assumed to be local to the node that issues the
        -- version number.

Appendix C: BNF summary - data message parsing

::= APPROVED-BY: * * <(> <)>
        -- The syntax is very much like in the SOURCE line.
        -- However, they have different semantic meanings. A
        -- SOURCE line tells who originally created the command. A
        -- new APPROVED-BY line is created every time a message is
        -- checked by intermediate nodes. Checking (by human
        -- staff) is mandatory when a command for creating or
        -- deleting a file is sent. It is strongly suggested that
        -- checking is carried out whenever commands are executed.
        -- Transit nodes can do checking but they don't have to.

::= 1* 1*

::= CREATE | ( DELETE [ 1 RECURSIVE ] ) : ( FILE * ) | ( DIR * )
        -- CREATE is used to create either a new file or a new
        -- directory. DELETE is used to delete a file or a
        -- directory. Unless RECURSIVE is specified a directory to
        -- be deleted must be empty.

::= 1*
        -- This is the human readable name associated with an
        -- e-mail address.

::= ( ) * ( <[> ( FILE | DIR ) <]> * ) | ( <[> RID <]> )
        -- Spaces at the beginning of lines are used to denote the
        -- directory structure. They are not interpreted while
        -- parsing a message but used to help a human reader
        -- understand the structure. A directory is started with
        -- a [DIR] and closed with a [RID]. A sophisticated user
        -- agent could hide [RID] lines from a human, as they are
        -- only intended to ease the parsing. [RID] lines are not
        -- needed for a non-recursive message.

::= *

::= SOURCE: * * <(> <)>
        -- This is the source of this command file. It includes
        -- the e-mail address of the source as well as a human
        -- readable name for it.

Appendix D: Checksum function by Pasi Ojala

long calcsum(int size,      /* number of elements, 0-11 */
             long *buf)     /* 24-bit values */
{
    int i, j, a, index = 0;
    static int cs0 = 0, cs1 = 0, cs2 = 0; /* Initialize to 0 only once */
    long value;

    /* Matrix columns */
    static char a0[]={0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,
                      1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
                      1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
                      1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
                      1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1};
    static char a1[]={1,1,1,1,1,1,1,1,3,3,0,0,0,0,0,0,0,0,
                      1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,
                      3,3,3,3,3,3,3,3,3,4,4,4,4,4,4,4,4,4,
                      5,5,5,5,5,5,5,5,5,6,6,6,6,6,6,6,6,6,
                      7,7,7,7,7,7,7,7,7,8,8,8,8,8,8,8};
    static char a2[]={1,2,3,4,5,6,7,8,1,2,1,2,3,4,5,6,7,8,
                      0,1,2,3,4,5,6,7,8,0,1,2,3,4,5,6,7,8,
                      0,1,2,3,4,5,6,7,8,0,1,2,3,4,5,6,7,8,
                      0,1,2,3,4,5,6,7,8,0,1,2,3,4,5,6,7,8,
                      0,1,2,3,4,5,6,7,8,0,1,2,3,4,5,6};

    /* The loop header and the accumulation statements are
       reconstructed from the declarations and comments; taking the
       lowest 3 bits first is an assumption implied by the right
       shift. */
    for (i = 0; i < size; i++) {
        value = buf[i];
        for (j = 0; j < 8; j++) {  /* 8 groups of 3 bits per value */
            a = (int)(value & 7);
            cs0 += a * a0[index];
            cs1 += a * a1[index];
            cs2 += a * a2[index];
            value >>= 3;  /* Forget the processed 3 bits */
            index++;      /* Go to the next row in the matrix */
        }
    }
    cs0 %= 9;   /* Convert to a remainder class */
    cs1 %= 9;   /* These can be outside of the loop, */
    cs2 %= 9;   /* because 88*9*8 = 6336, which fits into */
                /* integer nicely */

    /* return a 12-bit checksum */
    return ((cs0<<8) | (cs1<<4) | cs2);
}
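
The function above keeps the running checksum in static variables.
As an illustration only, the two stages of section 5.2 can also be
written with the previous checksum passed in explicitly. This is a
sketch under assumptions, not part of the proposal: the names g_row
and line_checksum are hypothetical, the row pattern is read off the
generator matrix table, and the mapping from bytes to 3-bit symbols
is left to the caller.

```c
#include <assert.h>

/* Row r (1..88) of the generator matrix G, following the pattern of
   the table in section 5.2. Hypothetical helper. */
void g_row(int r, int row[3])
{
    assert(r >= 1 && r <= 88);
    if (r <= 8) {                 /* rows 1-8:   0 1 1 .. 0 1 8 */
        row[0] = 0; row[1] = 1; row[2] = r;
    } else if (r <= 10) {         /* rows 9-10:  0 3 1, 0 3 2 */
        row[0] = 0; row[1] = 3; row[2] = r - 8;
    } else if (r <= 18) {         /* rows 11-18: 1 0 1 .. 1 0 8 */
        row[0] = 1; row[1] = 0; row[2] = r - 10;
    } else {                      /* rows 19-88: 1 1 0 .. 1 8 6 */
        int k = r - 19;
        row[0] = 1; row[1] = 1 + k / 9; row[2] = k % 9;
    }
}

/* Two-stage checksum of one block.
   sym:  88 symbols, each 0..7 (3 bits)
   prev: checksum of the previous line (all zeros for the first line)
   out:  resulting checksum, each component 0..8 */
void line_checksum(const int sym[88], const int prev[3], int out[3])
{
    int r, c, row[3];
    int acc[3] = { 0, 0, 0 };

    for (r = 1; r <= 88; r++) {
        g_row(r, row);
        for (c = 0; c < 3; c++)
            acc[c] += sym[r - 1] * row[c];  /* stage 1: sym times G */
    }
    for (c = 0; c < 3; c++)
        out[c] = (acc[c] + prev[c]) % 9;    /* stage 2: add previous */
}
```

Passing the previous checksum as an argument makes the dependency
between consecutive lines explicit and lets several files be checked
independently. The three components can be packed into 12 bits in
the same way as in the function above, (cs0<<8) | (cs1<<4) | cs2.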