NOTE-SDML-19980619
SDML - Signed Document Markup Language
W3C Note 19-June-1998
- This version:
- https://www.w3.org/TR/1998/NOTE-SDML-19980619
- Latest version:
- https://www.w3.org/TR/NOTE-SDML
- Authors:
- Jeff Kravitz
Copyright © 1996, 1997, 1998 Financial Services Technology Consortium. All rights reserved.
Status of this document
This document is a submission to the World Wide Web Consortium. It is the initial draft of the specification of SDML. It is intended for review and comment by W3C members and is subject to change. There are W3C Staff comments on this submission.
This document is a NOTE made available by the W3 Consortium for discussion only. This indicates no endorsement of its content, nor that the Consortium has, is, or will be allocating any resources to the issues addressed by the NOTE.
Contents
- 1 Introduction
- 2 Notation
- 3 Document Formatting Rules
- 4 SDML Document Definition
- 5 Document Structure
- 6 Combining Documents
- 7 ASN.1 Definition of X.509 Version 1 Certificates
- 8 Field Summary
- 9 Verifying Certificates
- 10 Bibliography
- 11 Issues and Directions
- Appendix A - SGML Document Type Definition (DTD)
- Appendix B - Definitions
- Appendix C - Acknowledgments
1 Introduction
A child of five would understand this. Send someone to fetch a child of five.
- Groucho Marx
Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin.
- John von Neumann
Make everything as simple as possible, but not simpler.
- Albert Einstein
Research is what I'm doing when I don't know what I'm doing.
- Wernher Von Braun
PROGRAM: n. A magic spell cast over a computer allowing it to turn one's input into error messages. tr.v. To engage in a pastime similar to banging one's head against a wall, but with fewer opportunities for reward.
Then anyone who leaves behind him a written manual, and likewise anyone who receives it, in the belief that such writing will be clear and certain, must be exceedingly simple-minded.
- Plato, Phaedrus 275d
Read over your compositions, and where ever you meet with a passage which you think is particularly fine, strike it out.
- Samuel Johnson, quoting a college tutor, 1773
The knowledge of Cyphering, hath drawn on with it a knowledge relative unto it, which is the knowledge of Discyphering, or of Discreting Cyphers ... Certainly it is an Art which requires great pains and a good wit, and is (as the other wits) consecrate to the Counsels of Princes.
- Sir Francis Bacon, 1623
1.1 Background
The Signed Document Markup Language (SDML) was developed by the Financial Services Technology Consortium (FSTC) as part of the Electronic Check Project. SDML is designed to:
- tag the individual text items making up a document,
- group the text items into document parts which can have business meaning and can be signed individually or together,
- allow document parts to be added and deleted without invalidating previous signatures, and
- allow signing, co-signing, endorsing, co-endorsing, and witnessing operations on documents and document parts.
The signatures become part of the SDML document and can be verified by subsequent recipients as the document travels through the business process. SDML does not define encryption, since encryption is between each sender and receiver in the business process and can differ for each link depending on the transport used.
SDML is the generic document structuring and signing part of the Financial Services Markup Language (FSML). FSML defines the specific document parts needed for electronic checks, the tags which identify check-specific data items, the semantics of the data items, and processing requirements for electronic checks. FSTC will be releasing the FSML specification implemented in the U.S. Treasury pilot along with a proposal for version 2.0 of FSML in the near future.
When development of FSML began in 1995, HTML was in its early stages of widespread deployment. SGML had been standardized some years earlier, software tools were readily available, and the use of tagged, readable text was attractive for its simplicity, ease of understanding, operational support, and ease of development and use. FSML/SDML were designed so that they could be defined using an SGML Document Type Definition (see Appendix A). FSML also defined document formatting rules so that readable text electronic checks could be sent via electronic mail systems without the risk that the mail systems would modify the electronic check in ways that invalidate the signatures.
SDML is being published now to inform industry associations and standards bodies about the FSTC's experience, to show how cryptographic signatures can be embedded in structured, tagged-text documents, and to show how the business requirements of a typical application can be met. Given the standardization of XML and the widespread use of MIME attachments for sending documents with 8-bit transparency, FSTC wants to engage in discussions around how SDML and XML inter-relate and how these two approaches, which had different initial objectives, can be brought closer together and made compatible and as consistent as possible.
FSML has been implemented by the Electronic Check Project. A pilot implementation is in operation using payer and payee software to send, receive, and deposit electronic checks over the Internet. Cryptographic hardware, in the form of smart cards, has been developed to contain the private signing keys, to perform the hashing and signing operations, and to perform other "electronic checkbook" functions, such as automatically numbering and logging checks written or deposited. Advices of payment are attached cryptographically to the checks when sent between payer and payee, and they are removed by the payee. Similarly, checks are attached cryptographically to deposit slips when they are sent to the payee's bank. Bank server systems have been developed to process the electronic checks, to interface with existing check processing systems in the banks, and to clear and settle electronic checks between banks. A Certificate Authority hierarchy has been established, and certificates have been issued to banks and checking account holders.
The Electronic Check project, from its inception, sought to develop a general solution to the issues of authentication and integrity associated with creating electronic financial instruments. The technical and business problems of implementing electronic check payments between payers and payees over the Internet provided a practical context for developing the solution.
Paper checks have a rich tradition, and support numerous options, check types, attached information, and sophisticated processing. However, paper checks are fundamentally a "signed writing directing a bank to pay money, after a date, from the payer's account." The Electronic Check project determined that the essence of the problem to be solved was to develop a generalized structure for creating, processing, and displaying electronic "signed writings," where cryptographic signatures would substitute for manual signatures and where an electronic message would take the place of the paper medium. The structure would need to support the same business operations as signed paper checks, such as signing, co-signing, and witnessing of signatures, and attaching and removing associated documents such as remittance slips, invoices, advice of payment, and deposit slips.
Since checks are a form of negotiable instruments, and negotiable instruments are a form of contracts, it is believed that SDML may be used to create signed documents suitable for a wide variety of purposes. For example, they may be used as messages to initiate electronic funds transfer, as orders and invoices needed for electronic commerce, or for other forms of signed contracts or agreements.
SDML documents, which are hashed and cryptographically signed using public key signature algorithms, can have the following security attributes:
- Verifiability of Origin -- A document recipient can authenticate that the document was created by a specific person or institution, and that the signature was not forged or created by an impostor.
- Integrity -- A document recipient can determine that the document has not been altered in any way since it was signed.
- Accountability -- A document recipient can prove to a third party that the document was created by the signer even if the signer repudiates the document, unless the signer can also establish that someone else has possession or control of his private cryptographic signing key.
The SDML signature mechanism allows documents to be combined, or added to, without loss of these attributes with respect to prior signatures or the pre-existing parts of the documents.
The Financial Services Technology Consortium (FSTC) is a not-for-profit organization whose goal is to enhance the competitiveness of the United States financial services industry. Members of the consortium include banks, financial services providers, research laboratories, universities, technology companies, and government agencies.
1.2 Business Objectives
Some of the business objectives that were instrumental in the design of SDML were:
- To develop a general method for creating and verifying business documents at the application level with integrated digital signatures;
- To provide assurance to the application and the customer;
- To eliminate the need for paper source documents;
- To allow incorporation into a number of different business applications;
- To support peer-to-peer exchanges;
- To keep, indeed to require, the signatures integrated with the documents;
- To identify the individual or organization which created the documents;
- To identify the individuals or organizations which process the documents;
- To support signing, cosigning, counter signing, endorsing, and "initialing";
- To be vendor-neutral;
- To be cryptographic algorithm-neutral;
- To provide a high degree of flexibility in the information content and structure of the signed documents;
- To support a wide range of different documents;
- To allow for efficient processing;
- To clearly define the scope of a signature such that different signers could sign different parts of the same document, or indeed, where needed, could sign someone else's signature;
- To provide for the attachment and removal of auxiliary documents while still maintaining the digital signature integrity;
- To allow binding information together;
- To allow removal of information when it reaches its intended recipient to protect privacy and improve efficiency of later handling;
- To be usable through any electronic transport media;
- To enable an end recipient who does not expect a signed document to understand what he or she received;
- To be independent of the network or electronic communication system used;
- To work through online connections, such as provided by the WWW or secure sockets;
- To work through any electronic mail system, including "plain old text e-mail";
- To work through legacy communications and computer systems;
- To allow documents to be as self-contained as possible;
- To recognize and support different documents which have different security and time duration and immediacy requirements;
- To simplify the processing;
- To enable critical processing to occur in simple devices, such as smart cards, where appropriate to the application;
- To provide reasonable assurance to off-line or off-network processing;
- To minimize the reliance on third parties;
- To minimize the need for third-party directories;
- To minimize the need for security databases at the end users;
- To simplify auditing and research;
- To provide control to the transacting parties;
- By providing flexibility in managing transaction risk as needed;
- By allowing direct peer-to-peer exchanges;
- By minimizing systemic risk to transacting parties by supporting inexpensive secure processing devices such as smart cards;
- By enabling document semantics to control issues such as transaction-effective dating and validity intervals;
- By enabling third-party services, such as time stamping, electronic postage, third party processing, or archival storage as needed by the transacting parties;
- By allowing separate control of the privacy requirements;
1.3 Technical Decisions
The above business objectives led to a number of technical decisions, based on much thought and discussion amongst the team members. Some of these technical decisions were:
- An SGML-derivative language was chosen because it was an ISO standard, and had the additional benefit of using standard "printable" ASCII text. Using a binary encoding method (e.g. ASN.1) would have meant that all SDML documents would require special decoding software just to be viewed or even detected. This seemed too much of a burden since the uses and platforms envisioned would have been too many and diverse for such viewing software to become easily available. Using an ASCII text-based language meant that a casual user could still look at and identify an SDML document, with only normal e-mail, Web, or other ASCII-text viewing capabilities.
- It was decided to subset many of the SGML features (e.g., limited use of SGML end-tags) to allow SDML documents to be more easily processed by devices and systems of lower complexity. It was envisioned that some of the processing may be done on "electronic tokens" meaning PDAs, PCMCIA cards, or even smart cards, with limited memory and processing capabilities.
- Formatting and encoding rules that are not part of the ISO SGML standard were added to allow SDML documents to pass, uncorrupted, through a variety of transport mechanisms, especially e-mail, and still allow the signatures to be valid. This decision has been very useful, and appears to be one of the key items in SDML.
- It was decided to keep SDML documents independent of the transport mechanism used to transfer them from party to party. Thus, they can be transported by a large variety of mechanisms, e.g., e-mail, Web (HTTP), FTP, or even as files on a diskette (sneaker-net).
- Although privacy is a very important requirement, it was decided not to incorporate a privacy encoding scheme into the SDML definition, as a number of privacy encoding methods and systems were being defined by others, and it was felt that this decision could best be left in the hands of the other groups who were doing that for the various transport mechanisms, e.g., S-MIME, PGP, etc.
- SDML is designed to be extensible. Additional document types and blocks (sub-elements) may be added for specific applications (The FSTC E-check project uses this feature extensively). In addition, SDML, as defined in this document, allows attachment and signing of virtually any kind of attached document or file, with no additional SDML changes or extensions required.
- The basic document signature mechanism was designed to allow attachments to be added to and detached from an SDML document, and for additional signatures to be added after the original document was created. Thus, the signature mechanism was thought of as a kind of "electronic staple."
- All the signature verification information (except for the public key of the root of the certificate hierarchy) is in the SDML document itself. Thus, any document recipient can perform a full cryptographic signature verification on the document without the need for access to external networks, directories, servers or databases. The only external information needed is the root public key, which can be easily distributed widely.
- It was decided to use X.509 certificates, even though they do not fit in well with other SDML design goals (e.g. use of binary ASN.1 instead of human-readable ASCII) because they have become a widely adopted standard for public-key certificates. Note, however, that the SDML language allows for specification of other certificate types.
- The signature algorithms chosen are the most popular at the time this specification was designed (e.g. RSA/MD5, DSA/SHA-1); however, the SDML language allows specification of other algorithms which may become available in the future.
1.4 Structure
Below is an example of a signed electronic document:
<sdml-doc docname="doc87" type="sample">
<action>
<blkname>act1
<crit>true
<vers>1.0
<function>sample
<reason>process
</action>
<attachment>
<blkname>att0123
<adata encoding="text">
This is a sample attachment
</adata>
</attachment>
<signature>
<blkname>sig7
<crit>true
<vers>1.0
<sigdata>
<blockref>act1
<hash alg="sha">278B7F348EECE3822A48C4D197FD5B920001C2E8
<blockref>att0123
<hash alg="sha">BC59D2FE5566F506910C5020B628E4136E1C6B39
<nonce>9D9BC5AA75
<sigref>cert-111111111-00000001
<algorithm>sha/dsa
<location>us
</sigdata>
<sig>
2489E1E376F5CD823274010B0A6028EA3F2763F2:290B95F8F02CF6616B9C3A03DF0B50295A1
62295
</signature>
<cert>
<blkname>cert-111111111-00000001
<crit>true
<vers>1.0
<certtype>x509v1
<certissuer>/C=US/ST=NY/O=FSTC/OU=NYCA/
<certserial>1
<certdata>
308201F0308201B0020101300906072A8648CE380403303D310B300906035504061302555331
0B3009060355040813024D44310E300C060355040A130542414E4B413111300F060355040B13
08636865636B696E67301E170D3937303431343031353930305A170D39373130313130313539
30305A3050310B3009060355040613025553310B3009060355040813024D44310E300C060355
040A130542414E4B413111300F060355040B1308636865636B696E673111300F060355040313
0831313131303041313081EE3081A606072A8648CE38040130819A02408DF2A494492276AA3D
25759BB06869CBEAC0D83AFB8D0CF7CBB8324F0D7882E5D0762FC5B7210EAFC2E9ADAC32AB7A
AC49693DFBF83724C2EC0736EE31C802910214C773218C737EC8EE993B4F2DED30F48EDACE91
5F0240626D027839EA0A13413163A55B4CB500299D5522956CEFCB3BFF10F399CE2C2E71CB9D
E5FA24BABF58E5B79521925C9CC42E9F6F464B088CC572AF53E6D7880203430002406BEFFB5F
20C39FFEE2E9574599EB606BCE9910086F9FF7F22FAA722E7CB028F1D2FADBCD7FBF4FD16AFF
C90D4BA82B2819FBA754873F2F49B6FC8C89528AE591300906072A8648CE380403032F00302C
0214524F78883CAFC10AA73BED1DDC24F2F4E821042902141C89356DBE7A77755F101E750310
910CE325DB7E
</cert>
<cert>
<blkname>cert-111111111
<crit>true
<vers>1.0
<certtype>x509v1
<certissuer>/C=US/O=FSTC/OU=FSTC CA/
<certserial>1
<certdata>
308201DF3082019F020101300906072A8648CE380403303F310B300906035504061302555331
0B3009060355040813024D44310D300B060355040A13044653544331143012060355040B130B
636865636B696E67204341301E170D3937303431343031353130305A170D3937313031313031
353130305A303D310B3009060355040613025553310B3009060355040813024D44310E300C06
0355040A130542414E4B413111300F060355040B1308636865636B696E673081EE3081A60607
2A8648CE38040130819A02408DF2A494492276AA3D25759BB06869CBEAC0D83AFB8D0CF7CBB8
324F0D7882E5D0762FC5B7210EAFC2E9ADAC32AB7AAC49693DFBF83724C2EC0736EE31C80291
0214C773218C737EC8EE993B4F2DED30F48EDACE915F0240626D027839EA0A13413163A55B4C
B500299D5522956CEFCB3BFF10F399CE2C2E71CB9DE5FA24BABF58E5B79521925C9CC42E9F6F
464B088CC572AF53E6D788020343000240732584AF3193B908F41D3AA68680421C6661EA259D
4146D48F2258669A189463375D4C38040278903992D18A2ECA9B66F4046962BA7A26433CA662
12314BDBEB300906072A8648CE380403032F00302C021456A0A84E997EB0772DD592753338E9
D65B0795750214269AAE91801D9E80B8004A89225E27915044EA40
</cert>
</sdml-doc>
The above sample consists of a simple attachment (containing merely the character string "This is a sample attachment"), which is then signed. It also contains an <action> block which indicates the action to be performed when the document is received, a <signature> block, which contains the digital signature of the relevant blocks in the document, and two <cert> blocks, which contain the certificates (and thus the public keys) of the document signer, and the issuer of the public key for that signer. These blocks will be described in detail below.
An SDML electronic document is composed of a number of blocks, as defined in the SDML definitions below. Each block contains some common fields, or elements in SGML terminology, and also contains fields that are specific to the type of block. Blocks are not nested; however, they are contained in <sdml-doc> elements, which can be nested.
All blocks that must be protected from tampering and all blocks that must be authenticated are signed using a digital signature, which is contained in a signature block. The digital signature uses one of the standard digital signature algorithms, such as MD5/RSA or SHA/DSS, although the use of MD5 is deprecated. Each signature requires a public key, which also requires a certificate. Certificates are currently distributed as X.509 certificates.
Blocks may also be "bound" together by the signature block, which contains the block names of the blocks being bound, the digital hashes of these blocks, and a digital signature on these hashes along with the other contents of the signature block. This binding allows the receiving software to verify that all blocks that were bound are present and have not been tampered with.
The concept of the SDML electronic document is that it is a flexible structure. Separating signatures, certificates, actual data, etc., into separate blocks allows a rich, complex document to be built from these "primitives," while retaining a standard format which can be parsed and verified according to a standard syntax definition.
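The binding check described above can be sketched in Python. This is an illustration only, not the normative procedure: the function name check_binding, the dictionary of block octets, and the assumption that each hash is a SHA-1 digest computed over the already-normalized text of the named block are all hypothetical. The exact octets that are hashed are not defined in this section, and verification of the signature over the signature block itself is not shown.

```python
import hashlib


def check_binding(blocks, refs):
    """Return a list of problems found while checking bound blocks.

    blocks: dict mapping block name -> octets of that block
            (assumed to be normalized already; an assumption of this sketch)
    refs:   list of (block name, hex digest) pairs, as they might be read
            from the <blockref>/<hash> fields of a signature block
    """
    problems = []
    for name, expected_hex in refs:
        data = blocks.get(name)
        if data is None:
            # A bound block has been removed from the document.
            problems.append(f"missing block: {name}")
            continue
        actual_hex = hashlib.sha1(data).hexdigest()
        # Hex digits may appear in either case, so compare case-insensitively.
        if actual_hex.lower() != expected_hex.lower():
            problems.append(f"hash mismatch: {name}")
    return problems
```

An empty result means that every bound block is present and unmodified; the receiver would still need to verify the digital signature that covers these hashes.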
2 Notation
In the pseudo-SGML definitions below, the SGML format for SDML electronic documents is shown by example rather than defined with formal meta-linguistic notation. A more precise definition is given later in the document using more formal notations (Extended BNF and SGML DTD).
In these definitions, the following simple notations are used to indicate the type of value being used for a particular field:
ccccccccc | This is used to represent a character string. The number of c's in the definition does not indicate the allowed size of the string. Character strings may contain any legal SGML character, except the tag delimiters (< or >) and the other SGML formatting characters. These characters may be inserted into the string using the standard SGML escape sequences. Country- or language-specific characters may also be used, again using the standard SGML escape sequences. Quote symbols have no special significance, and if contained in a value string, will be considered part of the value. For example, "John Smith" is not the same value as John Smith. All self-defining constants, such as ISO country and currency names, SGML tag parameters that are choices from among a list, or items specified as choices in the definitions below, must be specified in lower case. Mixed case may be used for variable strings, such as names, addresses, document names, block names, etc. In all strings, case is significant, i.e., the string John Smith is not equal to the string john smith. |
nnnnn | An ASCII character string used to denote an integer number, containing only the digits 0 - 9. The number of n's does not indicate the size of the number string. This is described elsewhere. |
nnnnn.nn | An ASCII character string used to denote a decimal (real) number, containing only the digits 0 - 9 and a single, optional decimal point. |
hhhhhh | An ASCII character string used to denote the hexadecimal encoding of a binary string of octets. It may only contain the ASCII characters 0-9, A-F, and a-f. The number of h's does not indicate the size of the string. All legal hexadecimal strings must consist of an even number of hex digits. In certain cases, described when used below, the field is split into two portions using a colon ":", e.g. 12345678:abcdef. |
other | A string depicted in boldface as a value represents itself. All such self-defining-constants must be in lowercase in the SDML document. |
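The numeric and hexadecimal notations above can be checked mechanically. The following Python sketch is one possible reading of them; the function names are hypothetical, and two points are assumptions of the sketch rather than statements of this specification: a decimal value must contain at least one digit, and in a colon-separated hexadecimal field each portion must itself have an even number of digits.

```python
import re

_INTEGER = re.compile(r"[0-9]+")
# Digits with a single optional decimal point; at least one digit required.
_DECIMAL = re.compile(r"(?=.*[0-9])[0-9]*\.?[0-9]*")
_HEX = re.compile(r"[0-9A-Fa-f]+")


def is_integer(value):
    """nnnnn: only the digits 0-9."""
    return _INTEGER.fullmatch(value) is not None


def is_decimal(value):
    """nnnnn.nn: digits 0-9 and a single, optional decimal point."""
    return _DECIMAL.fullmatch(value) is not None


def is_hex(value, allow_colon=False):
    """hhhhhh: hex digits, even count, optionally split in two by a colon."""
    parts = value.split(":") if allow_colon else [value]
    if len(parts) > 2:
        return False
    return all(
        _HEX.fullmatch(part) is not None and len(part) % 2 == 0
        for part in parts
    )
```

Character strings (ccccccccc) are not covered here, since their validity depends on the SGML escape sequences in use.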
3 Document Formatting Rules
In order for the SDML electronic document to be easily transmitted by a variety of methods (e-mail, file transfer, storage media, etc.) it was designed to be a plain ASCII document. However, certain formatting rules must be adhered to in order to ensure that most of the usual transport mechanisms, in particular e-mail systems, will successfully transport the SDML electronic document unchanged.
Note that these rules supersede the white space and line end rules for SGML, in order to ensure that signed documents can be successfully verified. Therefore, the following document formatting rules are considered mandatory for document generators and receivers:
3.1 Character Encoding
- All characters in the document should be chosen from the ASCII subset known as printable characters (ASCII values 0x20-0x7E), except for line-ends (see below). Any characters found in the document that are not in this set (i.e., 0x00-0x09,0x0B,0x0C,0x0E-0x1F,0x7F-0xFF) are disallowed. Programs that create SDML documents should not allow them to be inserted into a document, and programs that receive SDML documents should reject documents that contain them.
- If a character is required that is not displayable as one of the above ASCII printable characters, it should be encoded in the document using an SGML entity name for the character, enclosed between an ampersand and a semicolon, e.g., &circumflex;. This rule applies in particular to the > and < characters, which can only be used as SGML tag delimiters. If they are to be used in normal text, they must be encoded as &gt; and &lt;.
Note that these SGML entities will not be translated during the processing and cryptographic hashing of an SDML electronic document. They are only used for the purpose of display or printing of characters not in the usual ASCII subset.
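The character rules of this section might be checked as in the following Python sketch. The function names are hypothetical. The allowed set is taken directly from the rule above (0x20-0x7E plus the line-end octets 0x0A and 0x0D); the escaping helper handles only the two tag delimiters and makes no assumption about other entities.

```python
# Printable ASCII plus the two octets that may form a line-end sequence.
ALLOWED_OCTETS = set(range(0x20, 0x7F)) | {0x0A, 0x0D}


def disallowed_octets(data):
    """Return (offset, value) for every octet outside the allowed set."""
    return [(i, b) for i, b in enumerate(data) if b not in ALLOWED_OCTETS]


def escape_delimiters(text):
    """Encode the tag delimiters so they can appear in normal text."""
    return text.replace("<", "&lt;").replace(">", "&gt;")
```

A receiving program would reject a document for which disallowed_octets returns a non-empty list.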
3.2 Line Formatting
- A line-end sequence usually consists of a Carriage Return (ASCII 0x0D), a Line-Feed (ASCII 0x0A), or both; the exact sequence is operating system-specific.
- All lines in the document must be less than or equal to 76 characters in length. When a line would otherwise extend past 76 characters, a line-end sequence may be inserted into the line. (This is to protect the SDML document from e-mail systems which break up lines longer than 76 characters.)
A line-end sequence must not be inserted, either during original document creation, or when attempting to modify a document to protect against e-mail systems, if any of the following would become true:
- It would create a line consisting of a single period ('.'). (this is to protect SDML documents from e-mail systems which cease processing documents past a line consisting of a single period)
- It would create a line consisting of the string 'From ', starting in the first position of a line. (this is to protect the line from certain e-mail systems which change such lines)
- It is inserted after spaces, causing those spaces to be trailing spaces on a line.
- It causes a line-end sequence to be inserted into a tag, either between the < and the tag name, or between the tag name and the >, or inside the tag name.
- In SDML document receiving programs, all line-end sequences will be removed before any processing is done on the SDML electronic document. In particular, they are not included in hash calculations, nor passed in field values to application programs. This line-end removal occurs for the SDML tags and values.
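A document generator could check its output against the line rules above before sending. The Python sketch below is one hypothetical way to do so; the function name and message texts are invented for illustration. The tag check is slightly stricter than the rule as written: it reports any line that ends while a tag is still open, on the assumption that < and > occur only as tag delimiters.

```python
import re


def line_problems(text, limit=76):
    """Return (line number, description) for each line-rule violation."""
    problems = []
    # Accept CR, LF, or CR+LF as a line-end sequence.
    for number, line in enumerate(re.split(r"\r\n|\r|\n", text), start=1):
        if len(line) > limit:
            problems.append((number, f"longer than {limit} characters"))
        if line == ".":
            problems.append((number, "line is a single period"))
        if line.startswith("From "):
            problems.append((number, "line starts with 'From '"))
        if line.endswith(" "):
            problems.append((number, "line-end inserted after spaces"))
        # An unmatched '<' means the line ended inside a tag.
        if line.rfind("<") > line.rfind(">"):
            problems.append((number, "line-end inside a tag"))
    return problems
```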
3.3 Space Handling
- Any spaces at the end of a line, i.e., trailing spaces, will be removed prior to processing the document. This means that spaces between the last non-space character on a line, and the line-end sequence for the line are not included in hash calculations and are not passed to applications as part of the field value. This is also done inside <adata> sub-blocks inside attached documents. (this is to protect SDML documents from e-mail systems that may truncate lines with trailing spaces, or may add trailing spaces to lines).
- Embedded spaces, i.e., spaces that are not immediately before a line-end are to be processed as is, i.e., they are not to be removed before processing, hashing, etc. The same is true for leading spaces on a line, i.e., spaces following a line-end sequence.
- Leading spaces, e.g., indentation, are allowed but not recommended. Leading spaces are not deleted during processing, and because line-ends are removed, the leading spaces on a line will be indistinguishable from the data ending the previous line, and thus may cause field length violations or field data misinterpretation. For example:
<tag1>abcde
 <tag2>defghi
would cause the value of tag1 to be `abcde '. This can be used, however, to place trailing spaces into a value, i.e., if the value of tag1 were desired to contain the trailing spaces.
- White space (either spaces or line-end sequences) may not be inserted into tags, either between the < and the tag name, or between the tag name and >, or inside the tag name. If a tag has a parameter, white space may not be inserted between the parameter name and the = or between the = and the parameter value, or inside the parameter name, or between the last character of the parameter value and the >. Multiple parameters must be separated by one or more spaces. Parameter values must be contained within quotes.
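The receiver-side handling of trailing spaces and line-ends can be sketched in Python as follows. The function name normalize is hypothetical, and the sketch assumes that trailing spaces are stripped from each line before the line-end sequences are removed, which is the order in which the rules are stated.

```python
import re


def normalize(text):
    """Strip trailing spaces from each line, then remove all line-ends.

    Leading and embedded spaces are kept exactly as received.
    """
    lines = re.split(r"\r\n|\r|\n", text)
    return "".join(line.rstrip(" ") for line in lines)
```

With this reading, a line of leading spaces after `<tag1>abcde` becomes part of the value of tag1, which illustrates why indentation is discouraged.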
3.4 Tags
In order to allow SDML processing using limited resources (for example, in a smart card used for signing), SDML requires that certain SGML features for tag handling not be allowed.
- End tags (e.g. </tag>) are not allowed except for </sdml-doc>, block-end tags (defined below), and specific sub-block end tags as defined below. All fields that do not have end-tags specified in the block and field definitions below must not have end-tags.
- SGML tag abbreviation is not supported.
4 SDML Document Definition
4.1 Electronic Document Definition
The definition begins with an SDML electronic document.
Every SDML electronic document consists of one or more enclosed documents. These documents are nested, with the nesting done by enclosing earlier forms of a document inside later additions to the document. Each enclosed document is built inside a <sdml-doc> tag structure. Inside a document are one or more blocks. Blocks may appear in any order, except that the <action> block (defined below) must be the first block in the document.
<sdml-doc docname="cccccccc" type="cccccccc">
a sequence of one or more blocks and/or nested <sdml-doc> documents
</sdml-doc>
The docname= attribute parameter is a document name, assigned by the software creating the document. This name will be used when combining documents. (See Combining Documents, below). If multiple SDML documents are being created at one time, as part of one file or transmission, the creating software should ensure that the document names are unique, within the file or transmission. This name should contain a maximum of 64 characters. Note: Attribute parameters must be enclosed in quotes.
The type= attribute parameter is a document type, used to specify the type of document. This type is used by the receiving software to ensure that it has received the correct type of document, i.e., one that it knows how to process. The document types are chosen from a list of pre-defined types, or may be types agreed upon by the sending and receiving parties, except that the latter agreed-upon types may not conflict with any pre-defined types. Note: Attribute parameters must be enclosed in quotes.
To prevent such conflict between pre-defined, standardized document types, and privately agreed-upon types, all privately agreed-upon document types should be prefixed with the characters "p-" (meaning private). For example, a document type used for auto loan applications, agreed to be used by a pair or small group of cooperating banks, could be written as type="p-autoloan". All pre-defined document types will be guaranteed not to start with the characters "p-".
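The docname and type rules above might be enforced as in the following Python sketch. The function names are hypothetical, and the sets of pre-defined and privately agreed types are supplied by the caller, since the list of pre-defined types is not given in this section.

```python
def check_docnames(names, max_length=64):
    """Report docnames that are too long or repeated within one transmission."""
    problems = []
    seen = set()
    for name in names:
        if len(name) > max_length:
            problems.append(f"docname too long: {name}")
        if name in seen:
            problems.append(f"duplicate docname: {name}")
        seen.add(name)
    return problems


def is_known_type(doc_type, predefined, private_agreed):
    """Decide whether the receiver knows how to process this document type.

    Types starting with "p-" are private and must have been agreed upon;
    all other types must come from the pre-defined list.
    """
    if doc_type.startswith("p-"):
        return doc_type in private_agreed
    return doc_type in predefined
```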
4.2 Block Common Field Definitions
A block contains some common fields, along with other fields specific to the type of block. Except in a few cases and unless otherwise specified, the order of fields within a block is not predefined. Once created, however, fields may not be moved or rearranged inside a block, to permit the digital signatures and hashes to be valid.
Common Block Field Definitions
Each of the blocks contains some field definitions which are common to all block types, as follows:
<blkname>ccccccc
<crit>true|false
<vers>nnn.nnn
blkname | (required) This is a character string which must contain a block
name assigned at the time the document is created. The creating software must ensure that
the block names are unique within a document. The names are used to refer to the block
from other blocks. Generally, the name chosen for the block may be any unique character string. For certain blocks, a convention or rule applies when creating block names. The rules or conventions are described in the individual block descriptions. |
crit | (optional) A boolean (true/false) flag used to determine if a block is critical. If a block is critical, then the receiving software must be able to process the block. If the software cannot process a critical block, it must abort processing the entire document, or otherwise determine how to handle the document as an exceptional case. This flag is used to allow for expansion of the block types, to allow software to "ignore" block types that it doesn't recognize, providing that they are marked non-critical by the software that created them. Certain types of blocks, such as informational messages, etc. might always be considered non-critical. Other types, such as signatures, might always be considered critical. The criticality flag is assumed to have a default of true unless otherwise specified as false. Thus, it is not required to be specified in every block. |
vers | (optional) A number which indicates the version of the block. New versions may be introduced, and this number is used by receiving software to determine if it is capable of parsing/processing a block. If the version number is larger than the one understood by the receiving software, it must assume that it cannot process the block, and must use the criticality flag to determine if it can continue to process the document. If the version number is not specified, it is assumed to be 1.0. |
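The interaction of the criticality flag and the version number can be sketched as below. This is a hedged Python illustration under stated assumptions: the function name, the three result strings, and the `supported` mapping (block type to highest version understood) are hypothetical, and version numbers are compared as decimal numbers because the text calls the field "a number".

```python
from decimal import Decimal

DEFAULT_VERSION = Decimal("1.0")   # default stated for <vers>


def block_disposition(block_type, crit=None, vers=None, supported=None):
    """Decide what a receiver does with one block.

    Returns 'process' when the block type and version are understood,
    'ignore' when they are not and the block is non-critical, and
    'abort' when they are not and the block is critical.
    """
    supported = supported or {}
    # <crit> defaults to true when it is omitted.
    critical = True if crit is None else crit.strip().lower() == "true"
    version = DEFAULT_VERSION if vers is None else Decimal(vers.strip())
    highest_known = supported.get(block_type)
    if highest_known is not None and version <= Decimal(highest_known):
        return "process"
    return "abort" if critical else "ignore"


supported = {"signature": "1.0", "action": "1.0"}
print(block_disposition("signature", vers="2.0", supported=supported))
```

Here "abort" stands for either stopping processing of the entire document or handing it to whatever exceptional-case handling the receiving software provides.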
4.3 Block Definitions
Each SDML block starts and ends with one of the following sets of block tags:
Start Tag | End Tag |
---|---|
<action> | </action> |
<signature> | </signature> |
<cert> | </cert> |
<attachment> | </attachment> |
<message> | </message> |
The block types are defined as follows:
action | A block describing the action to be performed by the recipient |
signature | A block with the signatures and hashes of other blocks |
cert | A public key certificate |
attachment | An associated document attached to an SDML document |
message | An informational message, such as an error report |
4.3.1 Action Block Definition
This block contains information about the action to be performed by the recipient of the electronic document.
<action>
<blkname>cccccccc
<crit>true
<vers>1.0
<function>cccccccc
<reason>cccccccc
</action>
4.3.1.1 Action Block Field Definitions
function | (required) The function field contains a character string chosen from a set of commands or verbs specific to the application or type of document being sent. Each application or type of document will have a unique set of allowable functions that are supported. |
reason | (required) The reason field indicates the reason that the document
is being transmitted to the recipient. It must be one of the following character
strings.
|
4.3.2 Signature Block Definition
This block contains a digital signature for another block, or set of blocks. It is required whenever a block must be authenticated, or tamper-proofed. It also contains the reference to the certificate block containing the public key used to verify the signature. It is also used to "bind" multiple blocks together, so that the resulting compound document can be verified.
Unless otherwise specified, the data being signed consists of the entire contents of the subject block, which is defined to be everything between the start and end tags for the block. The signature must include the blockname, criticality, and version fields, if present, as well as the contents of the block.
The actual hashes of the signed blocks are included to allow verification of the binding even if the actual contents of the bound blocks are not available.
<signature>
<blkname>cccccccc
<crit>true
<vers>1.0
<sigdata>
<blockref>cccccccc
<hash alg="sha">hhhhhh
<blockref>cccccccc
<hash alg="sha">hhhhhh
...
<blockref>cccccccc
<hash alg="sha">hhhhhh
<nonce>cccccccc
<sigref>cccccccc
<certissuer>cccccccc
<certserial>nnnnnnnn
<algorithm>sha/rsa
<timestamp>cccccccc
<location>cccccccc
<username>cccccc
<useraddr>cccccc
<userphone>cccccc
<useremail>cccccc
<useridnum>cccccc
<userotherid>cccccc
</sigdata>
<sig>hhhhhhhh
</signature>
4.3.2.1 Signature Block Field Definitions
blockref | (required) The signature block contains one or more <blockref> fields, each of which contains the unique block name of the associated block being signed. All block references must appear immediately before their respective hashes. (See below.) The <blockref> and <hash> pairs may be repeated multiple times to sign multiple blocks. |
hash | (required) This field contains the actual hash of the respective block. Each <hash> start tag must have an attribute parameter which specifies the algorithm used to perform the hash. The currently allowed parameters are md5 or sha. The alg= attribute parameter is required. The use of md5 is deprecated. Other hash algorithms may be supported in the future. It is not required that the same hash algorithm be used for each of the blockrefs in a signature block. All hashes are encoded in "network byte order," which means that the most significant bytes are leftmost (first). Note: Attribute parameters must be enclosed in quotes. |
nonce | (required) This is a nonce, or one-time random number, used to "salt" the hashed data to discourage cryptanalysis attacks. See the section below on signature calculation. The nonce value can be any string of random ASCII characters from within the set of allowed SDML characters (see Character Encoding above), not including white space. It is therefore possible for the value to be represented as an integer (containing only the digits from 0-9), a floating point number, a hexadecimal string, or a base64-encoded string. |
sigref | (optional) This is the block name of the <cert> block which contains the public key that can be used to verify the signature. This field, although optional, is only optional when an agreement is in place indicating that the recipient of the document does not need the certificate in order to process the document. |
certissuer | (optional) This field contains the unique distinguished name of the issuer of the certificate. It should only be specified if the <cert> blocks are not being sent with this document. See the description of the <certissuer> field in the <cert> block for the syntax used to specify this field. |
certserial | (optional) This field contains the unique certificate serial number assigned by the issuer of the certificate. It should only be specified if the <cert> blocks are not being sent with this document. |
algorithm | (required) This string indicates the algorithm used to sign the signature block. It may be md5/rsa or sha/dsa or sha/rsa. Note: Implementors of code that is used to sign SDML electronic documents may choose to support only one of the above three possible signing algorithms. Implementors of code that is used to verify SDML electronic documents must support all three algorithms. This ensures interoperability. The use of md5 is deprecated. |
timestamp | (optional) This field specifies the time that the document was signed. It must be in Universal time (i.e. GMT) specified as CCYYMMDDThhmmssZ, where the T and Z are literal characters, and where "CC" is the century (currently 19, soon 20), "YY" is the year, "MM" is the month, "DD" is the day, "hh" is the hour, "mm" is the minute and "ss" is the second. |
username | (optional) This is an identification string containing the certificate
user's name. It is optionally inserted into the document by the electronic hardware
token. This field, and the five following fields are optional identification data. This data is supplied by the electronic token owner to the token issuer at the time the token is initialized, but it is not certified to be correct or accurate by the token issuer. The data is inserted into the electronic token when the token is initialized, and may also be corrected or updated later by the issuer using administrative token functions and passwords. This data is then inserted, under control of the user, into the document by the electronic token, however the data cannot be changed or deleted by the user once the document is created. The user may select, when writing a document, which of the six identification fields are to be inserted into the document, in any combination, or may select none of them. |
useraddr | (optional) This is an identification string containing the certificate user's address. It is optionally inserted into the document by the electronic hardware token. |
userphone | (optional) This is an identification string containing the certificate user's phone number. It is optionally inserted into the document by the electronic hardware token. |
useremail | (optional) This is an identification string containing the certificate user's e-mail address. It is optionally inserted into the document by the electronic hardware token. |
useridnum | (optional) This is an identification string containing the certificate user's identification number. It is optionally inserted into the document by the electronic hardware token. |
userotherid | (optional) This is an identification string containing any user identification the user wishes (e.g., company name). It is optionally inserted into the document by the electronic hardware token. |
location | (optional) This field specifies location/country where the document was signed. |
sig | (required) This is a hexadecimal encoding of the actual signature data. For certain algorithms, the field is split into two portions using a colon ":". For DSA, the field contains the two portions of a DSA signature as r:s, where r and s are long hexadecimal strings. For RSA, only a single hex number is specified, with no colon separator. All signatures are encoded in "network byte order," which means that the most significant bytes are leftmost (first). |
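The <timestamp> and <sig> encodings described in the table above can be illustrated with a short Python sketch. The function names are hypothetical, and the sketch assumes that timestamps are handled as timezone-aware values in UTC.

```python
from datetime import datetime, timezone

TIMESTAMP_FORMAT = "%Y%m%dT%H%M%SZ"   # CCYYMMDDThhmmssZ, 16 characters


def format_timestamp(moment):
    """Render a timezone-aware datetime as an SDML <timestamp> value."""
    return moment.astimezone(timezone.utc).strftime(TIMESTAMP_FORMAT)


def parse_timestamp(text):
    """Parse an SDML <timestamp> value into a UTC datetime."""
    parsed = datetime.strptime(text, TIMESTAMP_FORMAT)
    return parsed.replace(tzinfo=timezone.utc)


def parse_sig(sig_text, algorithm):
    """Split a <sig> value into integers according to the <algorithm> field.

    DSA signatures are 'r:s' (two hex numbers); RSA signatures are a single
    hex number. Hex digits are most significant first.
    """
    key_type = algorithm.partition("/")[2]
    parts = sig_text.strip().split(":")
    if key_type == "dsa" and len(parts) == 2:
        return tuple(int(part, 16) for part in parts)
    if key_type == "rsa" and len(parts) == 1:
        return (int(parts[0], 16),)
    raise ValueError("signature does not match algorithm " + algorithm)


print(format_timestamp(datetime(1998, 6, 19, 12, 30, 5, tzinfo=timezone.utc)))
```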
4.3.2.2 Signature Calculation
Calculation of the signature is performed as follows. If an electronic token is being used, then all of the following steps must be performed by that token.
1. | The <nonce> value is created as a random number. The nonce value can be any string of random ASCII characters from within the set of allowed SDML characters (see Character Encoding above) not including white space. It is therefore possible for the value to be represented as an integer (containing only the digits from 0-9), a floating point number, a hexadecimal string, or a base64-encoded string. |
2. | The <nonce> value is logically prepended to the subject block contents before hashing. This includes the tag string "<nonce>," e.g., if the nonce value is 12345, the characters <nonce>12345 are logically prepended to the subject block before hashing. |
3. | The hash is calculated using the contents of the subject block, (with the <nonce> prepended) excluding the block start tag and block end tag, but including all characters in between, with the exception of all carriage returns, line feeds, and trailing spaces on a line. Leading and embedded spaces in a line are included in the hash. SGML entities (i.e., character names enclosed between an ampersand and a semicolon) are left untranslated when hashing. |
4. | The resulting hash value is inserted into the <hash> entry (as Hex ASCII) in the signature block. |
5. | Steps 2 through 4 are repeated for each block to be signed. |
6. | A second hash calculation is performed on the contents of the <sigdata> sub-block, which contains the previously calculated hashes, their block references, and the <nonce>. This should include all characters between the <sigdata> tag and the </sigdata> tag, again omitting all carriage returns, line feeds, and trailing spaces. This second hash is then encrypted using the private key. The result is the signature which is inserted (as Hex ASCII) into the signature block as the value for the <sig> tag. |
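Steps 1 through 6 can be sketched in Python as follows. This is an illustration under several assumptions, not a normative implementation: "sha" is taken to mean SHA-1, characters are encoded as ISO 8859-1 before hashing, only the space character counts as a trailing space, and the function names are hypothetical. The final private-key operation of step 6 is left out because it depends on the token or cryptographic library in use.

```python
import hashlib
import re
import secrets

# Assumed mapping from SDML alg= values to hashlib names ("sha" read as SHA-1).
HASHLIB_NAMES = {"sha": "sha1", "md5": "md5"}


def canonicalize(text):
    """Drop carriage returns, line feeds, and trailing spaces on each line."""
    lines = re.split(r"\r\n|\r|\n", text)
    return "".join(line.rstrip(" ") for line in lines)


def make_nonce():
    """Step 1: a random nonce, here 16 hexadecimal characters."""
    return secrets.token_hex(8)


def block_hash(block_contents, nonce, alg="sha"):
    """Steps 2 to 4: hash '<nonce>value' followed by the canonical block contents.

    block_contents is everything between the block start and end tags.
    """
    data = "<nonce>" + nonce + canonicalize(block_contents)
    digest = hashlib.new(HASHLIB_NAMES[alg], data.encode("latin-1"))
    return digest.hexdigest()


def sigdata_hash(sigdata_contents, alg="sha"):
    """Step 6, first half: hash the canonical contents of <sigdata>."""
    data = canonicalize(sigdata_contents)
    return hashlib.new(HASHLIB_NAMES[alg], data.encode("latin-1")).hexdigest()


nonce = make_nonce()
print(block_hash("<blkname>block1\n<crit>true\n", nonce))
```

Leading and embedded spaces survive canonicalization, so re-indenting a block after it has been signed changes its hash.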
4.3.3 Certificate Block Definition
This block contains an encoded certificate.
<cert>
<blkname>cccccccc-nnnnnnnn-nnnnnnnn
<crit>true
<vers>1.0
<certtype>x509v1
<certissuer>cccccccc
<certserial>nnnnnnnn
<certdata>hhhhhhhh
</cert>
4.3.3.1 Certificate Block Field Definitions
blkname | (required) The <blkname> field in a <cert> is slightly different than the "generic" <blkname>. Since the <cert> block is signed by the authority issuing the electronic token, and is probably stored in the token, it is not changeable at runtime by SDML-generating software. Thus the <blkname> chosen must be guaranteed to be unique for all subsequent documents. It is recommended (but not required) that a block naming convention be used to allow this. The recommended convention is that the name be suffixed with information that is unique to the certificate, so that the same name would never be used by other certificates in the same SDML document. As an example, a certificate issued by a bank whose Bank Routing Code is 123456789, for a customer whose account number is 987654321 might have a blockname of cert-123456789-987654321. If the certificate were for the bank itself, the blockname would be cert-123456789. |
certtype | (required) This field indicates the type of certificate contained in the block. The current possible values are x509v1 or possibly x509v3 (to be determined). |
certissuer | (required) When the <certtype> is x509v1 or x509v3, this field contains the unique distinguished name of the issuer of the certificate. The certificate issuer string uses the fields from the distinguished name in the ASN.1 X.509 certificate, separated by slashes, and using a TAG= identification of the name field type. The different name fields use the following identification tags:
Country C=
Commonname CN=
Locality L=
Orgname O=
Orgunit OU=
State ST=
Streetaddress SA=
Title T=
Thus, an example of an issuer string would be:
/C=US/ST=New York/O=FIRSTBANK_ANYTOWN/OU=checking/ |
certserial | (required) This field contains the unique certificate serial number assigned by the issuer of the certificate. |
certdata | (required) When the <certtype> is x509v1 or x509v3, this contains the hexadecimal-encoded binary value of the ASN.1 DER encoded X.509 certificate. When using DSA signatures and keys, the p,g,q values for DSA are to be contained in the ASN.1 for the certificate of the issuer of the signer's certificate (which is stored in the token). The signer's certificate will *not* contain p,g,q values. Verification software will use the p,g,q values from the issuer's certificate when verifying a signer's signature. This implies that ASN.1 parsing software will have to deal with two varieties of certificates. |
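The field conventions in the table above lend themselves to a small Python sketch. The function names are hypothetical, and the sketch assumes that name values never contain a slash, since no escaping rule is given here.

```python
KNOWN_TAGS = {"C", "CN", "L", "O", "OU", "ST", "SA", "T"}


def parse_issuer(text):
    """Split '/C=US/ST=New York/' into an ordered list of (tag, value) pairs."""
    fields = []
    for part in text.strip().strip("/").split("/"):
        tag, separator, value = part.partition("=")
        if not separator or tag not in KNOWN_TAGS:
            raise ValueError("malformed issuer component: " + part)
        fields.append((tag, value))
    return fields


def format_issuer(fields):
    """Build a <certissuer> string from (tag, value) pairs."""
    return "/" + "/".join(tag + "=" + value for tag, value in fields) + "/"


def cert_block_name(routing_code, account_number=None):
    """Recommended <blkname> convention for <cert> blocks."""
    parts = ["cert", routing_code]
    if account_number is not None:
        parts.append(account_number)
    return "-".join(parts)


def decode_certdata(hex_text):
    """Convert the hexadecimal <certdata> value to DER bytes."""
    return bytes.fromhex("".join(hex_text.split()))


print(parse_issuer("/C=US/ST=New York/O=FIRSTBANK_ANYTOWN/OU=checking/"))
```

The bytes returned by `decode_certdata` would then be handed to an ASN.1 DER parser, which is outside the scope of this sketch.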
4.3.4 Attachment Block Definition
This block contains any document that is to be attached to the SDML electronic document (e.g., a remittance notice, contract, etc.).
<attachment>
<crit>false
<vers>1.0
<blkname>cccccccc
<astatus>temporary
<adata encoding="text">
...
</adata>
</attachment>
4.3.4.1 Attachment Block Field Definitions
astatus | (optional) This field indicates whether the attachment is temporary or permanent. A temporary attachment is intended to be transmitted from the originator of the document to the receiver of the document. It is stripped off the document before transmission to any third party. A permanent attachment is intended to be kept with the document permanently, including transmission to any third parties in the transaction. The contents of the field may be the word temporary or the word permanent. If the field is omitted, it defaults to temporary. Note: An SDML document is not considered invalid if it is received by a third party containing a temporary attachment; however, the document may be invalid if a recipient strips off a permanent attachment. |
adata | (required) Any data may be contained in the Attachment block, between the <adata> and </adata> tags. |
The encoding= parameter for the <adata> tag is used to specify the encoding method for the data in the sub-block. It can have the following values:
mime | If the mime encoding value is selected, then the following three MIME headers are required to be placed in the next three lines of the <adata> sub-block, immediately followed by a blank line:
Mime-Version: 1.0
Content-Type: aaaaaa/bbbbbbbb
Content-Transfer-Encoding: xxxxxx
Any legal MIME header values may be used for aaaaaa, bbbbbbbb, or xxxxxx. In particular, if the contents of the attached document cannot be encoded using the SDML document formatting rules, described earlier, then Content-Transfer-Encoding using base64, uuencode, or quoted-printable should be used to "armor" the document against e-mail systems. In addition, the encoded document may not contain the ASCII string </adata> so that the SDML parser will not interpret any portion of the attached document as the ending SGML tag for the <adata> sub-block. The actual encoded data follows the three MIME headers, separated by a blank line (i.e. one or more spaces followed by a new line sequence, with no other non-space characters). An example of a MIME-encoded <attachment> block:
|
|
text | This allows a simple ASCII document to be inserted as an attached document without need for MIME headers or encoding/decoding software. This parameter value can only be used if the attached document inside the <adata> sub-block conforms to the SDML document formatting rules. |
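One possible way to build and read the contents of a mime-encoded <adata> sub-block is sketched below in Python. The function names are hypothetical. The sketch writes each header with the colon that MIME header syntax requires, always uses base64, and emits an empty separator line; the wording "one or more spaces" above could also be read as requiring at least one space on that line.

```python
import base64


def build_mime_body(payload, content_type):
    """Return the text placed between <adata encoding="mime"> and </adata>."""
    encoded = base64.encodebytes(payload).decode("ascii")
    body = (
        "Mime-Version: 1.0\n"
        "Content-Type: " + content_type + "\n"
        "Content-Transfer-Encoding: base64\n"
        "\n" + encoded
    )
    # The encoded document must never contain the end tag of the sub-block.
    if "</adata>" in body:
        raise ValueError("attachment body contains the string </adata>")
    return body


def parse_mime_body(body):
    """Split the sub-block into its header dictionary and decoded payload."""
    head, _, data = body.partition("\n\n")
    headers = {}
    for line in head.splitlines():
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    if headers.get("Content-Transfer-Encoding") != "base64":
        raise ValueError("this sketch only decodes base64 attachments")
    return headers, base64.b64decode(data)


body = build_mime_body(b"remittance notice", "text/plain")
print('<adata encoding="mime">\n' + body + "</adata>")
```

Base64 output cannot contain the "<" character, so the </adata> restriction is satisfied automatically for the payload; the explicit check guards the header values.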
4.3.5 Message Block Definition
This block contains error messages and return information that indicates the reason that the attached SDML document was not processed successfully or it may contain other information about the attached document.
<message>
<blkname>cccccccc
<crit>true
<vers>1.0
<retcode>cccccccc
<msgtext>ccccccccccccc
<msgdata>
...
</msgdata>
</message>
4.3.5.1 Message Block Field Definitions
retcode | (required) This field contains a return code indicating the reason why the attached document was returned. |
msgtext | (required) This field contains a textual message explaining why the document was returned. |
msgdata | (optional) This field contains any other data that may be associated with the message, e.g., a report or bank statement. |
5 Document Structure
5.1 BNF Structure of SDML electronic documents
The following is an Extended BNF description of the global block structure of an SDML electronic document.
5.1.1 BNF Meta-Notation
The meta-symbols of BNF are:
::= | meaning "is defined as" |
| | meaning "or" |
[ ] | used to enclose optional items |
{ } | used to enclose repeated items (repeated zero or more times) |
< > | used to enclose specific SDML tags. |
<( )> | used to specify SDML blocks. |
Names not enclosed in any of the above bracket symbols are called nonterminals and are used to define symbols internal to the BNF specification only.
Note: Blocks are not required to be in the exact order specified below, except that the <action> block must always appear as the first block in any <sdml-doc>.
5.1.2 BNF Definition of non-terminals
cert-chain ::= all certificates in the hierarchy leading up to, but not including the Root certificate.
data-block ::= <(attachment)> | <(message)> | user-defined-block
5.1.3 BNF Definition of A Signed SDML Document
signed_doc ::= <sdml-doc> <(action)> { data-block } <(signature)> <(cert)> cert-chain </sdml-doc>
A signed SDML document consists of...
- An <sdml-doc> element
- An <action> block
- One or more data blocks (<attachment> or <message> or other blocks defined by the user of SDML)
- A <signature> block, which signs (contains the hash of) the <attachment> blocks
- The <cert> block containing the public key of the signer
- The <cert> blocks of the certificate hierarchy
- An </sdml-doc> element
5.1.4 BNF Definition of a Multiply-Signed SDML Document
multiply_signed_doc ::= <sdml-doc> <(action)> { data-block } signed_doc | multiply_signed_doc <(signature)> <(cert)> cert-chain </sdml-doc>
Thus, a multiply signed SDML document (i.e. a document which contains a nested, inner document) consists of...
- An <sdml-doc> element
- An <action> block
- One or more data blocks (<attachment> or <message> or other blocks defined by the user of SDML)
- The entire inner document
- A <signature> block, which signs (contains the hash of) the <attachment> blocks, along with the hashes of the attachments and signature in the next inner document.
- The <cert> block containing the public key of the signer
- The <cert> blocks of the certificate hierarchy
- An </sdml-doc> element
This nesting of documents may be continued indefinitely as new information is added and signed.
5.1.5 Document Structure Diagram

6 Combining Documents
As an SDML electronic document passes through the various steps and institutions that are part of the entire system that processes the document, new information may be added to the document. To allow the new information to be added, while still allowing the original information to be protected and verified using digital signatures, a document combining mechanism is defined.
To add new information to a document, the existing document is enclosed in a <sdml-doc> tag structure, which may also enclose new blocks containing the new information. New <signature> blocks may also be contained in the new information and may sign blocks in the inner nested documents. Each new, surrounding <sdml-doc> must also have a new <action> block and a new type= attribute parameter. The <action> block and type= parameter belonging to the outermost <sdml-doc> are used by the receiving system to determine the method used to process the modified document.
When combining original SDML documents into a larger, compound document, the names of the original blocks may not be unique. A document combining process must be used to handle naming conflicts when a number of documents are being combined (i.e., embedded) into a new document.
The document combining process is as follows:
1. | All the original <sdml-doc> elements are enclosed in a single new <sdml-doc> element. The original docname attribute parameters are kept unchanged unless the combined document names are not all unique, in which case new, unique names should be assigned by the combining software. |
2. | Any time a block name reference is required to refer to a block which is not in the same <sdml-doc> as the one containing the reference (i.e. inter-document references), the reference consists of the docname of the <sdml-doc> element concatenated with a period "." and then with the <blkname> of the inner block being referred to. This is extended if the nesting is continued to more than two levels, e.g. 'outerdoc.innerdoc.block'. |
3. | Any block references inside a given <sdml-doc> must use the block name without any qualifying document name, to ensure that future document combining will not be prevented. |
As an example:
If there are two original documents:
<sdml-doc docname="doc1">
<attachment>
<blkname>block1
....
</attachment>
</sdml-doc>
<sdml-doc docname="doc2">
<attachment>
<blkname>block1
....
</attachment>
</sdml-doc>
When they are combined, the result is:
<sdml-doc docname="newdoc">
<sdml-doc docname="doc1">
<attachment>
<blkname>block1
...
</attachment>
</sdml-doc>
<sdml-doc docname="doc2">
<attachment>
<blkname>block1
...
</attachment>
</sdml-doc>
<signature>
<blockref>doc1.block1
...
<blockref>doc2.block1
...
</signature>
</sdml-doc>
Any external references to the <attachment> block in the first document would be 'doc1.block1', and the <attachment> block in the second document would be 'doc2.block1'. References inside doc1 to any blocks in doc1 must still use the original, single-level names. Similarly for internal references inside doc2.
This is extended if the nesting is continued to more than two levels, e.g. 'outerdoc.innerdoc.block'.
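The qualified-name rule can be illustrated with a small Python sketch. The `SdmlDoc` class and both function names are hypothetical, documents are modelled as plain in-memory objects rather than parsed markup, and the sketch assumes that document and block names do not themselves contain a period.

```python
from dataclasses import dataclass, field


@dataclass
class SdmlDoc:
    docname: str
    blocks: dict = field(default_factory=dict)     # blkname -> block contents
    children: list = field(default_factory=list)   # nested SdmlDoc objects


def combine(new_docname, documents):
    """Enclose the original documents in one new <sdml-doc>."""
    names = [doc.docname for doc in documents]
    if len(set(names)) != len(names):
        raise ValueError("document names must be made unique before combining")
    return SdmlDoc(new_docname, children=list(documents))


def resolve(doc, reference):
    """Resolve 'block' or 'innerdoc.block' relative to the given document."""
    *doc_path, blkname = reference.split(".")
    current = doc
    for name in doc_path:
        matches = [child for child in current.children if child.docname == name]
        if len(matches) != 1:
            raise KeyError("no unique nested document named " + name)
        current = matches[0]
    return current.blocks[blkname]


doc1 = SdmlDoc("doc1", {"block1": "first attachment"})
doc2 = SdmlDoc("doc2", {"block1": "second attachment"})
newdoc = combine("newdoc", [doc1, doc2])
print(resolve(newdoc, "doc1.block1"), "/", resolve(newdoc, "doc2.block1"))
```

References are always resolved relative to the document that contains them, which is why a reference inside doc1 to its own block stays the unqualified 'block1'.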
7 ASN.1 Definition of X.509 Version 1 Certificates
The ASN.1 definition of an X.509 Version 1 certificate is as follows:
Certificate ::= SIGNED SEQUENCE{
version [0] Version DEFAULT 1988,
serialNumber SerialNumber,
signature AlgorithmIdentifier,
issuer Name,
validity Validity,
subject Name,
subjectPublicKeyInfo SubjectPublicKeyInfo}
Version ::= INTEGER {1988(0)}
SerialNumber ::= INTEGER
Validity ::= SEQUENCE{
notBefore UTCTime,
notAfter UTCTime}
SubjectPublicKeyInfo ::= SEQUENCE{
algorithm AlgorithmIdentifier,
subjectPublicKey BIT STRING}
AlgorithmIdentifier ::= SEQUENCE{
algorithm OBJECT IDENTIFIER,
parameters ANY DEFINED BY algorithm OPTIONAL}
The descriptions of the fields are as follows:
version | Indicates the version of X.509 which is being used. Signer certificates may be "1" to "3". |
Serial Number | A unique serial number assigned by the issuer. |
signature | An object identifier which indicates the algorithm used to sign the certificate. The location and format of the actual signature bits are defined by the SIGNED SEQUENCE data type. |
issuer | The distinguished name of the issuer of the certificate. |
validity | The Universal Coordinated Times before which the certificate is invalid and after which the certificate is invalid. The document must have been signed during the validity interval of the certificate. |
subject | The distinguished name of the signer. |
subjectPublicKeyInfo | The algorithm identifier of the subject's public key followed by the bits of the subject's public key. |
8 Field Summary
Below is a summary of the attributes of each of the entities or fields allowed in an SDML electronic document.
8.1 Field Attributes Table
Field Attribute Summary | ||||
Field Name | Containing Blocks | Min Size | Max Size | Optional |
blkname | all | 1 | 64 | |
crit | all | 4 | 5 | Yes |
vers | all | 1 | 8 | Yes |
adata | <attachment> | 1 | N/A | Yes |
algorithm | <signature> | 7 | 7 | |
astatus | <attachment> | 9 | 9 | Yes |
blockref | <signature> | 1 | 76 | |
certdata | <cert> | 1 | N/A | |
certissuer | <signature> <cert> | 1 | 256 | Yes/No |
certserial | <signature> <cert> | 1 | 16 | Yes/No |
certtype | <cert> | 6 | 6 | |
function | <action> | 1 | 16 | |
hash | <signature> | 1 | 256 | |
location | <signature> | 1 | 76 | Yes |
msgtext | <message> | 1 | 76 | |
msgdata | <message> | 1 | N/A | Yes |
nonce | <signature> | 8 | 16 | |
reason | <action> | 1 | 16 | |
retcode | <message> | 1 | 8 | |
sig | <signature> | 80 | 256 | |
sigref | <signature> | 1 | 76 | Yes |
timestamp | <signature> | 16 | 16 | Yes |
useraddr | <signature> | 1 | 76 | Yes |
useremail | <signature> | 1 | 76 | Yes |
useridnum | <signature> | 1 | 76 | Yes |
username | <signature> | 1 | 76 | Yes |
userotherid | <signature> | 1 | 76 | Yes |
userphone | <signature> | 1 | 76 | Yes |
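The size limits in the table can be turned into a simple length check, sketched here in Python. The dictionary and function names are hypothetical; `None` stands for the table's "N/A" maximum, and lengths are counted in characters.

```python
# (minimum size, maximum size) per field, copied from the table above.
FIELD_LIMITS = {
    "blkname": (1, 64), "crit": (4, 5), "vers": (1, 8),
    "adata": (1, None), "algorithm": (7, 7), "astatus": (9, 9),
    "blockref": (1, 76), "certdata": (1, None), "certissuer": (1, 256),
    "certserial": (1, 16), "certtype": (6, 6), "function": (1, 16),
    "hash": (1, 256), "location": (1, 76), "msgtext": (1, 76),
    "msgdata": (1, None), "nonce": (8, 16), "reason": (1, 16),
    "retcode": (1, 8), "sig": (80, 256), "sigref": (1, 76),
    "timestamp": (16, 16), "useraddr": (1, 76), "useremail": (1, 76),
    "useridnum": (1, 76), "username": (1, 76), "userotherid": (1, 76),
    "userphone": (1, 76),
}


def field_length_ok(name, value):
    """Return True when the value's length is within the limits for the field."""
    minimum, maximum = FIELD_LIMITS[name]
    length = len(value)
    return length >= minimum and (maximum is None or length <= maximum)


print(field_length_ok("algorithm", "sha/rsa"), field_length_ok("nonce", "1234"))
```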
9 Verifying Certificates
The rules for verifying certificates are:
1. | Any certificates omitted (with prior agreement) must be obtained from a local database or other means, using <certissuer> and <certserial> fields in the referring <signature> block. |
2. | Certificates may be verified by a byte-wise comparison against copies kept in the recipient's database, cryptographically using the public key of the root, or by both methods. |
3. | The signature date of the document must fall between the not-before and not-after dates in the X.509 certificate. |
4. | Issuer certificates must be checked against the certificate revocation list. |
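Rules 1 to 3 can be sketched in Python as below. The function names and the in-memory certificate store (a dictionary keyed by issuer and serial number) are hypothetical, the validity interval is assumed to include both end points, and revocation checking (rule 4) is not shown.

```python
from datetime import datetime, timezone


def find_certificate(store, certissuer, certserial):
    """Rule 1: look up an omitted certificate by issuer name and serial number."""
    return store.get((certissuer, certserial))


def matches_trusted_copy(received_der, trusted_der):
    """Rule 2 (byte-wise variant): compare against the recipient's stored copy."""
    return received_der == trusted_der


def signed_within_validity(timestamp, not_before, not_after):
    """Rule 3: the <timestamp> must lie inside the certificate validity interval."""
    signed_at = datetime.strptime(timestamp, "%Y%m%dT%H%M%SZ")
    signed_at = signed_at.replace(tzinfo=timezone.utc)
    return not_before <= signed_at <= not_after


not_before = datetime(1998, 1, 1, tzinfo=timezone.utc)
not_after = datetime(1998, 12, 31, 23, 59, 59, tzinfo=timezone.utc)
print(signed_within_validity("19980619T123005Z", not_before, not_after))
```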
The cryptographic verification process for X.509 certificates in an SDML document is as follows. For each sub-document in the SDML document, perform the following steps:
1. | Locate all the <cert> blocks in the sub-document. Extract the <certdata> field contents, convert the hex string to binary, and parse and extract the X.509 contents. |
2. | For each certificate in the sub-document, perform the following
steps:
|
10 Bibliography
[1] ISO, International Organization for Standardization, Case Postale 56, CH-1211, Geneva 20, Switzerland. ISO 8879 Information Processing Systems - Text and Office Systems - Standard Generalized Markup Language (SGML), 1988.
[2] R. Rivest. RFC 1321 The MD5 Message-Digest Algorithm. IETF, April 1992.
[3] R.L. Rivest, A. Shamir, and L.M. Adleman. A method for obtaining digital signatures and public-key cryptosystems. Communications of the ACM, 21(2):120-126, February 1978.
[4] U.S. Department of Commerce / National Institute of Standards and Technology. FIPS Pub 180 - Secure Hash Standard, May 1993.
[5] U.S. Department of Commerce / National Institute of Standards and Technology. FIPS Pub 186 - Digital Signature Standard, May 1993.
[6] CCITT, International Telecommunication Union, General Secretariat, Place Des Nations, CH-1211, Geneva 20, Switzerland. CCITT X.509 The Directory - Authentication Framework, January 1995.
[7] Niklaus Wirth. What can we do about the unnecessary diversity of notation for syntactic definitions. Communications of the ACM, 20(11):822-823, November 1977.
[8] John Backus and Peter Naur. The syntax and semantics of the proposed international algebraic language of the Zurich ACM-GAMM conference. Proceedings of the International Conference on Information Processing, June 1959.
[9] N. Borenstein and N. Freed. RFC 1521 MIME (Multipurpose Internet Mail Extensions) - Part One: Mechanisms for Specifying and Describing the Format of Internet Message Bodies. IETF, September 1993.
[10] ISO, International Organization for Standardization, Case Postale 56, CH-1211, Geneva 20, Switzerland. ISO 8601 Data elements and interchange formats - Information interchange - Representation of dates and times, 1988.
[11] ISO, International Organization for Standardization, Case Postale 56, CH-1211, Geneva 20, Switzerland. ISO 8824 Information Processing Systems - Abstract Syntax Notation One (ASN.1), 1995.
[12] ISO, International Organization for Standardization, Case Postale 56, CH-1211, Geneva 20, Switzerland. ISO 8825 Information Processing Systems - Abstract Syntax Notation One - Specification of Basic Encoding Rules (BER), Canonical Encoding Rules (CER), and Distinguished Encoding Rules (DER), 1995.
11 Issues and Directions
Certain issues are still being discussed and need further investigation. These include:
- Convergence with the new XML standard. XML, a new SGML-based standard from the W3C, is a subset of SGML, as is SDML, but was developed in order to allow for more flexible documents on the World Wide Web. A number of issues related to the differences between XML and SGML, as well as some problems caused by the differences in goals for the two languages, will need to be resolved if SDML is to migrate to become XML-compatible.
- Use of X.509 V3 certificates, or other forms of certificates. SDML is currently "certificate-neutral" in that although it currently uses X.509 V1 certificates, by use of the <certtype> tag, other certificate types can be supported. Use of X.509 V3 certificates has a number of implications that need to be studied, and there are also reasons to investigate the use of SGML-based, "human-readable" certificates.
- SDML currently has a set of "Document Formatting Rules" which are mainly designed to allow SDML documents to pass through existing e-mail systems. As MIME becomes more widely used, these rules may become obsolete, and may be considered as optional.
Appendix A - SGML Document Type Definition (DTD)
<!SGML "ISO 8879:1986"
--                                                --
-- DTD for SDML electronic documents              --
-- First Draft 27 Feb 1996                        --
-- Written by J. Kravitz IBM Research             --
-- Last Revision 21 Jan 1998                      --
-- Version 1.00                                   --
--                                                --

CHARSET
  BASESET "ISO 646:1983//CHARSET International Reference Version (IRV)//ESC 2/5 4/0"
  DESCSET   0   9 UNUSED
            9   2   9
           11   2 UNUSED
           13   1  13
           14  18 UNUSED
           32  95  32
          127   1 UNUSED
  BASESET "ISO Registration Number 100//CHARSET ECMA-94 Right Part of Latin Alphabet Nr. 1//ESC 2/13 4/1"
  DESCSET 128  32 UNUSED
          160  95  32
          255   1 UNUSED

CAPACITY SGMLREF
  TOTALCAP 150000
  GRPCAP   150000

SCOPE DOCUMENT

SYNTAX
  SHUNCHAR CONTROLS 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
                    20 21 22 23 24 25 26 27 28 29 30 31 127 255
  BASESET "ISO 646:1983//CHARSET International Reference Version (IRV)//ESC 2/5 4/0"
  DESCSET 0 128 0
  FUNCTION RE    13
           RS    10
           SPACE 32
           TAB   SEPCHAR 9
  NAMING   LCNMSTRT ""
           UCNMSTRT ""
           LCNMCHAR "-"
           UCNMCHAR "-"
           NAMECASE GENERAL YES
                    ENTITY  NO
  DELIM    GENERAL  SGMLREF
           SHORTREF SGMLREF
  NAMES    SGMLREF
  QUANTITY SGMLREF
           NAMELEN  34
           TAGLVL   100
           LITLEN   1024
           GRPGTCNT 150
           GRPCNT   64

FEATURES
  MINIMIZE DATATAG NO OMITTAG YES RANK NO SHORTTAG NO
  LINK     SIMPLE NO IMPLICIT NO EXPLICIT NO
  OTHER    CONCUR NO SUBDOC NO FORMAL YES

APPINFO NONE
>

<!DOCTYPE sdml [

<!ELEMENT sdml o o ( sdml-doc )>

<!ELEMENT sdml-doc - - ( ( action , ( sdml-doc | signature | cert | attachment | message )+ ) )>
<!ATTLIST sdml-doc
    docname CDATA #REQUIRED
    type    CDATA #REQUIRED >

<!ELEMENT action - - ( ( blkname , crit? , vers? ),( function & reason ) )>

<!ELEMENT signature - - ( ( blkname , crit? , vers? ),( sigdata , sig ) )>

<!ELEMENT sigdata - - ( ( blockref , hash )+ &
                        nonce &
                        sigref? &
                        (certissuer , certserial)? &
                        algorithm &
                        timestamp? &
                        location? &
                        username? &
                        useraddr? &
                        userphone? &
                        useremail? &
                        useridnum? &
                        userotherid? )>

<!ELEMENT cert - - ( ( blkname , crit? , vers? ),( certtype & (certissuer , certserial) & certdata ) )>

<!ELEMENT attachment - - ( ( blkname , crit? , vers? ),( astatus? , adata ) )>

<!ELEMENT message - - ( ( blkname , crit? , vers? ),( retcode & msgtext & msgdata? ) )>

<!ELEMENT blkname     - O (#PCDATA)>
<!ELEMENT crit        - O (#PCDATA)>
<!ELEMENT vers        - O (#PCDATA)>

<!ELEMENT adata       - - (#CDATA)>
<!ATTLIST adata
    encoding (mime | text) text >

<!ELEMENT algorithm   - O (#PCDATA)>
<!ELEMENT astatus     - O (#PCDATA)>
<!ELEMENT blockref    - O (#PCDATA)>
<!ELEMENT certdata    - O (#PCDATA)>
<!ELEMENT certissuer  - O (#PCDATA)>
<!ELEMENT certserial  - O (#PCDATA)>
<!ELEMENT certtype    - O (#PCDATA)>
<!ELEMENT function    - O (#PCDATA)>
<!ELEMENT hash        - O (#PCDATA)>
<!ATTLIST hash
    alg (md5 | sha) #REQUIRED >
<!ELEMENT location    - O (#PCDATA)>
<!ELEMENT msgtext     - O (#PCDATA)>
<!ELEMENT msgdata     - - (#PCDATA)>
<!ELEMENT nonce       - O (#PCDATA)>
<!ELEMENT reason      - O (#PCDATA)>
<!ELEMENT retcode     - O (#PCDATA)>
<!ELEMENT sig         - O (#PCDATA)>
<!ELEMENT sigref      - O (#PCDATA)>
<!ELEMENT timestamp   - O (#PCDATA)>
<!ELEMENT useraddr    - O (#PCDATA)>
<!ELEMENT useremail   - O (#PCDATA)>
<!ELEMENT useridnum   - O (#PCDATA)>
<!ELEMENT username    - O (#PCDATA)>
<!ELEMENT userotherid - O (#PCDATA)>
<!ELEMENT userphone   - O (#PCDATA)>

]>
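As an informal illustration of the sdml-doc content model declared above (an action block followed by one or more sdml-doc, signature, cert, attachment, or message blocks), the following Python sketch checks a list of child element names. The function name and the list-of-names representation are assumptions made for illustration only; this is not an SGML parser, and it does not validate the contents of the nested blocks.

```python
# Hypothetical helper, not part of SDML: checks only the top-level
# content model of <sdml-doc> as declared in the DTD above:
#   ( action , ( sdml-doc | signature | cert | attachment | message )+ )

ALLOWED_BLOCKS = {"sdml-doc", "signature", "cert", "attachment", "message"}


def sdml_doc_children_ok(children):
    """Return True if the child element names fit the sdml-doc content model."""
    # The SGML declaration sets NAMECASE GENERAL YES, so element names
    # are compared without regard to case.
    names = [name.lower() for name in children]

    # An action block is required, followed by at least one other block.
    if len(names) < 2:
        return False
    if names[0] != "action":
        return False

    # Every remaining child must be one of the permitted block types.
    return all(name in ALLOWED_BLOCKS for name in names[1:])
```

For example, `sdml_doc_children_ok(["action", "signature", "cert"])` is accepted, while a document consisting of an action block alone is rejected because the content model requires at least one further block.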
Appendix B - Definitions
Certificate | This is a piece of data, issued by a certificate issuer, that contains the Public Key of a person or organization. By issuing it, the issuer attests that the Public Key is in fact the one owned by the named person or organization. The certificate is usually digitally signed to authenticate it and prevent tampering. The certificate may be thought of as a binding between a public key and the identification of the owner of that public key. |
Certificate Authority | This is either a piece of software, or the organization that uses the software, that issues certificates. Certificate Authorities (usually abbreviated as CAs) may issue certificates to end users, or to other CAs, allowing them, in turn, to issue certificates. The purpose of a Certificate Authority is to act as the assurance that a particular public key belongs to the person or organization identified with that public key. |
Certificate Hierarchy | This is the "chain" of certificates, each one pointing to the issuer of that certificate, that can be followed to authenticate that the certificate owner and its issuer are bona fide. |
Digital Signature | A cryptographic mechanism applied to a file, document, or other piece of data that allows the document to be authenticated as to its creator and contents. Most Digital Signature algorithms involve a Cryptographic Hash combined with a Public Key Encryption. |
Electronic Token | This is an electronic, tamper-resistant device used to perform the digital signing operation, and to hold any secret information, such as private keys, in a secure manner. Examples of this token would be Smart Cards, PCMCIA Cards, PC-bus (e.g. PCI) boards, etc. Some applications may require the use of such a token, as opposed to allowing such signing and secret keeping to be performed on a regular PC. It is not strictly required by the SDML definition that such a token be used; however, SDML is defined in such a way as to support such tokens. |
Public Key | A Public Key is a number (usually a very large number) that is mathematically related to its associated Private Key and is used in Public Key Cryptography. A Public Key can be freely published without loss of security, and, in fact, for digital signature purposes, must be widely distributed. |
Private Key | A Private Key is a number (usually a very large number) that is mathematically related to its associated Public Key and is used in Public Key Cryptography. A Private Key must be very securely hidden and not be made accessible to anyone other than the key owner. Very often, electronic means (and sometimes even explosives!) are used to protect the security of Private Keys. |
Root CA | This is the most authoritative Certificate Authority in a Certificate Hierarchy. (This is a simplification, and assumes a tree-structured certificate hierarchy. Other, more complex structures, with no root or with multiple roots, are possible.) All certificates in the hierarchy can be traced back, possibly through multiple levels of CAs, to the Root CA. |
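The Certificate Hierarchy and Root CA definitions above describe following a chain of certificates from an owner back to a root. A minimal Python sketch of that walk is given below, assuming the simplified tree-structured hierarchy mentioned in the Root CA entry. The `Cert` record, the `chain_to_root` function, the `max_depth` limit, and the convention that a certificate whose issuer equals its subject is the root are all assumptions made for illustration. The sketch only follows issuer links; it performs no signature verification and is not the verification procedure defined elsewhere in this document.

```python
from typing import Dict, List, NamedTuple


class Cert(NamedTuple):
    """Hypothetical, simplified certificate record (illustration only)."""
    subject: str  # identification of the public key owner
    issuer: str   # identification of the issuing CA
    serial: str   # serial number assigned by the issuer


def chain_to_root(leaf: Cert, by_subject: Dict[str, Cert],
                  max_depth: int = 16) -> List[Cert]:
    """Follow issuer links from leaf up to a self-issued certificate.

    Assumption: a certificate whose issuer equals its subject is the Root CA.
    No cryptographic checks are made here.
    """
    chain = [leaf]
    current = leaf
    while current.issuer != current.subject:
        # The depth limit also guards against cycles in malformed data.
        if len(chain) >= max_depth:
            raise ValueError("certificate chain too long or cyclic")
        parent = by_subject.get(current.issuer)
        if parent is None:
            raise KeyError("no certificate found for issuer " + current.issuer)
        chain.append(parent)
        current = parent
    return chain


# Usage: an end-user certificate issued by a CA that the root certified.
root = Cert(subject="Root CA", issuer="Root CA", serial="1")
bank = Cert(subject="Bank CA", issuer="Root CA", serial="2")
user = Cert(subject="Alice", issuer="Bank CA", serial="3")
known = {c.subject: c for c in (root, bank)}
chain = chain_to_root(user, known)
```

In a real application each link would also need its digital signature checked against the issuer's public key; the sketch leaves that step out deliberately.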
Appendix C - Acknowledgments
The creation and continued enhancement of this document was greatly assisted by many people, including the following contributors from the FSTC E-Check Technical Team:
Jim Akister | RDM |
Milt Anderson | Bellcore |
Sheueling Chang | Sun Microsystems |
Greg Dunne | Telequip |
Mark Feldman | CommerceNet |
Nikki Fischer | Huntington Banks |
John Fricke | Chase Bank |
Michael Halperin | BBN Planet |
Chris Hibbert | Agorics |
Eric Hill | Agorics |
Frank Jaffe | BankBoston |
David Lant | RDM |
An Le | National Semiconductor |
Stuart Marks | Sun Microsystems |
Cyndi Mills | BBN |
Elaine Palmer | IBM Research |
Brian Risman | Bank of Montreal |
Robert Rocchetti | Sun Microsystems |
Jim Seck | Unisys |
Mark Smith | Oak Ridge National Lab |
Sean Smith | IBM Research |
Tony Smith | Intranet |
Dave Solo | BBN |
Kurt Thams | Agorics |
Gene Tsudik | USC-ISI |
Paridhi Verma | IBM Research |
Jyri Virkki | Bellcore |
Gary Werner | Unisys |