1. Why an explicit DTD? Because the receiver must be given the option to validate, rather than simply taking on a role where anything can be dumped on it. Run-time or synchronous validation against arbitrary DTDs/schemas is impractical. Down the road, provided the "canonical" DTD is well designed, the DTD could be taken as architectural, allowing for a well-defined set of variations. This needs a declarative feature, however (ATTLISTs in an internal subset, while part of traditional AF syntax, may not be the only way to go.)

2. Why is 'fault' at the top level? Because I think fault detection should occur as early as possible on the receiving end. Also, if the fault description mechanism needs extension or elaboration in the future, it's best to plan for this now by treating faults as a distinct message type altogether (rather than as a subset of the 'result' type.)

3. What's with 'data'? Probably overspecification. The motivation was to provide a canonical format for handling forward references (inevitable in data graphs with cycles.) In retrospect, canonicalization is probably overkill - we should bend the rules for runtime optimizations. I think getting rid of 'data' (and allowing for tactical variance from the strict composition semantic of the element hierarchy) will need another attribute, either on 'scalar' or on the types that can contain scalars, to record the fact that direct containment is tactical rather than factual - we'll need this for the multi-reference case also, since the target will have to be recorded somewhere.

4. What is 'notice'? Something for the messaging crowd. The semantic to be supported is notification, and the assumption is that the data content of the message is completely application-defined. I have the content model as ANY, but it could just as easily be #PCDATA, with the contents suitably wrapped. The idea is that our framework offers no services beyond a pure passthrough in this case. This is also a poor man's extension mechanism.

5. Why both 'type' and 'class' attributes? To express the very important distinction between structure and semantics. 'type' is for structural information that could affect things like in-core format and construction algorithms; 'class' is for semantic information - such as the object class, or the package to bless the structure into.

6. Then what about 'name' attributes? These are for formals - names of struct members, argument lists, etc. The general semantic here is based on the Lisp-ish distinction between names and values: values exist by themselves, while names have values bound to them. It is possible, if not likely, for more than one name to have the same value bound (in the sense of 'object identity'), so the correct representation must be one where the name in question is in markup separate from the markup used to represent the value.

7. Why an explicit 'null'? Optimization: for non-sparse arrays, for instance, 'null' can be used as fill (hence the 'count' attribute.) Also, since XML has lost the distinction SGML usually accords to EMPTY declared content (i.e. in XML an empty-element tag and an empty start-tag/end-tag pair *are* equivalent, sadly), I don't want to rule out the need to distinguish an explicit 'null' element from a merely empty one - the latter being what I could otherwise have used to represent a null value, while the former would be the *expected* result of a 'void' procedure.
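To make items 5-7 concrete, here is a minimal sketch of one possible encoding of a structure, assuming the element and attribute names discussed above ('struct', 'scalar', 'null', 'type', 'class', 'name'); the particular type values, the class name, and the nesting are illustrative guesses on my part, not anything the draft DTD fixes:

  <struct name='person' type='hash' class='My::Person'>
    <scalar name='first' type='string'>Larry</scalar>
    <scalar name='age' type='int'>42</scalar>
    <null name='middle'/>
  </struct>

Note how the names live in markup separate from the values they are bound to, so two members could be bound to the same value without the value's own markup having to carry either name.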
8. Array support. Through the 'dim' and 'index' attributes. I forgot to add a 'unit' attribute to the 'array' element type, to represent the size or granularity of the atomic items in the array. Interpretation is controlled by the value of the 'type' attribute (see the sketch at the end of these notes):
- 'linear': a one-dimensional array; 'dim' is a single number recording the length.
- 'multi': a row-major linearization of a multi-dimensional array; 'dim' is a space-separated list of numbers, the product of which is the overall length.
- 'sparse': a possibly random-order linearization; 'dim' gives the overall size just as for 'multi', while the 'index' attribute on each 'item' locates its position as a corresponding set of coordinates.

9. Why a single 'scalar' type? Because the universe of scalar distinctions is unbounded. The trap to avoid is overspecification of ontological distinctions ("integer", "float", "string", "foo") that really don't matter in a *text* format. Effectively, the #PCDATA content of a scalar *is* always subject to a controlling notation - but messing with NOTATION declarations and attributes and the like seemed overkill here. Hence the #REQUIRED status of the 'type' attribute, as a minimal stand-in for all such considerations. We should assume that the deserialization logic "knows" about the various scalars supported. If this seems unreasonable, then the 'type' attribute should be restricted to a name token group. An open issue is whether a separate 'encoding' attribute is needed.

10. What's with 'map' and 'pair'? Don't want to leave the Schemers out! :) Actually, I think 'pair' captures a basic structural data type (association.) It may not get used - or perhaps could be used for a variant representation of hashes - but I think it belongs in the spec for completeness. Also note that ordinary Perl hashes (where keys are known to be pure strings, rather than possibly stringified representations) should use the 'struct' type for encoding. Map+pair could be used to undo stringification complications in other cases (as well as for hashes in other schemas that don't abide by the Perl-ish restriction to string keys, e.g. 'atom's for keys.) [I believe complete orthogonality - only one way to do it - is a chimera if we're trying to be as inclusive as possible in the spec.]

11. Issues not discussed yet:
- headers (such as 'mustUnderstand') and meta-data in general
- alternate transport(s)
- digital signatures
- multiple return values
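As promised under item 8, here is a hedged sketch of the array and map/pair encodings, again assuming the names from the notes above ('array', 'item', 'null', 'map', 'pair', 'dim', 'index', 'count', 'type'); whether 'item' wraps each element, whether a bare 'null' may appear as fill inside an 'array', and the exact coordinate syntax for 'index' are my assumptions rather than settled parts of the draft:

  <!-- linear array of length 5, with trailing nulls as fill -->
  <array type='linear' dim='5'>
    <item><scalar type='int'>3</scalar></item>
    <item><scalar type='int'>9</scalar></item>
    <null count='3'/>
  </array>

  <!-- sparse 3x3 array with two populated cells -->
  <array type='sparse' dim='3 3'>
    <item index='0 2'><scalar type='float'>1.5</scalar></item>
    <item index='2 0'><scalar type='float'>-4.25</scalar></item>
  </array>

  <!-- map of pairs, e.g. for keys that aren't plain strings -->
  <map>
    <pair>
      <scalar type='atom'>colour</scalar>
      <scalar type='string'>green</scalar>
    </pair>
  </map>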