> > > In XML, typing specifies validation algorithms.
> >
> > Hmm... well it also allows software to unpack XML instances into native
> > storage in a sensible way without being savvy to the internals of the
> > applications that generated or will use the data.
>
> SOAP toolkits wouldn't be anywhere near as appealing to the average
> developer if it wasn't for this "feature".

I agree this is an important application. (I'll refer to this as
"data-binding".)

> As much as I love the simplicity
> and elegance of RELAX NG, it's simply not suitable for generating
> language-specific bindings that can both generate and parse XML.

Could you expand on this? I believe Relaxer [1] supports data-binding
for RELAX and has some experimental support for RELAX NG.

I can think of a couple of reasons why people might think RELAX NG
unsuitable for data-binding.

1. Type assignment. The only semantics that RELAX NG defines for a
schema is whether or not it matches an instance. Unlike XSD, it doesn't
define a mapping between elements and attributes in the instance and
particular element and attribute patterns in the schema.

There are a number of reasons that the RELAX NG spec doesn't address
this.

(a) Although there are many applications that need type assignment,
there are also many applications that don't.

(b) There's no one single right way to do type assignment. You can
choose to impose various different constraints on the schema to make
ambiguities impossible, or you can specify rules to be used to resolve
ambiguities. XSD uses a rather ad hoc mixture. It imposes constraints
in the form of the "Element Declarations Consistent" and "Unique
Particle Attribution" constraints, but for unions of simple types, it
resolves ambiguities by preferring the first match.

(c) Type assignment has a cost.
If you impose constraints, then those constraints have to be specified
(which increases the complexity of the spec) and enforced (which
increases the complexity of the implementation); users have to learn
them (which decreases ease of learning); users have to write their
schemas so as to satisfy the constraints (which decreases ease of use);
and certain things are no longer expressible.

These considerations lead me to the conclusion that it is better to
deal with type assignment in a modular way as a separate specification.
I believe this is possible and practical. Concretely, I envisage having
a "RELAX NG Type Assignment" spec that standardizes one or more
annotation attributes that can assert that the schema satisfies
particular constraints which facilitate type assignment. A processor
that implemented this spec would be able to report whether or not the
schema satisfied the asserted constraint. Dealing with type assignment
in this way allows those who do not care about type assignment not to
have to pay for it. It also allows there to be multiple ways of doing
type assignment that make different tradeoffs between expressiveness
and runtime performance.

2. Inheritance. XSD provides a complex type hierarchy whereas RELAX NG
doesn't. Now, using inheritance in XSD isn't compulsory. I can write
schemas using model groups and attribute groups without any use of
restriction or extension (except for the trivial, implicit extension of
the ur-type). If XSD data-binding implementations can handle such
schemas, which don't make any use of inheritance, the absence of a
complex type hierarchy cannot be an insuperable barrier to data-binding
implementation.

It seems to me that the benefit of having the complex type hierarchy is
that it allows a data-binding implementation to produce better, more
natural class definitions.
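
As a rough illustration of what "more natural" might mean here, the
sketch below contrasts two ways a binding tool could map the same
content model to classes. Everything in it is an assumption made for
the example: the `Address`, `UsAddressFlat` and `UsAddress` classes,
their fields, and the choice of Python (rather than Java or C#) are
hypothetical and are not taken from Relaxer or any other real tool.

```python
from dataclasses import dataclass, fields


# Hypothetical class generated from a shared "address" group.
@dataclass
class Address:
    street: str
    city: str


# Binding WITHOUT hierarchy information: the shared group is simply
# expanded into every class whose content model refers to it.
@dataclass
class UsAddressFlat:
    street: str
    city: str
    zip_code: str


# Binding WITH hierarchy information: the shared group becomes a base class.
@dataclass
class UsAddress(Address):
    zip_code: str


def field_names(cls):
    """Return the generated field names of a dataclass, in order."""
    return [f.name for f in fields(cls)]


flat = UsAddressFlat("1 Main St", "Springfield", "12345")
derived = UsAddress("1 Main St", "Springfield", "12345")

# Both bindings hold the same data in the same fields ...
print(field_names(UsAddressFlat) == field_names(UsAddress))
# ... but only the derived class is usable where an Address is expected.
print(isinstance(flat, Address), isinstance(derived, Address))
```

Both variants round-trip the same instance data, which is why the lack
of a hierarchy is not a barrier in itself; the difference is only in how
conveniently the generated classes fit into the host language.
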
It also seems to me that any production data-binding implementation is
going to want to provide a way for the user to control the way the
schema gets mapped into classes, for example, to control the package
name, the class name or the way non-trivial content models get handled.
Typically, I guess annotations would be used for this. So, here's my
question: why can't annotations be used in the same way to allow the
user to control the inheritance hierarchy in the generated classes? It
doesn't seem to me that such annotations need to be very complicated;
for example, you could have an annotation on <ref> that said this <ref>
was a reference to a base class. As in the type-assignment case, if
there's a need, such annotations can be standardized in a separate
spec.

In conclusion, my view is that although XSD out of the box provides
much more support for data-binding than RELAX NG, nonetheless RELAX NG
provides a suitable basis on which to build support for data-binding.
The RELAX NG approach gives a lot of flexibility and avoids imposing
costs on those who use XML as more than just a serialization format for
C# and Java. However, I have to admit that until such time as the kinds
of annotation I mentioned above get standardized, RELAX NG provides
less interoperability than XSD for data-binding.

James

[1] http://www.asahi-net.or.jp/~dp8t-asm/java/tools/Relaxer/



