- From: Mike.Champion@S...
- To: xml-dev@l...
- Date: Thu, 29 Mar 2001 09:04:59 -0500
> -----Original Message-----
> From: John Aldridge [mailto:john.aldridge@i...]
> Sent: Thursday, March 29, 2001 5:35 AM
> To: xml-dev
> Subject: RE: Syntax Sugar and XML information models
> (b) Editors all write a _standard_ normal form (i.e. not just
> a normal form of their own choosing)
This is more or less what I was hoping we could collectively
define, and "standard normal form" sounds a lot better than "Syntax Sugar
Information Set."

And to answer Rick Jelliffe's question: I agree that
the W3C InfoSet is a reasonable model for what people care about when
navigating or transforming a document, but we need a richer model for editors
and databases. (These are two sides of the same coin, since a database must
round-trip whatever is significant to an editor, and an editor must preserve
whatever is significant to a database.)

BUT I'm not sure I agree "that means you are *not* interested in the information set
of the document, but the actual text of the document's entities.
That is a fine thing. Let there be element-based (infoset) editors and
entity-based (tag-aware) editors". Databases (and arguably editors)
*should* be interested in the information set of a document rather than just the
bytes that make it up, but they need a richer information set than the W3C
InfoSet.

I'm hoping to find a middle ground between "editors and
databases must simply round-trip the (core) infoset" and "editors and
databases must round-trip every single character". My first cut at this
is that the "standard normal form" is Canonical XML + external entity
references + CDATA sections ... I'm sure there is more.

As for the order of attributes, doesn't XML 1.0 specifically declare this to
be insignificant?
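
To make the gap concrete, here is a minimal sketch of what plain canonicalization keeps and drops. It assumes Python 3.8 or later, whose `xml.etree.ElementTree.canonicalize` implements C14N 2.0 rather than the Canonical XML 1.0 under discussion; the two sample documents are made up for illustration. External entity references are not shown, since this parser does not expose them.

```python
from xml.etree.ElementTree import canonicalize

# Two hypothetical serializations of the same content: they differ only in
# attribute order and in whether the text is wrapped in a CDATA section.
doc_a = '<doc b="2" a="1"><![CDATA[x < y]]></doc>'
doc_b = '<doc a="1" b="2">x &lt; y</doc>'

canon_a = canonicalize(doc_a)
canon_b = canonicalize(doc_b)

print(canon_a)
print(canon_b)

# Canonicalization sorts attributes and replaces the CDATA section with
# escaped character data, so both inputs collapse to the same string.
print(canon_a == canon_b)
```

Attribute order disappears, which seems harmless if it really is insignificant, but the CDATA section disappears along with it. That is the kind of detail a "standard normal form" would have to add back on top of Canonical XML if editors and databases are expected to round-trip it.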