1.3 General Considerations

String Comparisons in the DOM

The DOM has many interfaces that imply string matching. For XML, string comparisons are case-sensitive and performed with a binary comparison of the 16-bit units of the DOMStrings. However, for case-insensitive markup languages, such as HTML 4.01 or earlier, these comparisons are case-insensitive where appropriate.

Note that HTML processors often perform specific case normalizations (canonicalization) of the markup before the DOM structures are built, typically converting element names to uppercase and attribute names to lowercase. For this reason, applications should also compare element and attribute names returned by the DOM implementation in a case-insensitive manner.
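The two comparison modes described above can be sketched as follows. This is a minimal illustration, not part of the specification: the class and method names are hypothetical, and it assumes the application already knows whether a name came from an XML or an HTML document.

```java
final class DomNameComparison {
    // XML: names match only if their 16-bit (UTF-16) units are identical.
    static boolean sameXmlName(String a, String b) {
        return a.equals(b);
    }

    // HTML 4.01 and earlier: ignore case, since the processor may have
    // upper-cased element names or lower-cased attribute names.
    static boolean sameHtmlName(String a, String b) {
        return a.equalsIgnoreCase(b);
    }

    public static void main(String[] args) {
        System.out.println(sameXmlName("BODY", "body"));  // false
        System.out.println(sameHtmlName("BODY", "body")); // true
    }
}
```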

Character normalization, i.e., transforming text into its fully normalized form as defined in [XML11], is assumed to happen at serialization time. The DOM Level 3 Load and Save module [DOMLS] provides a serialization mechanism (see the DOMSerializer interface, section 2.3.1) and uses the DOMConfiguration parameters "normalize-characters" and "check-character-normalization" to ensure that text is fully normalized [XML11]. Other serialization mechanisms built on top of the DOM Level 3 Core must also ensure that text is fully normalized.
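Because DOM comparisons are binary, two strings that render identically can still compare as different until they are normalized. The sketch below illustrates this with java.text.Normalizer. It assumes Unicode NFC as a stand-in for normalization; NFC alone is not claimed to be equivalent to "fully normalized" as defined in [XML11], and the class name is hypothetical.

```java
import java.text.Normalizer;

final class NormalizationSketch {
    // Applies Unicode Normalization Form C (an assumption for illustration).
    static String toNfc(String text) {
        return Normalizer.normalize(text, Normalizer.Form.NFC);
    }

    public static void main(String[] args) {
        String precomposed = "caf\u00E9";  // "é" as a single code unit
        String decomposed = "cafe\u0301";  // "e" followed by a combining acute accent

        // Binary comparison sees two different strings.
        System.out.println(precomposed.equals(decomposed));        // false
        // After normalization the strings are identical.
        System.out.println(precomposed.equals(toNfc(decomposed))); // true
    }
}
```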

DOM URIs

The DOM specification relies on DOMString values as resource identifiers, such that the following conditions are met:

  1. An absolute identifier absolutely identifies a resource on the Web;

  2. Simple string equality establishes equality of absolute resource identifiers, and no other equivalence of resource identifiers is considered significant to the DOM specification;

  3. A relative identifier is easily detected and made absolute relative to an absolute identifier;

  4. Retrieval of content of a resource may be accomplished where required.

The term "absolute URI" refers to a complete resource identifier and the term "relative URI" refers to an incomplete resource identifier.

Within the DOM specifications, these identifiers are called URIs, "Uniform Resource Identifiers", but this is meant abstractly. The DOM implementation does not necessarily process its URIs according to the URI specification [URIRef]. Generally the particular form of these identifiers must be ignored.

When it is not possible to completely ignore the type of a DOM URI, either because a relative identifier must be made absolute or because content must be retrieved, the DOM implementation must at least support identifier types appropriate to the content being processed. [HTML40], [XML], and the associated namespace specification [Namespaces] rely on [URIRef] to determine permissible characters and to resolve relative URIs. Other specifications, such as namespaces in XML 1.1 [Namespaces11], may rely on alternative resource identifier types that may, for example, include non-ASCII characters, necessitating support for alternative resource identifier types where required by applicable specifications.
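The conditions listed in this section (detecting a relative identifier, making it absolute, and comparing absolute identifiers by simple string equality) can be sketched in Java. This is only an illustration: it assumes java.net.URI as the identifier type, which a DOM implementation is not required to use, and the class name and example.org addresses are hypothetical.

```java
import java.net.URI;

final class DomUriSketch {
    // A relative identifier is one without a scheme.
    static boolean isAbsolute(String identifier) {
        return URI.create(identifier).isAbsolute();
    }

    // Makes a relative identifier absolute against an absolute one.
    static String makeAbsolute(String base, String relative) {
        return URI.create(base).resolve(relative).toString();
    }

    // Simple string equality: no other equivalence is considered.
    static boolean sameIdentifier(String a, String b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        String base = "http://example.org/docs/book.xml";
        System.out.println(isAbsolute("chapter2.xml"));         // false
        System.out.println(makeAbsolute(base, "chapter2.xml")); // http://example.org/docs/chapter2.xml
        // Differ only in case, so they are different identifiers to the DOM.
        System.out.println(sameIdentifier("http://example.org/a", "HTTP://EXAMPLE.ORG/a")); // false
    }
}
```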

XML Namespaces

DOM Level 2 and 3 support XML namespaces [Namespaces] by augmenting several interfaces of the DOM Level 1 Core to allow creating and manipulating elements and attributes associated with a namespace. When [XML11] is in use (see Document.xmlVersion), DOM Level 3 also supports [Namespaces11].

As far as the DOM is concerned, special attributes used for declaring XML namespaces are still exposed and can be manipulated just like any other attribute. However, nodes are permanently bound to namespace URIs as they get created. Consequently, moving a node within a document, using the DOM, in no case results in a change of its namespace prefix or namespace URI. Similarly, creating a node with a namespace prefix and namespace URI, or changing the namespace prefix of a node, does not result in any addition, removal, or modification of any special attributes for declaring the appropriate XML namespaces. Namespace validation is not enforced; the DOM application is responsible. In particular, since the mapping between prefixes and namespace URIs is not enforced, in general, the resulting document cannot be serialized naively. For example, applications may have to declare every namespace in use when serializing a document.
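The following sketch illustrates the two behaviors just described: creating a namespaced element adds no namespace declaration attributes, and moving a node leaves its prefix and namespace URI unchanged. It uses the standard Java binding (org.w3c.dom with a JAXP DocumentBuilderFactory); the class name, element names, and example.org namespace URIs are hypothetical.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class NamespaceBindingSketch {
    static Document newDocument() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Builds a small tree, moves "a:child" under "b:other", and returns it.
    static Element buildAndMoveChild() {
        Document doc = newDocument();
        Element root = doc.createElementNS("http://example.org/a", "a:root");
        Element other = doc.createElementNS("http://example.org/b", "b:other");
        Element child = doc.createElementNS("http://example.org/a", "a:child");
        doc.appendChild(root);
        root.appendChild(other);
        root.appendChild(child);
        other.appendChild(child); // move within the document
        return child;
    }

    public static void main(String[] args) {
        Element child = buildAndMoveChild();
        // The binding made at creation time survives the move.
        System.out.println(child.getPrefix() + " -> " + child.getNamespaceURI());
        // No xmlns attributes were added by the DOM.
        System.out.println(child.getOwnerDocument().getDocumentElement().hasAttributes());
    }
}
```

A serializer working from such a tree would have to add the xmlns declarations itself, which is the responsibility the text assigns to the application.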

In general, the DOM implementation (and higher) doesn't perform any URI normalization or canonicalization. The URIs given to the DOM are assumed to be valid (e.g., characters such as white space are properly escaped), and no lexical checking is performed. Absolute URI references are treated as strings and compared literally. How relative namespace URI references are treated is undefined. To ensure interoperability, only absolute namespace URI references (i.e., URI references beginning with a scheme name and a colon) should be used. Applications should use the value null as the namespaceURI parameter for methods if they wish to have no namespace. In programming languages where empty strings can be differentiated from null, empty strings, when given as a namespace URI, are converted to null. This is true even though the DOM does no lexical checking of URIs.
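An application that receives namespace URIs from elsewhere can apply the null convention itself before calling the DOM, rather than relying on the implementation's conversion. The helper below is a hypothetical sketch of that approach; its class and method names are illustrative and not part of any DOM API.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class NoNamespaceSketch {
    static Document newDocument() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Maps "no namespace" (null or empty string) to null, as the DOM expects.
    static String namespaceArgument(String uri) {
        return (uri == null || uri.isEmpty()) ? null : uri;
    }

    static Element createElement(Document doc, String uri, String qualifiedName) {
        return doc.createElementNS(namespaceArgument(uri), qualifiedName);
    }

    public static void main(String[] args) {
        Element item = createElement(newDocument(), "", "item");
        System.out.println(item.getNamespaceURI()); // null
        System.out.println(item.getLocalName());    // item
    }
}
```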

NOTE: Element.setAttributeNS(null, ...) puts the attribute in the per-element-type partitions as defined in [XML Namespace Partitions] in [Namespaces].

NOTE: In the DOM, all namespace declaration attributes are by definition bound to the namespace URI "http://www.w3.org/2000/xmlns/". These are the attributes whose namespace prefix or qualified name is "xmlns" as introduced in [Namespaces11].
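Since the DOM never adds namespace declarations on its own, an application can add them explicitly as ordinary attributes bound to the xmlns namespace URI. The sketch below shows one way to do so with the Java binding; the helper name declareNamespace and the example.org URI are assumptions for illustration.

```java
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class NamespaceDeclarationSketch {
    static Document newDocument() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Adds xmlns:prefix="uri" to the element and returns the new attribute.
    static Attr declareNamespace(Element element, String prefix, String uri) {
        // XMLConstants.XMLNS_ATTRIBUTE_NS_URI is "http://www.w3.org/2000/xmlns/".
        element.setAttributeNS(XMLConstants.XMLNS_ATTRIBUTE_NS_URI, "xmlns:" + prefix, uri);
        return element.getAttributeNodeNS(XMLConstants.XMLNS_ATTRIBUTE_NS_URI, prefix);
    }

    public static void main(String[] args) {
        Document doc = newDocument();
        Element root = doc.createElementNS("http://example.org/a", "a:root");
        Attr declaration = declareNamespace(root, "a", "http://example.org/a");
        System.out.println(declaration.getNodeName() + " is bound to " + declaration.getNamespaceURI());
    }
}
```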

In a document with no namespaces, the child list of an EntityReference node is always the same as that of the corresponding Entity. This is not true in a document where an entity contains unbound namespace prefixes. In such a case, the descendants of the corresponding EntityReference nodes may be bound to different namespace URIs, depending on where the entity references are. Also, because, in the DOM, nodes always remain bound to the same namespace URI, moving such EntityReference nodes can lead to documents that cannot be serialized. This is also true when the DOM Level 1 method Document.createEntityReference(name) is used to create entity references that correspond to such entities, since the descendants of the returned EntityReference are unbound. While DOM Level 3 does have support for the resolution of namespace prefixes, use of such entities and entity references should be avoided or used with extreme care.

The "NS" methods, such as Document.createElementNS(namespaceURI, qualifiedName) and Document.createAttributeNS(namespaceURI, qualifiedName), are meant to be used by namespace aware applications. Simple applications that do not use namespaces can use the DOM Level 1 methods, such as Document.createElement(tagName) and Document.createAttribute(name). Elements and attributes created in this way do not have any namespace prefix, namespace URI, or local name.
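The difference between the two creation styles can be observed directly on the resulting nodes. The following sketch uses the Java binding; the class name is hypothetical, and the null results shown for the Level 1 method follow from the paragraph above.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class CreationStyleSketch {
    static Document newDocument() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = newDocument();

        // DOM Level 1 style: no prefix, namespace URI, or local name.
        Element plain = doc.createElement("item");
        System.out.println(plain.getLocalName());    // null
        System.out.println(plain.getNamespaceURI()); // null

        // Namespace-aware style, here with no namespace (null URI).
        Element aware = doc.createElementNS(null, "item");
        System.out.println(aware.getLocalName());    // item
        System.out.println(aware.getNamespaceURI()); // null
    }
}
```

Code that tests getLocalName() therefore needs to handle null for nodes created through the Level 1 methods.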

NOTE: DOM Level 1 methods are namespace ignorant. Therefore, while it is safe to use these methods when not dealing with namespaces, using them and the namespace-aware methods at the same time should be avoided. DOM Level 1 methods identify attribute nodes solely by their Node.nodeName. By contrast, the DOM Level 2 methods related to namespaces identify attribute nodes by their Node.namespaceURI and Node.localName. Because of this fundamental difference, mixing both sets of methods can lead to unpredictable results. In particular, using Element.setAttributeNS(namespaceURI, qualifiedName, value), an element may have two (or more) attributes that have the same Node.nodeName but different Node.namespaceURIs. Calling Element.getAttribute(name) with that nodeName could then return any of those attributes; the result depends on the implementation. Similarly, using Element.setAttributeNode(newAttr), one can set two (or more) attributes that have different Node.nodeNames but the same Node.prefix and Node.namespaceURI. In this case Element.getAttributeNodeNS(namespaceURI, localName) will return either attribute, in an implementation-dependent manner. The only guarantee in such cases is that all methods that access a named item by its nodeName will access the same item, and all methods that access a node by its URI and local name will access the same node. For instance, Element.setAttribute(name, value) and Element.setAttributeNS(namespaceURI, qualifiedName, value) affect the node that Element.getAttribute(name) and Element.getAttributeNS(namespaceURI, localName), respectively, return.
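The first hazard in the note can be reproduced with a short sketch: two attributes share the nodeName "p:id" but live in different namespaces. The class name and example.org URIs are hypothetical. The sketch deliberately prints no expected value for the Level 1 lookup, because that result is implementation dependent.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class MixedMethodsSketch {
    static final String NS_A = "http://example.org/a";
    static final String NS_B = "http://example.org/b";

    // Returns an element carrying two attributes that are both named "p:id".
    static Element elementWithClashingNames() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().newDocument();
            Element item = doc.createElementNS(null, "item");
            item.setAttributeNS(NS_A, "p:id", "1");
            item.setAttributeNS(NS_B, "p:id", "2");
            return item;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Element item = elementWithClashingNames();
        // Namespace-aware lookups are unambiguous.
        System.out.println(item.getAttributeNS(NS_A, "id")); // 1
        System.out.println(item.getAttributeNS(NS_B, "id")); // 2
        // The Level 1 lookup may return either value.
        System.out.println(item.getAttribute("p:id"));
    }
}
```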

Base URIs

DOM Level 3 adds support for the [base URI] property defined in [InfoSet] by providing a new attribute on the Node interface that exposes this information. However, unlike the Node.namespaceURI attribute, the Node.baseURI attribute is not a static piece of information that every node carries. Instead, it is a value that is dynamically computed according to [XMLBase]. This means its value depends on the location of the node in the tree, and moving the node from one place to another in the tree may affect its value. Other changes, such as adding or changing an xml:base attribute on the node being queried or on one of its ancestors, may also affect its value.
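The dynamic nature of Node.baseURI can be seen by moving a node between two subtrees that carry different xml:base attributes. The sketch below uses the Java binding (Node.getBaseURI()); the class name, element names, and example.org URIs are hypothetical, and absolute xml:base values are used so that no relative resolution is involved.

```java
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class BaseUriSketch {
    // Creates an element with an xml:base attribute.
    static Element section(Document doc, String name, String base) {
        Element section = doc.createElementNS(null, name);
        section.setAttributeNS(XMLConstants.XML_NS_URI, "xml:base", base);
        return section;
    }

    // Returns the base URI of the same node before and after it is moved.
    static String[] baseUriBeforeAndAfterMove() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().newDocument();
            Element root = doc.createElementNS(null, "root");
            Element first = section(doc, "first", "http://example.org/a/");
            Element second = section(doc, "second", "http://example.org/b/");
            Element para = doc.createElementNS(null, "para");
            doc.appendChild(root);
            root.appendChild(first);
            root.appendChild(second);
            first.appendChild(para);
            String before = para.getBaseURI();
            second.appendChild(para); // move to the other subtree
            String after = para.getBaseURI();
            return new String[] { before, after };
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String[] result = baseUriBeforeAndAfterMove();
        System.out.println(result[0]); // http://example.org/a/
        System.out.println(result[1]); // http://example.org/b/
    }
}
```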

One consequence of this is that when external entity references are expanded while building a Document, one may need to add or change an xml:base attribute on the Element nodes originally contained in the entity being expanded so that Node.baseURI returns the correct value. In the case of ProcessingInstruction nodes originally contained in the entity being expanded, the information is lost. [DOMLS] handles elements as described here and generates a warning in the latter case.

Mixed DOM Implementations

As new XML vocabularies are developed, those defining the vocabularies are also beginning to define specialized APIs for manipulating XML instances of those vocabularies. This is usually done by extending the DOM to provide interfaces and methods that perform operations frequently needed by their users. For example, the MathML [MathML2] and SVG [SVG1] specifications have developed DOM extensions to allow users to manipulate instances of these vocabularies using semantics appropriate to images and mathematics, respectively, as well as the generic DOM XML semantics. Instances of SVG or MathML are often embedded in XML documents conforming to a different schema such as XHTML.

While the Namespaces in XML specification [Namespaces] provides a mechanism for integrating these documents at the syntax level, it has become clear that the DOM Level 2 Recommendation [DOM2Core] is not rich enough to cover all the issues that have been encountered in having these different DOM implementations be used together in a single application. DOM Level 3 deals with the requirements brought about by embedding fragments written according to a specific markup language (the embedded component) in a document where the rest of the markup is not written according to that specific markup language (the host document). It does not deal with fragments embedded by reference or linking.

A DOM implementation supporting DOM Level 3 Core should be able to collaborate with subcomponents implementing specific DOMs to assemble a compound document that can be traversed and manipulated via DOM interfaces as if it were a seamless whole.

The normal typecast operation on an object should support the interfaces expected by legacy code for a given document type. Typecasting techniques may not be adequate for selecting between multiple DOM specializations of an object which were combined at run time, because they may not all be part of the same object as defined by the binding's object model. Conflicts are most obvious with the Document object, since it is shared as owner by the rest of the document. In a homogeneous document, elements rely on the Document for specialized services and construction of specialized nodes. In a heterogeneous document, elements from different modules expect different services and APIs from the same Document object, since there can only be one owner and root of the document hierarchy.

DOM Features

Each DOM module defines one or more features, as listed in the conformance section ([Conformance]). Feature names are case-insensitive, and each feature is defined for a specific set of versions. For example, this specification defines the features "Core" and "XML" for the version "3.0". Versions "1.0" and "2.0" can also be used for features defined in the corresponding DOM Levels. To avoid possible conflicts, as a convention, names referring to features defined outside the DOM specification should be made unique. Applications can then request features to be supported by a DOM implementation using the methods DOMImplementationSource.getDOMImplementation(features) or DOMImplementationSource.getDOMImplementationList(features), check the features supported by a DOM implementation using the method DOMImplementation.hasFeature(feature, version), or check those supported by a specific node using Node.isSupported(feature, version). Note that when using the methods that take a feature and a version as parameters, applications can use null or the empty string for the version parameter if they don't wish to specify a particular version for the specified feature.
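A feature query along these lines might look as follows in Java. This is a sketch only: it assumes the DOMImplementation obtained from the default JAXP DocumentBuilder supports DOM Level 3 Core, and the class name is hypothetical.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.DOMImplementation;

final class DomFeatureSketch {
    // Returns the DOMImplementation behind the default JAXP parser.
    static DOMImplementation defaultImplementation() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().getDOMImplementation();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        DOMImplementation impl = defaultImplementation();
        // A specific version of a feature.
        System.out.println(impl.hasFeature("Core", "3.0"));
        // Feature names are case-insensitive; null means "any version".
        System.out.println(impl.hasFeature("core", null));
        // The empty string also means "any version".
        System.out.println(impl.hasFeature("XML", ""));
    }
}
```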

Up to the DOM Level 2 modules, all interfaces that extended existing ones were accessible using binding-specific casting mechanisms if the feature associated with the extension was supported. For example, an instance of the EventTarget interface could be obtained from an instance of the Node interface if the feature "Events" was supported by the node.

As discussed in [Mixed DOM Implementations], DOM Level 3 Core should be able to collaborate with subcomponents implementing specific DOMs. To that end, the methods DOMImplementation.getFeature(feature, version) and Node.getFeature(feature, version) were introduced. In the case of DOMImplementation.hasFeature(feature, version) and Node.isSupported(feature, version), if a plus sign "+" is prepended to any feature name, implementations are considered in which the specified feature may not be directly castable but would require discovery through DOMImplementation.getFeature(feature, version) and Node.getFeature(feature, version). Without a plus, only features whose interfaces are directly castable are considered.

// example 1, without prepending the "+"
if (myNode.isSupported("Events", "3.0")) {
    EventTarget evt = (EventTarget) myNode;
    // ...
}
// example 2, with the "+"
if (myNode.isSupported("+Events", "3.0")) {
    // (the plus sign "+" is irrelevant for the getFeature method itself
    // and is ignored by this method anyway)
    EventTarget evt = (EventTarget) myNode.getFeature("Events", "3.0");
    // ...
}

Bootstrapping

Because previous versions of the DOM specification only defined a set of interfaces, applications had to rely on some implementation-dependent code to start from. However, hard-coding the application to a specific implementation prevents the application from running on other implementations and from using the most suitable implementation of the environment. At the same time, implementations may also need to load modules or perform other setup to efficiently adapt to different and sometimes mutually exclusive feature sets.

To solve these problems, this specification introduces a DOMImplementationRegistry object with a function that lets an application find implementations based on the specific features it requires. How this object is found and exactly what it looks like is not defined here, because this cannot be done in a language-independent manner. Instead, each language binding defines its own way of doing this. See [Java Language Binding] and [ECMAScript Language Binding] for specifics.

In all cases, though, the DOMImplementationRegistry provides a getDOMImplementation method accepting a features string, which is passed to every known DOMImplementationSource until a suitable DOMImplementation is found and returned. The DOMImplementationRegistry also provides a getDOMImplementationList method accepting a features string, which is passed to every known DOMImplementationSource, and returns a list of suitable DOMImplementations. Those two methods are the same as the ones found on the DOMImplementationSource interface.

Any number of DOMImplementationSource objects can be registered. A source may return one or more DOMImplementation singletons or construct new DOMImplementation objects, depending upon whether the requested features require specialized state in the DOMImplementation object.
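In the Java binding, the registry is the class org.w3c.dom.bootstrap.DOMImplementationRegistry. The sketch below shows a lookup by feature string. It assumes a runtime that registers at least one DOMImplementationSource supporting "Core 3.0", and it assumes the features string is a space-separated list of feature names and versions; the class name BootstrapSketch is hypothetical.

```java
import org.w3c.dom.DOMImplementation;
import org.w3c.dom.DOMImplementationList;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;

final class BootstrapSketch {
    static DOMImplementationRegistry registry() {
        try {
            return DOMImplementationRegistry.newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("No DOMImplementationSource could be loaded", e);
        }
    }

    // Returns the first suitable implementation, or null if none matches.
    static DOMImplementation find(String features) {
        return registry().getDOMImplementation(features);
    }

    // Returns every suitable implementation known to the registry.
    static DOMImplementationList findAll(String features) {
        return registry().getDOMImplementationList(features);
    }

    public static void main(String[] args) {
        DOMImplementation impl = find("Core 3.0");
        System.out.println(impl != null);
        System.out.println(findAll("XML 3.0").getLength());
    }
}
```

Because the application names only features, it stays independent of any particular implementation class, which is the portability goal described at the start of this section.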