1.3 General Considerations
String Comparisons in the DOM
The DOM has many interfaces that imply string matching. For
XML, string comparisons are case-sensitive and performed with a
binary comparison of
the 16-bit units of the
DOMStrings. However, for case-insensitive markup
languages, such as HTML 4.01 or earlier, these comparisons are
case-insensitive where appropriate.
Note that HTML processors often perform specific case
normalizations (canonicalization) of the markup before the DOM
structures are built. Typically this means uppercase for
element names and lowercase
for attribute names. For this reason, applications should also
compare element and attribute names returned by the DOM
implementation in a case-insensitive manner.
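As an illustration of the advice above, the following minimal sketch compares names taken from an HTML DOM without regard to case. The class and method names (HtmlNames, sameHtmlName) are hypothetical and not part of any DOM binding, and the sketch assumes an HTML document; for XML, names must be compared case-sensitively.

```java
// Hypothetical helper, not part of the DOM API: compares element or
// attribute names obtained from an HTML DOM without regard to case.
final class HtmlNames {
    static boolean sameHtmlName(String a, String b) {
        // String.equalsIgnoreCase does not depend on the default locale,
        // so the comparison behaves the same on every platform.
        return a != null && a.equalsIgnoreCase(b);
    }

    public static void main(String[] args) {
        System.out.println(sameHtmlName("DIV", "div"));  // true
        System.out.println(sameHtmlName("div", "span")); // false
    }
}
```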
Character normalization, i.e. transforming text into its fully normalized form as
defined in [XML11], is assumed to happen at
serialization time. The DOM Level 3 Load and Save module [DOMLS] provides a serialization
mechanism (see the DOMSerializer interface, section
2.3.1) and uses the DOMConfiguration parameters
"normalize-characters"
and "check-character-normalization"
to ensure that text is fully normalized [XML11]. Other serialization mechanisms built on top of
the DOM Level 3 Core must also ensure that text is
fully normalized.
DOM URIs
The DOM specification relies on DOMString values as
resource identifiers, such that the following conditions are
met:
- An absolute identifier absolutely identifies a resource on
  the Web;
- Simple string equality establishes equality of absolute
  resource identifiers, and no other equivalence of resource
  identifiers is considered significant to the DOM
  specification;
- A relative identifier is easily detected and made absolute
  relative to an absolute identifier;
- Retrieval of content of a resource may be accomplished where
  required.
The term "absolute URI" refers to a complete
resource identifier and the term "relative URI"
refers to an incomplete resource identifier.
Within the DOM specifications, these identifiers are called
URIs, "Uniform Resource Identifiers", but this is meant
abstractly. The DOM implementation does not necessarily process
its URIs according to the URI specification [URIRef]. Generally the particular
form of these identifiers must be ignored.
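The difference between simple string equality and a URI-aware comparison can be sketched as follows. The class and method names are hypothetical, the example.org identifiers are placeholders, and java.net.URI is used only for contrast; it is not something the DOM requires or consults.

```java
import java.net.URI;

final class DomUriEquality {
    // The DOM's notion of equality for absolute identifiers:
    // a plain, character-by-character string comparison.
    static boolean domEquals(String a, String b) {
        return a.equals(b);
    }

    // A URI-aware comparison, shown only for contrast. java.net.URI
    // ignores case in the scheme and host when testing equality.
    static boolean uriAwareEquals(String a, String b) {
        return URI.create(a).equals(URI.create(b));
    }

    public static void main(String[] args) {
        String a = "http://example.org/ns";
        String b = "HTTP://EXAMPLE.ORG/ns";
        System.out.println(domEquals(a, b));      // false: the strings differ
        System.out.println(uriAwareEquals(a, b)); // true: equal per java.net.URI
    }
}
```

Two identifiers that a URI library considers equivalent are therefore still distinct as far as the DOM is concerned.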
When it is not possible to completely ignore the type of a DOM URI,
either because a relative identifier must be made absolute or
because content must be retrieved, the DOM implementation must
at least support identifier types appropriate to the content
being processed. [HTML40],
[XML], and associated namespace
specification [Namespaces] rely
on [URIRef] to determine
permissible characters and to resolve relative URIs. Other
specifications such as namespaces in XML 1.1 [Namespaces11] may rely on alternative
resource identifier types that may, for example, include
non-ASCII characters, necessitating support for alternative
resource identifier types where required by applicable
specifications.
XML Namespaces
DOM Level 2 and 3 support XML namespaces [Namespaces] by augmenting several interfaces of the DOM
Level 1 Core to allow creating and manipulating elements and attributes associated with
a namespace. When [XML11] is in use (see
Document.xmlVersion), DOM Level 3 also supports
[Namespaces11].
As far as the DOM is concerned, special attributes used for declaring
XML namespaces are still
exposed and can be manipulated just like any other attribute. However,
nodes are permanently bound to namespace
URIs as they get created. Consequently, moving a node
within a document, using the DOM, in no case results in a change of its
namespace prefix or
namespace URI. Similarly, creating a node with a namespace prefix and
namespace URI, or changing the namespace prefix of a node, does not
result in any addition, removal, or modification of any special
attributes for declaring the appropriate XML namespaces. Namespace
validation is not enforced; the DOM application is responsible. In
particular, since the mapping between prefixes and namespace URIs is
not enforced, in general, the resulting document cannot be serialized
naively. For example, applications may have to declare every namespace
in use when serializing a document.
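The following sketch illustrates the point, using the Java binding and the JAXP DocumentBuilderFactory to obtain a Document. The class name, helper names, element name, and the urn:example:items namespace URI are all illustrative assumptions. Creating a prefixed element does not add an xmlns:p attribute; the application adds the declaration itself.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class NamespaceDeclarations {
    // Namespace URI to which namespace declaration attributes are bound.
    static final String XMLNS_NS = "http://www.w3.org/2000/xmlns/";

    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No DOM implementation available", e);
        }
    }

    // Creates <p:item> bound to the given (illustrative) namespace URI.
    static Element createItem(Document doc, String uri) {
        return doc.createElementNS(uri, "p:item");
    }

    // Adds the xmlns:prefix declaration that the DOM did not add on its own.
    static void declarePrefix(Element el) {
        el.setAttributeNS(XMLNS_NS, "xmlns:" + el.getPrefix(), el.getNamespaceURI());
    }

    public static void main(String[] args) {
        Element item = createItem(newDocument(), "urn:example:items");
        System.out.println(item.hasAttribute("xmlns:p")); // false
        declarePrefix(item);
        System.out.println(item.hasAttribute("xmlns:p")); // true
    }
}
```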
In general, the DOM implementation (and higher) doesn't perform any
URI normalization or canonicalization. The URIs given to the DOM are
assumed to be valid (e.g., characters such as white spaces are properly
escaped), and no lexical checking is performed. Absolute URI references
are treated as strings and compared
literally. How relative namespace URI references are
treated is undefined. To ensure interoperability only absolute
namespace URI references (i.e., URI references beginning with a scheme
name and a colon) should be used. Applications should use the
value null as the namespaceURI
parameter
for methods if they wish to have no namespace. In programming
languages where empty strings can be differentiated from null,
empty strings, when given as a namespace URI, are converted to
null.
This is
true even though the DOM does no lexical checking of URIs.
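An application that receives namespace URIs from elsewhere may prefer to make the "no namespace" case explicit before calling the DOM. The sketch below assumes the Java binding; the class and helper names are hypothetical. It maps an empty string to null itself, so that it does not depend on how a particular implementation performs the conversion.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class NamespaceArgs {
    // Hypothetical helper: treats null and "" alike as "no namespace".
    static String normalizeNamespace(String uri) {
        return (uri == null || uri.isEmpty()) ? null : uri;
    }

    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No DOM implementation available", e);
        }
    }

    // Creates an element, passing null rather than "" for "no namespace".
    static Element create(Document doc, String uri, String qualifiedName) {
        return doc.createElementNS(normalizeNamespace(uri), qualifiedName);
    }

    public static void main(String[] args) {
        Document doc = newDocument();
        System.out.println(create(doc, "", "item").getNamespaceURI());   // null
        System.out.println(create(doc, null, "item").getNamespaceURI()); // null
    }
}
```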
NOTE:
Element.setAttributeNS(null, ...) puts the attribute in
the per-element-type partitions as defined in
[XML Namespace
Partitions] in [Namespaces].
NOTE:
In the DOM, all namespace declaration attributes are by
definition bound to the namespace URI:
"http://www.w3.org/2000/xmlns/". These are the attributes
whose namespace prefix or
qualified name is
"xmlns" as introduced in [Namespaces11].
In a document with no namespaces, the
child list of an
EntityReference node is always the same as that of the
corresponding Entity. This is not true in a document where
an entity contains unbound namespace
prefixes. In such a case, the
descendants of the corresponding
EntityReference nodes may be bound to different
namespace URIs, depending on
where the entity references are. Also, because, in the DOM, nodes
always remain bound to the same namespace URI, moving such
EntityReference nodes can lead to documents that cannot be
serialized. This is also true when the DOM Level 1 method
Document.createEntityReference(name) is used to create
entity references that correspond to such
entities, since the descendants
of the returned EntityReference are unbound. While DOM Level
3 does have support for the resolution of namespace prefixes,
such entities and entity references should be
avoided or used with extreme care.
The "NS" methods, such as
Document.createElementNS(namespaceURI, qualifiedName) and
Document.createAttributeNS(namespaceURI, qualifiedName),
are meant to be used by namespace aware applications. Simple
applications that do not use namespaces can use the DOM Level 1
methods, such as Document.createElement(tagName) and
Document.createAttribute(name). Elements and attributes created in this
way do not have any namespace prefix, namespace URI, or local name.
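The difference between the two families of creation methods can be observed directly. This is a minimal sketch using the Java binding; the class name, element name, and namespace URI are illustrative assumptions.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class CreationMethods {
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No DOM implementation available", e);
        }
    }

    public static void main(String[] args) {
        Document doc = newDocument();
        // DOM Level 1 method: the colon is just another name character.
        Element level1 = doc.createElement("p:item");
        // Namespace-aware method: the qualified name is split into prefix and local name.
        Element level2 = doc.createElementNS("urn:example:items", "p:item");

        System.out.println(level1.getNodeName());     // p:item
        System.out.println(level1.getLocalName());    // null
        System.out.println(level1.getNamespaceURI()); // null
        System.out.println(level2.getNodeName());     // p:item
        System.out.println(level2.getLocalName());    // item
        System.out.println(level2.getNamespaceURI()); // urn:example:items
    }
}
```

Both elements report the same nodeName, which is why code that inspects only Node.nodeName cannot tell them apart.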
NOTE:
DOM Level 1 methods are namespace ignorant. Therefore, while it is
safe to use these methods when not dealing with namespaces, using
them and the new ones at the same time should be avoided. DOM Level 1
methods solely identify attribute nodes by their
Node.nodeName. In contrast, the DOM Level 2 methods
related to namespaces identify attribute nodes by their
Node.namespaceURI and Node.localName. Because of this
fundamental difference, mixing both sets of methods can lead to
unpredictable results. In particular, using
Element.setAttributeNS(namespaceURI, qualifiedName, value), an
element may have two attributes
(or more) that have the same Node.nodeName, but different
Node.namespaceURIs. Calling Element.getAttribute(name) with
that nodeName could then return any of those
attributes. The result depends on the implementation. Similarly,
using Element.setAttributeNode(newAttr), one can set two attributes (or
more) that have different Node.nodeNames but the same
Node.localName and Node.namespaceURI. In this case
Element.getAttributeNodeNS(namespaceURI, localName) will return either attribute, in an
implementation dependent manner. The only guarantee in such cases is
that all methods that access a named item by its
nodeName will access the same item, and all methods
which access a node by its URI and local name will access the same
node. For instance, Element.setAttribute(name, value) and
Element.setAttributeNS(namespaceURI, qualifiedName, value) affect the node that
Element.getAttribute(name) and
Element.getAttributeNS(namespaceURI, localName),
respectively, return.
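The ambiguity described in this note can be reproduced with a short sketch. The class name, attribute names, and urn:example namespace URIs are illustrative assumptions. Two attributes share the nodeName p:id but live in different namespaces; the namespace-aware accessors distinguish them, while the result of the DOM Level 1 accessor depends on the implementation.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class MixedAttributeAccess {
    // Builds an element carrying two attributes with the same nodeName
    // ("p:id") but different namespace URIs.
    static Element sample() {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No DOM implementation available", e);
        }
        Element el = doc.createElementNS(null, "item");
        el.setAttributeNS("urn:example:a", "p:id", "from-a");
        el.setAttributeNS("urn:example:b", "p:id", "from-b");
        return el;
    }

    public static void main(String[] args) {
        Element el = sample();
        // Namespace-aware lookups are unambiguous.
        System.out.println(el.getAttributeNS("urn:example:a", "id")); // from-a
        System.out.println(el.getAttributeNS("urn:example:b", "id")); // from-b
        // Lookup by nodeName may yield either value, depending on the implementation.
        System.out.println(el.getAttribute("p:id"));
    }
}
```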
Base URIs
DOM Level 3 adds support for the [base URI] property
defined in
[InfoSet] by providing a new attribute on the
Node interface that exposes this information. However,
unlike the Node.namespaceURI attribute, the
Node.baseURI attribute is not a static piece of information
that every node carries. Instead, it is a value that is dynamically
computed according to [XMLBase]. This means its value
depends on the location of the node in the tree and moving the node
from one place to another in the tree may affect its value. Other
changes, such as adding or changing an xml:base attribute on the node
being queried or one of its ancestors may also affect its value.
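To make the position dependence visible, the sketch below uses a deliberately simplified stand-in for the computation: a hypothetical helper, nearestXmlBase, that returns the closest xml:base value on a node or its ancestors. It assumes every xml:base value is already absolute and ignores the document URI and relative resolution, so it is not a substitute for Node.baseURI (getBaseURI() in the Java binding). All names and URIs are illustrative.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

final class BaseUriLookup {
    static final String XML_NS = "http://www.w3.org/XML/1998/namespace";

    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No DOM implementation available", e);
        }
    }

    // Simplified stand-in for the base URI computation: returns the nearest
    // xml:base value on the node or its ancestors, or null if there is none.
    static String nearestXmlBase(Node node) {
        for (Node n = node; n != null; n = n.getParentNode()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                String base = ((Element) n).getAttributeNS(XML_NS, "base");
                if (!base.isEmpty()) {
                    return base;
                }
            }
        }
        return null;
    }

    // Creates an element carrying an xml:base attribute.
    static Element withBase(Document doc, String name, String base) {
        Element el = doc.createElementNS(null, name);
        el.setAttributeNS(XML_NS, "xml:base", base);
        return el;
    }

    public static void main(String[] args) {
        Document doc = newDocument();
        Element root = doc.createElementNS(null, "root");
        doc.appendChild(root);
        Element a = withBase(doc, "a", "http://example.org/a/");
        Element b = withBase(doc, "b", "http://example.org/b/");
        root.appendChild(a);
        root.appendChild(b);
        Element leaf = doc.createElementNS(null, "leaf");

        a.appendChild(leaf);
        System.out.println(nearestXmlBase(leaf)); // http://example.org/a/
        b.appendChild(leaf); // moves the node; its own data is unchanged
        System.out.println(nearestXmlBase(leaf)); // http://example.org/b/
    }
}
```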
One consequence of this is that when external entity references are
expanded while building a Document one may need to add, or
change, an xml:base attribute to the
Element nodes originally contained in the entity being
expanded so that the Node.baseURI returns the correct value. In
the case of ProcessingInstruction nodes originally
contained in the entity being expanded the information is lost.
[DOMLS] handles elements as described
here and generates a warning in the latter case.
Mixed DOM Implementations
As new XML vocabularies are developed, those defining the vocabularies
are also beginning to define specialized APIs for manipulating XML
instances of those vocabularies. This is usually done by extending the
DOM to provide interfaces and methods that perform operations
frequently needed by their users. For example, the MathML [MathML2] and SVG
[SVG1] specifications have developed DOM extensions to allow users to
manipulate instances of these vocabularies using semantics appropriate
to mathematics and images, respectively, as well as the generic DOM XML
semantics. Instances of SVG or MathML are often embedded in XML
documents conforming to a different schema such as XHTML.
While the Namespaces in XML specification [Namespaces] provides a mechanism for integrating these
documents at the syntax level, it has become clear that the DOM
Level 2 Recommendation [DOM2Core] is not rich enough to cover all the issues that
have been encountered in having these different DOM
implementations be used together in a single application. DOM
Level 3 deals with the requirements brought about by embedding
fragments written according to a specific markup language (the
embedded component) in a document where the rest of the markup
is not written according to that specific markup language (the
host document). It does not deal with fragments embedded by
reference or linking.
A DOM implementation supporting DOM Level 3 Core should be able to
collaborate with subcomponents implementing specific DOMs to assemble a
compound document that can be traversed and manipulated via DOM
interfaces as if it were a seamless whole.
The normal typecast operation on an object should support the
interfaces expected by legacy code for a given document type.
Typecasting techniques may not be adequate for selecting between
multiple DOM specializations of an object which were combined at run
time, because they may not all be part of the same object as defined by
the binding's object model. Conflicts are most obvious with the
Document object, since it is shared as owner by the rest
of the document. In a homogeneous document, elements rely on the
Document for specialized services and construction of specialized
nodes. In a heterogeneous document, elements from different modules
expect different services and APIs from the same Document
object, since there can only be one owner and root of the document
hierarchy.
DOM Features
Each DOM module defines one or more features, as listed in the
conformance section ([Conformance]). Features
are case-insensitive and are also defined for a specific set of
versions. For example, this specification defines the features
"Core" and "XML", for the
version "3.0". Versions "1.0" and
"2.0" can also be used for features defined in the corresponding DOM
Levels. To avoid possible conflicts, as a convention, names
referring to features defined outside the DOM specification
should be made unique. Applications could then request
features to be supported by a DOM implementation using the
methods
DOMImplementationSource.getDOMImplementation(features)
or
DOMImplementationSource.getDOMImplementationList(features),
check the features supported by a DOM implementation using the
method DOMImplementation.hasFeature(feature, version), or by a specific node using
Node.isSupported(feature, version). Note that when
using the methods that take a feature and a version as
parameters, applications can use null or empty
string for the version parameter if they don't wish to specify a
particular version for the specified feature.
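A feature check in the Java binding might look like the following sketch. It assumes the DOMImplementation is obtained through JAXP's DocumentBuilder.getDOMImplementation(); the class name is hypothetical, and the values printed depend on the implementation in use.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.DOMImplementation;

final class FeatureChecks {
    // Obtains the DOMImplementation of the default JAXP parser.
    static DOMImplementation defaultImplementation() {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .getDOMImplementation();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No DOM implementation available", e);
        }
    }

    public static void main(String[] args) {
        DOMImplementation impl = defaultImplementation();
        // A specific version of a feature.
        System.out.println(impl.hasFeature("Core", "3.0"));
        // Feature names are case-insensitive; null means "any version".
        System.out.println(impl.hasFeature("core", null));
        // An empty string also means "any version".
        System.out.println(impl.hasFeature("XML", ""));
    }
}
```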
Up to the DOM Level 2 modules, all interfaces that extended
existing ones were accessible using
binding-specific casting mechanisms if the feature associated with
the extension was supported. For example, an instance of the
EventTarget interface could be obtained from an
instance of the Node interface if the feature
"Events" was supported by the node.
As discussed in [Mixed DOM Implementations], DOM Level 3 Core
should be able to collaborate with subcomponents implementing
specific DOMs. To that end, the methods
DOMImplementation.getFeature(feature, version) and
Node.getFeature(feature, version) were
introduced. In the case of
DOMImplementation.hasFeature(feature, version) and
Node.isSupported(feature, version), if a plus sign
"+" is prepended to any feature name, implementations are
considered in which the specified feature may not be directly
castable but would require discovery through
DOMImplementation.getFeature(feature, version) and
Node.getFeature(feature, version). Without a plus,
only features whose interfaces are directly castable are
considered.
// example 1, without prepending the "+"
if (myNode.isSupported("Events", "3.0")) {
    EventTarget evt = (EventTarget) myNode;
    // ...
}
// example 2, with the "+"
if (myNode.isSupported("+Events", "3.0")) {
    // (the plus sign "+" is irrelevant for the getFeature method itself
    // and is ignored by this method anyway)
    EventTarget evt = (EventTarget) myNode.getFeature("Events", "3.0");
    // ...
}
Bootstrapping
Because previous versions of the DOM specification only defined a set
of interfaces, applications had to rely on some implementation
dependent code to start from. However, hard-coding the application to a
specific implementation prevents the application from running on other
implementations and from using the most-suitable implementation of the
environment. At the same time, implementations may also need to load
modules or perform other setup to efficiently adapt to different and
sometimes mutually-exclusive feature sets.
To solve these problems this specification introduces a
DOMImplementationRegistry object with a function that lets
an application find implementations, based on the specific features
it requires. How this object is found and what it exactly looks like is
not defined here, because this cannot be done in a language-independent
manner. Instead, each language binding defines its own way of doing
this. See [Java Language Binding] and
[ECMAScript Language Binding] for specifics.
In all cases, though, the DOMImplementationRegistry
provides a getDOMImplementation method accepting a
features string, which is passed to every known
DOMImplementationSource until a suitable
DOMImplementation is found and returned.
The DOMImplementationRegistry
also provides a getDOMImplementationList method accepting a
features string, which is passed to every known
DOMImplementationSource, and returns a list of suitable
DOMImplementations. Those two methods are
the same as the ones found on the DOMImplementationSource
interface.
Any number of DOMImplementationSource objects can be
registered. A source may return one or more
DOMImplementation singletons or construct new
DOMImplementation objects, depending upon whether the
requested features require specialized state in the
DOMImplementation object.
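In the Java binding, the registry is exposed as org.w3c.dom.bootstrap.DOMImplementationRegistry. The following is a minimal sketch of bootstrapping through it; the class name Bootstrap and the helper find are hypothetical, and whether a given features string can be satisfied depends on the sources registered in the environment.

```java
import org.w3c.dom.DOMImplementation;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;

final class Bootstrap {
    // Asks the registry for an implementation supporting the given features,
    // e.g. "XML 3.0". Returns null if no registered source can satisfy them.
    static DOMImplementation find(String features) {
        try {
            DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
            return registry.getDOMImplementation(features);
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalStateException("Cannot create the DOM implementation registry", e);
        }
    }

    public static void main(String[] args) {
        DOMImplementation impl = find("XML 3.0");
        if (impl == null) {
            System.out.println("No implementation supports the requested features");
        } else {
            System.out.println("Found: " + impl.getClass().getName());
        }
    }
}
```

Because the application names only the features it needs, the same code can run against any implementation that registers a suitable DOMImplementationSource.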