1.5 Extended Interfaces: XML Module
The interfaces defined here form part of the DOM Core specification, but
objects that expose these interfaces will never be encountered in a DOM
implementation that deals only with HTML.
The interfaces found within this section are not mandatory. A DOM
application may use the
DOMImplementation.hasFeature(feature, version) method
with parameter values "XML" and "3.0" (respectively) to determine
whether or not this module is supported by the implementation. In
order to fully support this module, an implementation must also
support the "Core" feature defined in [Fundamental Interfaces: Core Module]
and the feature "XMLVersion" with version "1.0" defined in
Document.xmlVersion. Please refer to additional
information about [Conformance] in this
specification. The DOM Level 3 XML module is backward compatible
with the DOM Level 2 XML [DOM2Core] and DOM Level 1 XML [DOM-Level-1] modules, i.e. a DOM Level 3 XML implementation
that returns true for "XML" with the
version number "3.0" must also return
true for this feature when the
version number is "2.0",
"1.0", "", or null.
CDATA sections are used to escape blocks of text containing characters
that would otherwise be regarded as markup. The only delimiter that is
recognized in a CDATA section is the "]]>" string that ends the CDATA
section. CDATA sections cannot be nested. Their primary purpose is for
including material such as XML fragments, without needing to escape all
the delimiters.
The CharacterData.data attribute
holds the text that is contained by the CDATA
section. Note that this may contain characters that need to
be escaped outside of CDATA sections and that, depending on the character
encoding ("charset") chosen for serialization, it may be impossible to
write out some characters as part of a CDATA section.
The CDATASection interface inherits from the
CharacterData interface through the Text
interface. Adjacent CDATASection nodes are not merged by use
of the normalize method of the Node
interface.
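As an illustration of the normalize behaviour described above, the following sketch builds a small tree with Python's xml.dom.minidom (used here as an example implementation, not as a reference for DOM Level 3) and normalizes it. The element name "root" and the node contents are arbitrary.

```python
from xml.dom import Node
from xml.dom.minidom import Text, getDOMImplementation

doc = getDOMImplementation().createDocument(None, "root", None)
root = doc.documentElement

# Two adjacent Text nodes followed by two adjacent CDATASection nodes.
root.appendChild(doc.createTextNode("a"))
root.appendChild(doc.createTextNode("b"))
root.appendChild(doc.createCDATASection("c"))
root.appendChild(doc.createCDATASection("d"))

root.normalize()

# Adjacent Text nodes are merged; the CDATASection nodes are left alone.
kinds = [(child.nodeType, child.data) for child in root.childNodes]
print(kinds)

# In this implementation, CDATASection is a subclass of Text.
print(isinstance(root.lastChild, Text))
```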
No lexical check is done on the content of a CDATA section and it
is therefore possible to have the character sequence
"]]>" in the content, which is illegal in a CDATA
section per section 2.7 of [XML]. The presence of
this character sequence must generate a fatal error during
serialization or the CDATA section must be split before the
serialization (see also the parameter
"split-cdata-sections" in the
DOMConfiguration interface).
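The two serialization options above can be sketched as follows. This is a minimal illustration, not the behaviour of any particular serializer: split_cdata, serialize_cdata, round_trip, and the split_sections flag are hypothetical names, with split_sections standing in for the "split-cdata-sections" parameter. The split places "]]" at the end of one section and ">" at the start of the next, so that no section contains the terminator.

```python
from xml.dom.minidom import parseString


def split_cdata(data):
    """Split data into chunks, none of which contains "]]>"."""
    pieces = data.split("]]>")
    last = len(pieces) - 1
    chunks = []
    for i, piece in enumerate(pieces):
        if i > 0:
            piece = ">" + piece      # second half of a split terminator
        if i < last:
            piece = piece + "]]"     # first half of a split terminator
        chunks.append(piece)
    return chunks


def serialize_cdata(data, split_sections=True):
    """Serialize data as one or more CDATA sections, or fail."""
    if "]]>" in data and not split_sections:
        raise ValueError("']]>' is not allowed in a CDATA section")
    return "".join("<![CDATA[%s]]>" % chunk for chunk in split_cdata(data))


def round_trip(data):
    """Serialize data, parse it back, and return the recovered text."""
    doc = parseString("<r>" + serialize_cdata(data) + "</r>")
    return "".join(node.data for node in doc.documentElement.childNodes)


print(serialize_cdata("a]]>b"))
```

Parsing the result yields several adjacent CDATASection nodes whose concatenated data equals the original string.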
NOTE:
Because no markup is recognized within a CDATASection,
character numeric references cannot be used as an escape mechanism
when serializing. Therefore, action needs to be taken when serializing
a CDATASection with a character encoding where some of
the contained characters cannot be represented. Failure to do so would
not produce well-formed XML.
One potential solution in the serialization process is to end the
CDATA section before the character, output the character using a
character reference or entity reference, and open a new CDATA section
for any further characters in the text node. Note, however, that some
code conversion libraries at the time of writing do not return an
error or exception when a character is missing from the encoding,
making the task of ensuring that data is not corrupted on serialization
more difficult.
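The potential solution described in this note can be sketched as follows. The function name serialize_cdata_for_encoding is hypothetical, and the sketch assumes that the data does not contain "]]>" (which would need separate handling). It relies on Python's strict encoder raising UnicodeEncodeError for a missing character, which, as the note points out, not every conversion library does.

```python
def serialize_cdata_for_encoding(data, encoding):
    """Serialize data as CDATA, escaping characters the encoding lacks.

    Assumption: data does not contain the sequence "]]>".
    """
    out = []   # finished pieces of output
    run = []   # characters of the CDATA section currently being built

    def flush():
        # Close the current CDATA section, if it has any content.
        if run:
            out.append("<![CDATA[%s]]>" % "".join(run))
            run.clear()

    for ch in data:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            # End the section, emit a character reference outside it,
            # and let the next representable character open a new one.
            flush()
            out.append("&#x%X;" % ord(ch))
        else:
            run.append(ch)
    flush()
    return "".join(out)


print(serialize_cdata_for_encoding("a\u00e9b", "ascii"))
```

The output can always be encoded in the target encoding, at the cost of the reparsed document containing a mixture of Text and CDATASection nodes.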
Each Document has a doctype attribute whose
value is either null or a DocumentType
object. The DocumentType interface in the DOM Core provides
an interface to the list of entities that are defined for the document,
and little else because the effects of namespaces and the various XML
schema efforts on DTD representation are not clearly understood as of
this writing.
DOM Level 3 doesn't support editing DocumentType
nodes. DocumentType nodes are read-only.
The name of the DTD; i.e., the name immediately following the
DOCTYPE keyword.
A NamedNodeMap containing the general entities, both
external and internal, declared in the DTD. Parameter entities are not
contained. Duplicates are discarded. For example, in:
<!DOCTYPE ex SYSTEM "ex.dtd" [
<!ENTITY foo "foo">
<!ENTITY bar "bar">
<!ENTITY bar "bar2">
<!ENTITY % baz "baz">
]>
<ex/>
the interface provides access to foo and the first
declaration of bar but not the second declaration of
bar or baz. Every node in this map also
implements the Entity interface.
The DOM Level 2 does not support editing entities; therefore,
entities cannot be altered in any way.
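The example document above can be inspected with Python's xml.dom.minidom, as a sketch of how an application might read these attributes. The results depend on the implementation and its underlying parser; minidom does not fetch the external subset "ex.dtd", and storing an internal entity's replacement text as Text children of the Entity node is how this implementation happens to expose it. The helper replacement_text is a hypothetical name.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

SOURCE = """<!DOCTYPE ex SYSTEM "ex.dtd" [
<!ENTITY foo "foo">
<!ENTITY bar "bar">
<!ENTITY bar "bar2">
<!ENTITY % baz "baz">
]>
<ex/>"""


def replacement_text(entity):
    """Concatenate the Text children of an Entity node."""
    return "".join(child.data for child in entity.childNodes
                   if child.nodeType == Node.TEXT_NODE)


doc = parseString(SOURCE)
doctype = doc.doctype

print("name:", doctype.name)
print("publicId:", doctype.publicId)
print("systemId:", doctype.systemId)

# Look up each declared name; parameter entities are not in the map.
for name in ("foo", "bar", "baz"):
    entity = doctype.entities.getNamedItem(name)
    if entity is None:
        print(name, "-> not present")
    else:
        print(name, "->", repr(replacement_text(entity)))
```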
A NamedNodeMap containing the notations declared in the
DTD. Duplicates are discarded. Every node in this map also implements
the Notation interface.
The DOM Level 2 does not support editing notations; therefore,
notations cannot be altered in any way.
The public identifier of the external subset.
The system identifier of the external subset. This may be an absolute
URI or not.
The internal subset as a string, or null if there is
none. This does not contain the delimiting square brackets.
NOTE:
The actual content returned depends on how much information is
available to the implementation. This may vary depending on various
parameters, including the XML processor used to build the
document.
This interface represents a notation declared in the DTD. A notation
either declares, by name, the format of an unparsed entity (see [section 4.7]
of the XML 1.0 specification [XML]), or is used for formal
declaration of
processing instruction targets (see [section 2.6] of the XML 1.0
specification [XML]). The nodeName attribute
inherited from
Node is set to the declared name of the notation.
The DOM Core does not support editing Notation
nodes; they are therefore
readonly.
A Notation node does not have any parent.
The public identifier of this notation. If the
public identifier was not specified, this is null.
The system identifier of this notation. If the system identifier
was not specified, this is null. This may be an absolute
URI or not.
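A short sketch of reading Notation nodes, again using Python's xml.dom.minidom as an example implementation. The notation names and identifiers in the sample document are invented for illustration.

```python
from xml.dom.minidom import parseString

SOURCE = """<!DOCTYPE doc [
<!NOTATION gif SYSTEM "image/gif">
<!NOTATION png PUBLIC "-//EXAMPLE//NOTATION PNG//EN">
]>
<doc/>"""

doc = parseString(SOURCE)
notations = doc.doctype.notations

gif = notations.getNamedItem("gif")   # declared with a system identifier
png = notations.getNamedItem("png")   # declared with a public identifier

for notation in (gif, png):
    print(notation.nodeName, notation.publicId, notation.systemId,
          notation.parentNode)
```

An identifier that was not specified in the declaration is reported as None, the Python counterpart of null.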
This interface represents a known entity, either parsed or unparsed, in an
XML document. Note that this models the entity itself, not
the entity declaration.
The nodeName attribute that is inherited from
Node contains the name of the entity.
An XML processor may choose to completely expand entities before the
structure model is passed to the DOM; in this case there will be no
EntityReference nodes in the document tree.
XML does not mandate that a non-validating XML processor read and
process entity declarations made in the external subset or declared in
parameter entities. This means that parsed entities declared in
the external subset need not be expanded by some classes of applications,
and that the replacement text of the entity may not be available. When the
[replacement text] is
available, the corresponding Entity node's child list
represents the structure of that replacement value. Otherwise, the child
list is empty.
DOM Level 3 does not support editing Entity nodes; if a
user wants to make changes to the contents of an Entity,
every related EntityReference node has to be replaced in the
structure model by a clone of the Entity's contents, and
then the desired changes must be made to each of those clones
instead. Entity nodes and all their
descendants are
readonly.
An Entity node does not have any parent.
NOTE:
If the entity contains an unbound
namespace prefix, the
namespaceURI of the corresponding node in the
Entity node subtree is null. The same is
true for EntityReference nodes that refer to this entity,
when they are created using the createEntityReference
method of the Document interface.
The public identifier associated with the entity if specified, and
null otherwise.
The system identifier associated with the entity if specified, and
null otherwise. This may be an absolute URI or not.
For unparsed entities, the name of the notation for the entity. For
parsed entities, this is null.
An attribute specifying the encoding used for this entity at
the time of parsing, when it is
an external parsed entity. This is null if it is an
entity from the internal subset or if it is not known.
An attribute specifying, as part of the text declaration, the encoding
of this entity, when it is an external parsed entity. This is
null otherwise.
An attribute specifying, as part of the text declaration, the version
number of this entity, when it is an external parsed entity. This is
null otherwise.
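The identifier and notation attributes of Entity nodes can be read as in the sketch below, which uses Python's xml.dom.minidom as an example implementation. The entity names and the helper describe are invented for illustration. The encoding and version attributes are left out because this implementation does not expose them under the DOM Level 3 names.

```python
from xml.dom.minidom import parseString

SOURCE = """<!DOCTYPE doc [
<!NOTATION gif SYSTEM "image/gif">
<!ENTITY greeting "hello">
<!ENTITY logo SYSTEM "logo.gif" NDATA gif>
]>
<doc/>"""


def describe(entity):
    """Summarize an Entity node as a plain dictionary."""
    return {
        "name": entity.nodeName,
        "publicId": entity.publicId,
        "systemId": entity.systemId,
        "notationName": entity.notationName,
        # Parsed entities have no notation name.
        "parsed": entity.notationName is None,
    }


entities = parseString(SOURCE).doctype.entities
greeting = describe(entities.getNamedItem("greeting"))  # internal, parsed
logo = describe(entities.getNamedItem("logo"))          # external, unparsed

print(greeting)
print(logo)
```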
EntityReference nodes may be used to represent an entity
reference in the tree. Note that character references
and references to predefined entities are considered to be expanded by
the HTML or XML processor so that characters are represented by their
Unicode equivalent rather than by an entity reference. Moreover, the XML
processor may completely expand references to entities while building the
Document, instead of providing EntityReference
nodes. If it does provide such nodes, then for an
EntityReference node that represents a reference to a known
entity an Entity exists, and the subtree of the
EntityReference node is a copy of the
Entity node subtree. However, the latter may not be true
when an entity contains an unbound namespace prefix. In such a case, because the namespace prefix
resolution depends on where the entity reference is, the
descendants of the
EntityReference node may be bound to different
namespace URIs. When an
EntityReference node represents a reference to an unknown
entity, the node has no children and its
replacement value, when used by Attr.value for example,
is empty.
As for Entity nodes, EntityReference nodes and
all their descendants are
readonly.
NOTE:
EntityReference nodes may cause element content and
attribute value normalization problems when, such as in XML 1.0 and
XML Schema, the normalization is performed after entity references
are expanded.
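The case in which the processor expands all references while building the Document can be observed with Python's xml.dom.minidom, whose underlying parser expands internal entities, character references, and predefined entities. This is a sketch of one implementation's behaviour; other implementations may provide EntityReference nodes instead. The helper walk is a hypothetical name.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

SOURCE = """<!DOCTYPE ex [
<!ENTITY foo "bar">
]>
<ex>a&foo;b &amp; &#x63;</ex>"""


def walk(node):
    """Yield node and all of its descendants in document order."""
    yield node
    for child in node.childNodes:
        yield from walk(child)


doc = parseString(SOURCE)
nodes = list(walk(doc.documentElement))

# All references were replaced by their Unicode characters.
text = "".join(n.data for n in nodes if n.nodeType == Node.TEXT_NODE)
has_entity_refs = any(n.nodeType == Node.ENTITY_REFERENCE_NODE for n in nodes)

print(repr(text))
print(has_entity_refs)
```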
The ProcessingInstruction interface represents a
"processing instruction", used in XML as a way to keep
processor-specific information in the text of the document.
No lexical check is done on the content of a processing
instruction and it is therefore possible to have the character
sequence "?>" in the content, which is illegal in a
processing instruction per section 2.6 of [XML]. The
presence of this character sequence must generate a fatal error
during serialization.
The target of this processing instruction. XML defines this as being
the first token following the markup
that begins the processing instruction.
The content of this processing instruction. This is from the first
non-white-space character after the target to the character immediately
preceding the ?>.
NO_MODIFICATION_ALLOWED_ERR: Raised when the node is readonly.
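The target and content of a processing instruction, and the serialization check described above, can be sketched as follows with Python's xml.dom.minidom. The function check_pi_data is a hypothetical helper that a serializer could call before writing. It is not part of minidom, and the sample stylesheet instruction is invented for illustration.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

doc = parseString('<?xml-stylesheet href="a.css" type="text/css"?><root/>')
pi = doc.firstChild

print(pi.nodeType == Node.PROCESSING_INSTRUCTION_NODE)
print(pi.target)   # first token after the opening markup
print(pi.data)     # everything after the target, up to the closing markup


def check_pi_data(data):
    """Raise an error if data cannot be serialized inside a PI."""
    if "?>" in data:
        raise ValueError("'?>' is not allowed in a processing instruction")
    return data


# No lexical check is applied on creation, so the serializer must check.
bad = doc.createProcessingInstruction("target", "a ?> b")
try:
    check_pi_data(bad.data)
    serializable = True
except ValueError:
    serializable = False
print(serializable)
```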