E.1 Introduction

This appendix is an informative, not a normative, part of the Level 3 DOM specification.

Characters are represented in Unicode by numbers called code points (also called scalar values). These numbers can range from 0 up to 1,114,111 = 10FFFF₁₆ (although some of these values are illegal, for example the surrogate range D800₁₆ to DFFF₁₆). Each code point can be directly encoded with a 32-bit code unit. This encoding is termed UCS-4 (or UTF-32). The DOM specification, however, uses UTF-16, in which the most frequent characters (which have values of FFFF₁₆ or less) are represented by a single 16-bit code unit, while characters above FFFF₁₆ use a special pair of code units called a surrogate pair. For more information, see [Unicode] or the Unicode Web site.
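To make the relationship between code points and UTF-16 code units concrete, the following Python sketch computes the code units for a single code point. The function name `to_utf16_code_units` is hypothetical and is not part of the DOM specification; the arithmetic follows the standard UTF-16 surrogate pair construction.

```python
from typing import List


def to_utf16_code_units(code_point: int) -> List[int]:
    """Return the one or two UTF-16 code units for a Unicode code point.

    Hypothetical helper for illustration only; not a DOM API.
    """
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("code point outside the range 0..10FFFF")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate values do not represent characters")
    if code_point <= 0xFFFF:
        # Values up to FFFF fit in a single 16-bit code unit.
        return [code_point]
    # Values above FFFF are split into a surrogate pair:
    # subtract 10000 (hex), then spread the remaining 20 bits over two units.
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits
    return [high, low]
```

For example, `to_utf16_code_units(0x1D11E)` yields `[0xD834, 0xDD1E]`, the surrogate pair for U+1D11E, while `to_utf16_code_units(0x41)` yields the single code unit `[0x41]`.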

While indexing by code points as opposed to code units is not common in programs, some specifications such as [XPath10] (and therefore XSLT and [XPointer]) use code point indices. For interfacing with such formats it is recommended that the programming language provide string processing methods for converting code point indices to code unit indices and back. Some languages do not provide these functions natively; for these it is recommended that the native String type that is bound to DOMString be extended to enable this conversion. An example of how such an API might look is supplied below.

NOTE:
Since these methods are supplied as an illustrative example of the type of functionality that is required, the names of the methods, exceptions, and interface may differ from those given here.
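As a rough illustration of the conversion described above, the following Python sketch maps between code point indices and UTF-16 code unit indices. It is only a sketch under stated assumptions: the function names are hypothetical, the input is assumed to be a well-formed Python string (which is indexed by code point), and the code unit positions are computed as they would appear in a UTF-16 DOMString.

```python
def utf16_length(ch: str) -> int:
    """Number of UTF-16 code units needed for one character."""
    return 2 if ord(ch) > 0xFFFF else 1


def code_point_to_code_unit_index(text: str, cp_index: int) -> int:
    """Convert a code point index into the matching UTF-16 code unit index."""
    if not 0 <= cp_index <= len(text):
        raise IndexError("code point index out of range")
    return sum(utf16_length(ch) for ch in text[:cp_index])


def code_unit_to_code_point_index(text: str, cu_index: int) -> int:
    """Convert a UTF-16 code unit index into the matching code point index."""
    if cu_index < 0:
        raise IndexError("code unit index out of range")
    units = 0
    for cp_index, ch in enumerate(text):
        if units == cu_index:
            return cp_index
        units += utf16_length(ch)
        if units > cu_index:
            # The index points at the second half of a surrogate pair.
            raise ValueError("code unit index falls inside a surrogate pair")
    if units == cu_index:
        return len(text)  # index just past the last character
    raise IndexError("code unit index out of range")
```

With `text = "a\U0001D11Eb"`, the character `b` sits at code point index 2 but at code unit index 3, because U+1D11E occupies two code units. This sketch rejects an index that lands in the middle of a surrogate pair; an actual binding might round such an index instead.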
