2.3 Simple Types

The purchase order schema declares several elements and attributes that have simple types. Some of these simple types, such as string and decimal, are built in to XML Schema, while others are derived from the built-in types. For example, the partNum attribute has a type called SKU (Stock Keeping Unit) that is derived from string. Both built-in simple types and their derivations can be used in all element and attribute declarations. Table 2 lists all the simple types built in to XML Schema, along with examples of the different types.

Table 2. Simple Types Built In to XML Schema

| Simple Type | Examples (delimited by commas) | Notes |
|---|---|---|
| string | Confirm this is electric | |
| normalizedString | Confirm this is electric | see (3) |
| token | Confirm this is electric | see (4) |
| base64Binary | GpM7 | |
| hexBinary | 0FB7 | |
| integer | ... -1, 0, 1, ... | see (2) |
| positiveInteger | 1, 2, ... | see (2) |
| negativeInteger | ... -2, -1 | see (2) |
| nonNegativeInteger | 0, 1, 2, ... | see (2) |
| nonPositiveInteger | ... -2, -1, 0 | see (2) |
| long | -9223372036854775808, ... -1, 0, 1, ... 9223372036854775807 | see (2) |
| unsignedLong | 0, 1, ... 18446744073709551615 | see (2) |
| int | -2147483648, ... -1, 0, 1, ... 2147483647 | see (2) |
| unsignedInt | 0, 1, ... 4294967295 | see (2) |
| short | -32768, ... -1, 0, 1, ... 32767 | see (2) |
| unsignedShort | 0, 1, ... 65535 | see (2) |
| byte | -128, ... -1, 0, 1, ... 127 | see (2) |
| unsignedByte | 0, 1, ... 255 | see (2) |
| decimal | -1.23, 0, 123.4, 1000.00 | see (2) |
| float | -INF, -1E4, -0, 0, 12.78E-2, 12, INF, NaN | equivalent to single-precision 32-bit floating point, NaN is "not a number", see (2) |
| double | -INF, -1E4, -0, 0, 12.78E-2, 12, INF, NaN | equivalent to double-precision 64-bit floating point, see (2) |
| boolean | true, false, 1, 0 | |
| duration | P1Y2M3DT10H30M12.3S | 1 year, 2 months, 3 days, 10 hours, 30 minutes, and 12.3 seconds |
| dateTime | 1999-05-31T13:20:00.000-05:00 | May 31st 1999 at 1.20pm Eastern Standard Time which is 5 hours behind Co-Ordinated Universal Time, see (2) |
| date | 1999-05-31 | see (2) |
| time | 13:20:00.000, 13:20:00.000-05:00 | see (2) |
| gYear | 1999 | 1999, see (2) (5) |
| gYearMonth | 1999-02 | the month of February 1999, regardless of the number of days, see (2) (5) |
| gMonth | --05-- | May, see (2) (5) |
| gMonthDay | --05-31 | every May 31st, see (2) (5) |
| gDay | ---31 | the 31st day, see (2) (5) |
| Name | shipTo | XML 1.0 Name type |
| QName | po:USAddress | XML Namespace QName |
| NCName | USAddress | XML Namespace NCName, i.e. a QName without the prefix and colon |
| anyURI | http://www.example.com/, http://www.example.com/doc.html#ID5 | |
| language | en-GB, en-US, fr | valid values for xml:lang as defined in XML 1.0 |
| ID | | XML 1.0 ID attribute type, see (1) |
| IDREF | | XML 1.0 IDREF attribute type, see (1) |
| IDREFS | | XML 1.0 IDREFS attribute type, see (1) |
| ENTITY | | XML 1.0 ENTITY attribute type, see (1) |
| ENTITIES | | XML 1.0 ENTITIES attribute type, see (1) |
| NOTATION | | XML 1.0 NOTATION attribute type, see (1) |
| NMTOKEN | US, Brésil | XML 1.0 NMTOKEN attribute type, see (1) |
| NMTOKENS | US UK, Brésil Canada Mexique | XML 1.0 NMTOKENS attribute type, i.e. a whitespace-separated list of NMTOKEN values, see (1) |

Notes:
(1) To retain compatibility between XML Schema and XML 1.0 DTDs, the simple types ID, IDREF, IDREFS, ENTITY, ENTITIES, NOTATION, NMTOKEN, NMTOKENS should only be used in attributes.
(2) A value of this type can be represented by more than one lexical format, e.g. 100 and 1.0E2 are both valid float formats representing "one hundred". However, rules have been established for this type that define a canonical lexical format, see [XML Schema Part 2].
(3) Newline, tab and carriage-return characters in a normalizedString type are converted to space characters before schema processing.
(4) As for normalizedString; in addition, adjacent space characters are collapsed to a single space character, and leading and trailing spaces are removed.
(5) The "g" prefix signals time periods in the Gregorian calendar.

New simple types are defined by deriving them from existing simple types (both built-in and derived). In particular, we can derive a new simple type by restricting an existing simple type; in other words, the legal range of values for the new type is a subset of the existing type's range of values. We use the simpleType element to define and name the new simple type. We use the restriction element to indicate the existing (base) type, and to identify the "facets" that constrain the range of values. A complete list of facets is provided in Appendix B.

Suppose we wish to create a new type of integer called myInteger whose range of values is between 10000 and 99999 (inclusive). We base our definition on the built-in simple type integer, whose range of values also includes integers less than 10000 and greater than 99999. To define myInteger, we restrict the range of the integer base type by employing two facets called minInclusive and maxInclusive:

Defining myInteger, Range 10000-99999

<xsd:simpleType name="myInteger">
  <xsd:restriction base="xsd:integer">
    <xsd:minInclusive value="10000"/>
    <xsd:maxInclusive value="99999"/>
  </xsd:restriction>
</xsd:simpleType>

The example shows one particular combination of a base type and two facets used to define myInteger, but a look at the list of built-in simple types and their facets (Appendix B) should suggest other viable combinations.
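
As an illustration of such combinations, the sketch below expresses the same 10000-99999 range with the exclusive facets minExclusive and maxExclusive, and restricts string with the length facet. The type names myIntegerExclusive and fiveCharCode are hypothetical and are not part of the purchase order schema.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Hypothetical type: same value range as myInteger (10000 to 99999),
       written with exclusive bounds instead of inclusive ones -->
  <xsd:simpleType name="myIntegerExclusive">
    <xsd:restriction base="xsd:integer">
      <xsd:minExclusive value="9999"/>
      <xsd:maxExclusive value="100000"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- Hypothetical type: any string of exactly five characters -->
  <xsd:simpleType name="fiveCharCode">
    <xsd:restriction base="xsd:string">
      <xsd:length value="5"/>
    </xsd:restriction>
  </xsd:simpleType>

</xsd:schema>
```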

The purchase order schema contains another, more elaborate, example of a simple type definition. A new simple type called SKU is derived (by restriction) from the simple type string. Furthermore, we constrain the values of SKU using a facet called pattern in conjunction with the regular expression "\d{3}-[A-Z]{2}" that is read "three digits followed by a hyphen followed by two upper-case ASCII letters":

Defining the Simple Type "SKU"

<xsd:simpleType name="SKU">
  <xsd:restriction base="xsd:string">
    <xsd:pattern value="\d{3}-[A-Z]{2}"/>
  </xsd:restriction>
</xsd:simpleType>

This regular expression language is described more fully in Appendix D.
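
To show how SKU might be put to use, the following sketch attaches it to the partNum attribute mentioned earlier. The item element here is a simplified, hypothetical stand-in that carries only this one attribute; the full purchase order schema is not reproduced.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- SKU as defined above -->
  <xsd:simpleType name="SKU">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="\d{3}-[A-Z]{2}"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- Simplified, hypothetical item element with a required partNum attribute -->
  <xsd:element name="item">
    <xsd:complexType>
      <xsd:attribute name="partNum" type="SKU" use="required"/>
    </xsd:complexType>
  </xsd:element>

</xsd:schema>
```

Under this sketch, <item partNum="872-AA"/> would be accepted, while partNum="872-aa" or partNum="1872-AA" would be rejected, because a pattern facet must match the entire value rather than a substring of it.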

XML Schema defines twelve facets, which are listed in Appendix B. Among these, the enumeration facet is particularly useful: it can be used to constrain the values of almost every simple type, except the boolean type. The enumeration facet limits a simple type to a set of distinct values. For example, we can use the enumeration facet to define a new simple type called USState, derived from string, whose value must be one of the standard US state abbreviations:

Using the Enumeration Facet

<xsd:simpleType name="USState">
  <xsd:restriction base="xsd:string">
    <xsd:enumeration value="AK"/>
    <xsd:enumeration value="AL"/>
    <xsd:enumeration value="AR"/>
    <!-- and so on ... -->
  </xsd:restriction>
</xsd:simpleType>

USState would be a good replacement for the string type currently used in the state element declaration. By making this replacement, the legal values of a state element, i.e. the state subelements of billTo and shipTo, would be limited to one of AK, AL, AR, etc. Note that the enumeration values specified for a particular type must be unique.
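
The replacement described above amounts to changing the type attribute of the state element declaration. A minimal sketch follows; it assumes a standalone, global state element for brevity (in the purchase order schema the declaration sits inside the address type), and the enumeration is abbreviated to three values.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Abbreviated USState: only three of the state codes are listed here -->
  <xsd:simpleType name="USState">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="AK"/>
      <xsd:enumeration value="AL"/>
      <xsd:enumeration value="AR"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- Previously type="xsd:string"; now limited to the enumerated values -->
  <xsd:element name="state" type="USState"/>

</xsd:schema>
```

With this declaration, <state>AK</state> would be accepted, whereas <state>ak</state> would be rejected, since enumeration values are compared case-sensitively.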

List Types

XML Schema has the concept of a list type, in addition to the so-called atomic types that constitute most of the types listed in Table 2. (Atomic types, list types, and the union types described in the next section are collectively called simple types.) The value of an atomic type is indivisible from XML Schema's perspective. For example, the NMTOKEN value US is indivisible in the sense that no part of US, such as the character "S", has any meaning by itself. In contrast, list types consist of sequences of atomic types, and consequently the parts of a sequence (the "atoms") are themselves meaningful. For example, NMTOKENS is a list type, and an element of this type would be a white-space delimited list of NMTOKEN values, such as "US UK FR". XML Schema has three built-in list types: NMTOKENS, IDREFS, and ENTITIES.

In addition to using the built-in list types, you can create new list types by derivation from existing atomic types. (You cannot create list types from existing list types, nor from complex types.) For example, to create a list of myInteger values:

Creating a List of myInteger's

<xsd:simpleType name="listOfMyIntType">
  <xsd:list itemType="myInteger"/>
</xsd:simpleType>

And an element in an instance document whose content conforms to listOfMyIntType is:

<listOfMyInt>20003 15037 95977 95945</listOfMyInt>
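
For the instance above to validate, the schema needs a declaration that ties the listOfMyInt element name to the list type. That declaration is not shown in the text, so the following is only a minimal sketch that assumes a global element declaration.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Item type: integers from 10000 to 99999, as defined earlier -->
  <xsd:simpleType name="myInteger">
    <xsd:restriction base="xsd:integer">
      <xsd:minInclusive value="10000"/>
      <xsd:maxInclusive value="99999"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- List type: white-space separated myInteger values -->
  <xsd:simpleType name="listOfMyIntType">
    <xsd:list itemType="myInteger"/>
  </xsd:simpleType>

  <!-- Assumed global declaration for the element used in the instance -->
  <xsd:element name="listOfMyInt" type="listOfMyIntType"/>

</xsd:schema>
```

Each item is checked against myInteger individually, so a list such as "20003 123" would be rejected because 123 falls outside the 10000-99999 range.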

Several facets can be applied to list types: length, minLength, maxLength, pattern, and enumeration. For example, to define a list of exactly six US states (SixUSStates), we first define a new list type called USStateList from USState, and then we derive SixUSStates by restricting USStateList to only six items:

List Type for Six US States

<xsd:simpleType name="USStateList">
  <xsd:list itemType="USState"/>
</xsd:simpleType>

<xsd:simpleType name="SixUSStates">
  <xsd:restriction base="USStateList">
    <xsd:length value="6"/>
  </xsd:restriction>
</xsd:simpleType>

Elements whose type is SixUSStates must have six items, and each of the six items must be one of the (atomic) values of the enumerated type USState, for example:

<sixStates>PA NY CA NY LA AK</sixStates>
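
The minLength and maxLength facets work the same way when a range of item counts is wanted rather than an exact count. The sketch below is an assumption-laden illustration: the type name OneToThreeUSStates and the element name someStates are hypothetical, and USState is abbreviated to three values.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Abbreviated USState: only three of the state codes are listed here -->
  <xsd:simpleType name="USState">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="AK"/>
      <xsd:enumeration value="AL"/>
      <xsd:enumeration value="AR"/>
    </xsd:restriction>
  </xsd:simpleType>

  <xsd:simpleType name="USStateList">
    <xsd:list itemType="USState"/>
  </xsd:simpleType>

  <!-- Hypothetical type: a list holding at least one and at most three states -->
  <xsd:simpleType name="OneToThreeUSStates">
    <xsd:restriction base="USStateList">
      <xsd:minLength value="1"/>
      <xsd:maxLength value="3"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- Hypothetical element using the restricted list type -->
  <xsd:element name="someStates" type="OneToThreeUSStates"/>

</xsd:schema>
```

Here <someStates>AK AL</someStates> would be accepted, while an empty list or a list of four items would not. For list types, these facets count list items, not characters.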

Note that it is possible to derive a list type from the atomic type string. However, a string may contain white space, and white space delimits the items in a list type, so you should be careful when using list types whose base type is string. For example, suppose we have defined a list type with base type string and a length facet equal to 3. Then the following 3-item list is legal:

Asie Europe Afrique

But the following 3 "item" list is illegal:

Asie Europe Amérique Latine

Even though "Amérique Latine" may exist as a single string outside of the list, when it is included in the list, the whitespace between Amérique and Latine effectively creates a fourth item, and so the latter example will not conform to the 3-item list type.
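
The text does not give a definition for this 3-item list type, so the following is one possible sketch. The names stringList, threeStrings, and regions are hypothetical.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Hypothetical list type whose items are plain strings;
       white space in the content always separates items -->
  <xsd:simpleType name="stringList">
    <xsd:list itemType="xsd:string"/>
  </xsd:simpleType>

  <!-- Hypothetical restriction: exactly three items -->
  <xsd:simpleType name="threeStrings">
    <xsd:restriction base="stringList">
      <xsd:length value="3"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- Hypothetical element using the 3-item list type -->
  <xsd:element name="regions" type="threeStrings"/>

</xsd:schema>
```

Under this sketch, <regions>Asie Europe Afrique</regions> would be accepted, and <regions>Asie Europe Amérique Latine</regions> would be rejected because it is read as four items.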

Union Types

Atomic types and list types enable an element or an attribute value to be one or more instances of one atomic type. In contrast, a union type enables an element or attribute value to be one or more instances of one type drawn from the union of multiple atomic and list types. To illustrate, we create a union type for representing American states as singleton letter abbreviations or lists of numeric codes. The zipUnion union type is built from one atomic type and one list type:

Union Type for Zip Codes

<xsd:simpleType name="zipUnion">
  <xsd:union memberTypes="USState listOfMyIntType"/>
</xsd:simpleType>

When we define a union type, the memberTypes attribute value is a list of all the types in the union.

Now, assuming we have declared an element called zips of type zipUnion, valid instances of the element are:

<zips>CA</zips>
<zips>95630 95977 95945</zips>
<zips>AK</zips>

Two facets, pattern and enumeration, can be applied to a union type.
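
As a sketch of how one of these facets might be applied, the schema below restricts zipUnion with enumeration so that only two specific values remain legal. The type name restrictedZipUnion is hypothetical, the member types are abbreviated copies of the definitions given earlier, and the zips element is declared with the restricted type purely for illustration.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Abbreviated USState: only three of the state codes are listed here -->
  <xsd:simpleType name="USState">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="AK"/>
      <xsd:enumeration value="AL"/>
      <xsd:enumeration value="AR"/>
    </xsd:restriction>
  </xsd:simpleType>

  <xsd:simpleType name="myInteger">
    <xsd:restriction base="xsd:integer">
      <xsd:minInclusive value="10000"/>
      <xsd:maxInclusive value="99999"/>
    </xsd:restriction>
  </xsd:simpleType>

  <xsd:simpleType name="listOfMyIntType">
    <xsd:list itemType="myInteger"/>
  </xsd:simpleType>

  <xsd:simpleType name="zipUnion">
    <xsd:union memberTypes="USState listOfMyIntType"/>
  </xsd:simpleType>

  <!-- Hypothetical type: only these two zipUnion values are allowed.
       One comes from the atomic member type, one from the list member type -->
  <xsd:simpleType name="restrictedZipUnion">
    <xsd:restriction base="zipUnion">
      <xsd:enumeration value="AK"/>
      <xsd:enumeration value="95630 95977"/>
    </xsd:restriction>
  </xsd:simpleType>

  <xsd:element name="zips" type="restrictedZipUnion"/>

</xsd:schema>
```

With this declaration, <zips>AK</zips> and <zips>95630 95977</zips> would be accepted, while <zips>AL</zips> would be rejected even though it is a valid zipUnion value.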