2.12 Language Identification

In document processing, it is often useful to identify the natural or formal language in which the content is written. A special attribute named xml:lang MAY be inserted in documents to specify the language used in the contents and attribute values of any element in an XML document. In valid documents, this attribute, like any other, MUST be declared in an attribute-list declaration if it is used. The values of the attribute are language identifiers as defined by [RFC1766], Tags for the Identification of Languages, or its successor; in addition, the empty string MAY be specified.

(Productions 33 through 38 have been removed.)

For example:

<p xml:lang="en">The quick brown fox jumps over the lazy dog.</p>
<p xml:lang="en-GB">What colour is it?</p>
<p xml:lang="en-US">What color is it?</p>
<sp who="Faust" desc='leise' xml:lang="de">
<l>Habe nun, ach! Philosophie,</l>
<l>Juristerei, und Medizin</l>
<l>und leider auch Theologie</l>
<l>durchaus studiert mit hei&#xDF;em Bem&#xFC;h'n.</l>
</sp>

The intent declared with xml:lang is considered to apply to all attributes and content of the element where it is specified, unless overridden with an instance of xml:lang on another element within that content. In particular, the empty value of xml:lang is used on an element B to override a specification of xml:lang on an enclosing element A, without specifying another language. Within B, it is considered that there is no language information available, just as if xml:lang had not been specified on B or any of its ancestors.
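
The inheritance and override rules above can be sketched in a few lines of Python. This is only an illustration using the standard library's xml.etree.ElementTree; the function name language_map, the SAMPLE document, and its element names are hypothetical and not part of this specification.

```python
import xml.etree.ElementTree as ET

# ElementTree reports xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def language_map(element, inherited=None):
    """Return (tag, effective language) pairs in document order.

    None means that no language information is available.
    """
    declared = element.get(XML_LANG)
    if declared is None:
        # No xml:lang here: the enclosing element's language applies.
        lang = inherited
    elif declared == "":
        # Empty value: override the ancestors without naming a language.
        lang = None
    else:
        lang = declared
    pairs = [(element.tag, lang)]
    for child in element:
        pairs.extend(language_map(child, lang))
    return pairs


# Hypothetical document: b resets the language, d overrides it, e inherits it.
SAMPLE = '<a xml:lang="en"><b xml:lang=""><c/></b><d xml:lang="de"/><e/></a>'

for tag, lang in language_map(ET.fromstring(SAMPLE)):
    print(tag, lang)
```

In this sketch, c reports no language because its parent b carries the empty value, while e still inherits "en" from a.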

NOTE: Language information may also be provided by external transport protocols (e.g. HTTP or MIME). When available, this information may be used by XML applications, but the more local information provided by xml:lang should be considered to override it.
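
One way an application might combine the two sources is sketched below in Python. The function name resolve_language and its parameters are hypothetical. The sketch also assumes that an explicit empty xml:lang counts as local information and therefore suppresses the transport-level value; the text above does not settle that case, so treat it as one possible reading.

```python
def resolve_language(xml_lang, transport_lang):
    """Pick the language an application should assume for an element.

    xml_lang: the xml:lang value in effect for the element, or None if
        neither the element nor any ancestor declares one.
    transport_lang: a language reported by the transport layer (for
        example an HTTP Content-Language header), or None.
    """
    if xml_lang is not None:
        # Local information wins. An empty value means "no language
        # information" (assumption: it also hides the transport value).
        return xml_lang or None
    # Nothing declared in the document: fall back to the transport layer.
    return transport_lang or None


print(resolve_language("fr", "en"))   # document says French
print(resolve_language(None, "en"))   # only the transport layer knows
print(resolve_language("", "en"))     # explicitly unspecified
```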

A simple declaration for xml:lang might take the form

xml:lang CDATA #IMPLIED

but specific default values MAY also be given, if appropriate. In a collection of French poems for English students, with glosses and notes in English, the xml:lang attribute might be declared this way:

<!ATTLIST poem   xml:lang CDATA 'fr'>
<!ATTLIST gloss  xml:lang CDATA 'en'>
<!ATTLIST note   xml:lang CDATA 'en'>
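
The effect of such defaults can be observed with a parser that reads the attribute-list declarations. The Python sketch below uses the standard library's expat binding, which by default reports defaulted attributes declared in the internal DTD subset alongside specified ones. The anthology wrapper, the element declarations, and the sample text are hypothetical additions made only so the example is self-contained.

```python
from xml.parsers import expat

# Hypothetical document embedding the three declarations shown above.
DOC = """<!DOCTYPE anthology [
<!ELEMENT anthology (poem | gloss | note)*>
<!ELEMENT poem  (#PCDATA)>
<!ELEMENT gloss (#PCDATA)>
<!ELEMENT note  (#PCDATA)>
<!ATTLIST poem   xml:lang CDATA 'fr'>
<!ATTLIST gloss  xml:lang CDATA 'en'>
<!ATTLIST note   xml:lang CDATA 'en'>
]>
<anthology>
<poem>Sous le pont Mirabeau coule la Seine</poem>
<gloss>Under the Mirabeau bridge flows the Seine</gloss>
<note xml:lang="de">Anmerkung auf Deutsch</note>
</anthology>"""


def collect_languages(document):
    """Return (element name, xml:lang value or None) for each start tag."""
    seen = []
    parser = expat.ParserCreate()

    def on_start(name, attrs):
        # attrs holds specified attributes and DTD-supplied defaults.
        seen.append((name, attrs.get("xml:lang")))

    parser.StartElementHandler = on_start
    parser.Parse(document, True)
    return seen


for name, lang in collect_languages(DOC):
    print(name, lang)
```

Here poem and gloss pick up their declared defaults even though their start-tags carry no xml:lang, while the note element overrides its default with an explicit value.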