Table of contentsAppendices |
2.12 Language IdentificationLanguage IdentificationIn document processing, it is often useful to identify the natural or formal language in which the content is written. A special Attribute named xml:lang MAY be inserted in documents to specify the language used in the contents and attribute values of any element in an XML document. In valid documents, this attribute, like any other, MUST be Attribute-List Declaration if it is used. The values of the attribute are language identifiers as defined by [RFC1766], Tags for the Identification of Languages, or its successor; in addition, the empty string MAY be specified. (Productions 33 through 38 have been removed.) For example: <p xml:lang="en">The quick brown fox jumps over the lazy dog.</p> <p xml:lang="en-GB">What colour is it?</p> <p xml:lang="en-US">What color is it?</p> <sp who="Faust" desc='leise' xml:lang="de"> <l>Habe nun, ach! Philosophie,</l> <l>Juristerei, und Medizin</l> <l>und leider auch Theologie</l> <l>durchaus studiert mit heißem Bemüh'n.</l> </sp> The intent declared with xml:lang is considered to apply to all attributes and content of the element where it is specified, unless overridden with an instance of xml:lang on another element within that content. In particular, the empty value of xml:lang is used on an element B to override a specification of xml:lang on an enclosing element A, without specifying another language. Within B, it is considered that there is no language information available, just as if xml:lang had not been specified on B or any of its ancestors. NOTE: A simple declaration for xml:lang might take the form xml:lang CDATA #IMPLIED but specific default values MAY also be given, if appropriate. In a collection of French poems for English students, with glosses and notes in English, the xml:lang attribute might be declared this way: <!ATTLIST poem xml:lang CDATA 'fr'> <!ATTLIST gloss xml:lang CDATA 'en'> <!ATTLIST note xml:lang CDATA 'en'> |