Hi Folks,
Consider this Spanish name: Martiñez
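Instead of using the precomposed ñ character (U+00F1), one can use the base character "n" followed by a combining tilde character (U+0303).
So that Spanish name can be equivalently expressed as: Martiñez (here the ñ is "n" followed by U+0303, although the two forms look identical on screen)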
Here is an XML document that uses the latter form:
<?xml version="1.0" encoding="utf-8"?>
<Name>Martiñez</Name>
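Since the two spellings render identically, a small check outside XSLT can make the difference visible. The following is only a sketch in Python (not part of my stylesheet); the variable names are made up for illustration, and it uses the standard `unicodedata` module to compare the precomposed and decomposed forms.

```python
import unicodedata

# Hypothetical variable names, chosen for this illustration only.
composed = "Marti\u00f1ez"      # precomposed ñ (U+00F1)
decomposed = "Martin\u0303ez"   # "n" followed by combining tilde (U+0303)

# The two strings are different code point sequences of different length.
print(len(composed), len(decomposed))
print(composed == decomposed)

# Unicode normalization maps one form onto the other.
print(unicodedata.normalize("NFC", decomposed) == composed)
print(unicodedata.normalize("NFD", composed) == decomposed)

# In the decomposed form the 7th character (index 6) is the combining tilde;
# a non-zero combining class marks it as a combining character.
seventh = decomposed[6]
print(hex(ord(seventh)), unicodedata.combining(seventh))
```

In the decomposed form the combining tilde is the 7th character, which is why position 7 is the interesting offset for a substring test.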
I wrote a stylesheet that uses the substring() function to extract everything from the combining tilde character (the 7th character) onward:
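<!-- version="1.0" is assumed here; any XSLT processor should accept it -->
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <Result>
      <xsl:value-of select="substring(Name, 7)" />
    </Result>
  </xsl:template>
</xsl:stylesheet>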
The output is:
<?xml version="1.0" encoding="UTF-8"?>
<Result>̃ez</Result>
I checked the output for well-formedness, and the XML parser reports that it is well-formed.
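The same experiment can be repeated outside XSLT. The sketch below is in Python and rests on some assumptions: `xpath_substring` is a hypothetical helper that only imitates substring() for a positive integer start position, and the parser is Python's standard `xml.etree.ElementTree`, not the parser I used. It builds the output document by hand and checks whether it parses.

```python
import unicodedata
import xml.etree.ElementTree as ET

name = "Martin\u0303ez"  # decomposed form: "n" followed by U+0303


def xpath_substring(s: str, start: int) -> str:
    """Hypothetical stand-in for XPath substring(s, start).

    Handles only integer start positions; XPath counts from 1.
    """
    return s[max(start, 1) - 1:]


tail = xpath_substring(name, 7)

# The extracted string begins with a combining character (non-zero class).
starts_with_combining = unicodedata.combining(tail[0]) != 0

# Build the result document by hand and let the parser judge it.
result_doc = "<Result>" + tail + "</Result>"
try:
    root = ET.fromstring(result_doc)
    well_formed = True
except ET.ParseError:
    root = None
    well_formed = False

print(starts_with_combining, well_formed)
```

With this parser the document is accepted as well, even though the element content starts with a combining character, so the behavior is not specific to one tool.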
According to the book Fonts & Encodings (p. 61, first paragraph):
... we select a substring that begins
with a combining character, this new
string will not be a valid string in
Unicode.
If the value of the <Result> element is not a valid Unicode string, how can the output be a well-formed XML document?
/Roger