Re: [xsl] Character 150 withs Windows-1252 output

Subject: Re: Character 150 withs Windows-1252 output
From: "andrew welch" <andrew.j.welch@xxxxxxxxx>
Date: Fri, 21 Apr 2006 14:21:48 +0100

> > Gives this result:
> >
> > <foo>&#150;&#8211;</foo>
> >
> > I've checked the input file with a hex editor to make sure the
> > un-escaped dash really is 0x96.  Somehow the two characters are
> > treated differently, which is something I didn't expect.
> >
> > I think that 0x96 in the input XML read using Windows-1252 should
> > become #8211 when output using any encoding other than Windows-1252,
> > which is what is happening for the actual character 0x96, but the
> > character reference #150 gets serialised back as #150...
>
> Isn't this because &#150; is a Unicode entity? It's not a Windows-1252
> entity. In other words, a character entity never changes according to
> the input encoding.

Ahh of course, that makes sense.  The character for #150 is worked out
after the bytes in the document have been decoded using the encoding
specified in the prolog, so the reference itself never goes through
that encoding.

So 0x96 becomes #8211 through the mapping defined in Windows-1252, and
#150 remains #150 because it's a character reference, and character
references always refer to Unicode code points.

Thanks Nic!
