Subject: Re: Maintaining character entities
From: Paul Tremblay <phthenry@xxxxxxxxxxxxx>
Date: Tue, 20 May 2003 17:26:21 -0400
Are you sure you are viewing your result document in an editor that
supports Unicode? I had the same problem, and I thought xsltproc
was broken. But xalan may be outputting the entities as true Unicode
characters, most likely encoded as UTF-8. If your editor is set to
Latin-1, it will read each byte of a multi-byte character as a separate
character and produce the strange results you posted below.
I fixed the problem when I upgraded my editor to support Unicode. Once
I set the encoding to UTF-8, the strange results went away.
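To illustrate what I think is going on, here is a minimal Python sketch. Python is only my choice for the demonstration, and the variable names are made up. It assumes the processor wrote the two characters from your question as UTF-8 and the editor then read those bytes as Latin-1:

```python
# A sketch of the suspected mismatch: UTF-8 bytes read back as Latin-1.
# The variable names here are hypothetical, chosen only for illustration.

# The two characters from the question: left double quote and u-umlaut.
text = "\u201c\u00fc"

# What a processor writing UTF-8 would put in the file (5 bytes for 2 characters).
utf8_bytes = text.encode("utf-8")

# An editor set to Latin-1 maps every single byte to its own character,
# so the multi-byte sequences fall apart into unrelated characters.
misread = utf8_bytes.decode("latin-1")

# An editor set to UTF-8 recovers the original characters.
reread = utf8_bytes.decode("utf-8")

print(len(text), len(utf8_bytes), len(misread))
print(repr(misread))
print(reread == text)
```

The misread string begins with an a-circumflex and ends with an A-tilde followed by a one-quarter sign, which looks a lot like the garbage in your post.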
Paul
On Tue, May 20, 2003 at 06:00:02PM +0900, Edward.Middleton@xxxxxxxxxxx wrote:
>
>
> >I've got XML documents, marked up to a DTD, and calling character entity
> >sets. When I run through the XSLT processor (xalan) to output another XML
> >file I find the entities have been converted to something different, and
> >fairly inconsistently.
> >
> >What I would like to achieve is having &ldquo; &uuml; in my input xml, and
> >these entities still being untouched in my output. Can anyone advise how I
> >achieve this please?
> >
> >What I'm getting are (&amp;ldquo;, &amp;uuml;), or (ââ?¬Å? and Ã?¼), or (“
> >and ü), depending on character encoding settings and entity sets used. Am I
> >missing something?
> >
>
> &ldquo; &uuml; are not predefined character entities.
> http://www.w3.org/TR/REC-xml#sec-predefined-ent
>
> They appear as literal text strings
>
> '&' 'l' 'd' 'q' 'u' 'o' ';'
>
> and so when serialized to XML the '&' character is replaced by '&amp;' giving
>
> &amp;ldquo;
>
> If you are making an HTML document and want these character entities, you should specify the correct character encoding and put:
>
> <xsl:output method="html" version="1.0" encoding="ISO-8859-1"/>
>
>
>
> Edward Middleton
>
>
> XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
--
************************
*Paul Tremblay *
*phthenry@xxxxxxxxxxxxx*
************************