Subject: Re: Unicode Search/Replace
From: David Carlisle <davidc@xxxxxxxxx>
Date: Wed, 14 May 2008 13:12:44 +0100
> below for search/replace all the Unicode entities
Your code does not appear to use any entity references.
[#x....] seems to be a private notation you are using to denote Unicode
characters, but to XML it is just a string of 8 characters.
&#x....; is a numeric hexadecimal character reference, not an entity
reference.
Your instance document does contain an entity reference, &hyphen;,
but you don't show what definition you use for it. If you use the
definitions from
http://www.w3.org/2003/entities/2007/isonum.ent
then the definition is
<!ENTITY hyphen "&#x02010;" >
in which case using &hyphen; in the source will look to XSLT just as if
you'd used the hyphen character (U+2010) directly.
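To make that concrete, here is a small sketch in Python rather than XSLT, chosen only because its standard-library parser makes the expansion easy to inspect. The sample document and the helper name parsed_text are made up for illustration; the entity declaration is the isonum.ent one quoted above, placed in an internal DTD subset.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: the internal subset declares the hyphen
# entity the way isonum.ent does, as a character reference to U+2010.
DOC = '<!DOCTYPE p [<!ENTITY hyphen "&#x02010;">]><p>well&hyphen;known</p>'


def parsed_text(xml_source):
    """Return the text content of the root element after parsing."""
    return ET.fromstring(xml_source).text


text = parsed_text(DOC)
# By the time the application sees the text, the entity reference is gone
# and only the character itself remains.
print(len(text), hex(ord(text[4])))
```

An XSLT processor sits behind the same kind of parser, so a stylesheet cannot tell the entity reference apart from a directly typed U+2010.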
> I am getting the output like xxx xxx [#x002d]
Do your entity files define the entities in this [] format? If so, then
the simplest thing is just to get a new set. That format was an informal
convention sometimes used for SGML SDATA entities, in which the
replacement text was _never used_ because the system always replaced the
entity with a known character (that's what the S meant in SDATA). XML
doesn't have SDATA entities, so you need an entity definition that gives a
real replacement, such as the file I link to above.
If your hyphen entity is defined to be [#x02010] then using the entity is
exactly the same as typing [#x02010], and the way to replace it is the
same: you can't search for a single character, you need to search for
that 9 character string. For example, replace(.,'\[#x02010\]','-') would
make this into an ASCII hyphen (the brackets are escaped because the
second argument of replace() is a regular expression, in which an
unescaped [...] is a character class). But by far the simplest thing is
to use a set of entity definitions that define these characters with the
Unicode values that you want.
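As a rough illustration of the string replacement, the sketch below uses Python's re module as a stand-in for XPath 2.0 replace(). Both treat the pattern as a regular expression, so square brackets have to be escaped to match literally. The names SDATA_HYPHEN and to_ascii_hyphen are hypothetical, and the input string is taken from the quoted output above with the code point changed to 02010.

```python
import re

# The 9-character string an old SDATA-style entity set would leave behind.
SDATA_HYPHEN = '[#x02010]'


def to_ascii_hyphen(text):
    """Replace each literal occurrence of [#x02010] with an ASCII hyphen."""
    # re.escape makes the brackets match literally; left unescaped they
    # would form a character class matching any one of '#', 'x', '0', '2', '1'.
    return re.sub(re.escape(SDATA_HYPHEN), '-', text)


print(to_ascii_hyphen('xxx xxx [#x02010]'))
```

This only papers over the symptom; replacing the entity set remains the cleaner fix.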
David