Start by using <xsl:sort lang="en"/> and see whether the results are
satisfactory. If not, try some other language more appropriate to the
data set. If you want to refine it further, define a collation, for example:
<xsl:sort collation="http://saxon.sf.net/collation?ignore-case=yes"/>
Information on Saxon collations is at
http://www.saxonica.com/documentation/index.html#!extensibility/config-extend/collation/implementing-collation
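
As a minimal sketch of how that might look in the stylesheet quoted below,
assuming Saxon 9.x and its collation URI parameters lang and strength (the
choice of strength=primary is an assumption here, picked because a primary
comparison looks at base letters only and so ignores accents as well as case):

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">
  <!-- ASCII output forces non-ASCII characters to be written as
       character references -->
  <xsl:output encoding="ASCII"/>
  <xsl:template match="/">
    <html>
      <body>
        <h1>Author List</h1>
        <xsl:for-each select="//author">
          <!-- Assumption: strength=primary makes accented and unaccented
               letters compare as equal for ordering purposes -->
          <xsl:sort select="surname"
                    collation="http://saxon.sf.net/collation?lang=en;strength=primary"/>
          <p>
            <xsl:value-of select="surname"/>
          </p>
        </xsl:for-each>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

With a collation like this the accented surnames should sort in among the
unaccented ones rather than after them. The collation attribute is an
XSLT 2.0 feature, so this will not work with Saxon 6.5.5.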
Michael Kay
Saxonica
mike@xxxxxxxxxxxx
+44 (0) 118 946 5893
On 2 Apr 2015, at 01:25, Charles O'Connor charles.oconnor@xxxxxxxxxxxx
<xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
> Hi all,
>
> As a test for something a bit more complex, I am trying to do a simple sort
> of names, some of which start with character entity references:
>
> <root>
>   <author><surname>Öborn</surname></author>
>   <author><surname>Jones</surname></author>
>   <author><surname>Edwards</surname></author>
>   <author><surname>Osgood</surname></author>
>   <author><surname>Èmeraldo</surname></author>
>   <author><surname>Smith</surname></author>
> </root>
>
> Using this transform:
>
> <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>                 version="1.0">
>   <xsl:output encoding="ASCII"/>
>   <xsl:template match="/">
>     <html>
>       <body>
>         <h1>Author List</h1>
>         <xsl:for-each select="//author">
>           <xsl:sort select="surname"/>
>           <p>
>             <xsl:value-of select="surname"/>
>           </p>
>         </xsl:for-each>
>       </body>
>     </html>
>   </xsl:template>
> </xsl:stylesheet>
>
> I want two things: the entities to come out in hex, and the sort to treat
> characters with diacriticals as equivalent to the same characters without
> diacriticals. So, e with an acute accent should be sorted equivalently with
> e without an acute accent.
>
> Using Oxygen, the sort works as intended when the transformer is Saxon
> 6.5.5, but the entities come out as decimal.
>
> <html>
> <body>
> <h1>Author List</h1>
> <p>Edwards</p>
> <p>&#200;meraldo</p>
> <p>Jones</p>
> <p>&#214;born</p>
> <p>Osgood</p>
> <p>Smith</p>
> </body>
> </html>
>
> If I change the transformer to Saxon 9.4.0.4, I get hex, but all the author
> names that start with a character entity reference get stuck at the end.
>
> <html>
> <body>
> <h1>Author List</h1>
> <p>Edwards</p>
> <p>Jones</p>
> <p>Osgood</p>
> <p>Smith</p>
> <p>Èmeraldo</p>
> <p>Öborn</p>
> </body>
> </html>
>
> Like anyone else, I'd like to have my cake and eat it too. But, how?
>
> Thanks,
> Charles