Subject: Re: String hashing code
From: Robert Koberg <rob@xxxxxxxxxx>
Date: Fri, 14 Dec 2007 08:29:30 -0500
On Fri, 2007-12-14 at 18:36 +1100, Deborah Pickett wrote:
>
> I am processing a number of separate XML documents using an Ant <xslt>
> task, pulling out the MathML that is embedded inside them into their own
> XML files using xsl:result-document (where I render them using Batik).
> I want to make sure that the result document names don't clash, but
> because they are across several source files, generate-id() isn't going
> to suffice. There are thousands of source files, all with
> English-sounding names spread across many directories.
Concatenate the source file's name or path with generate-id() (or with
position(), if the elements share a parent). Write the results either to
the same directory as the source file or to a mirrored directory structure.
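
A minimal sketch of that idea, not tested against your setup. The
parameter name source-root, the .mml extension, and the assumption that
your sources end in .xml are all made up for illustration. It also
assumes document-uri(/) is non-empty for your inputs, which depends on
how the processor loads them.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:m="http://www.w3.org/1998/Math/MathML">

  <!-- Hypothetical parameter: URI of the source tree root, with a
       trailing slash. Left empty, the full document URI is kept. -->
  <xsl:param name="source-root" select="''"/>

  <xsl:template match="/">
    <!-- Source document URI, made relative to $source-root when that
         is set, with a trailing ".xml" removed. -->
    <xsl:variable name="rel"
        select="replace(substring-after(document-uri(/), $source-root),
                        '\.xml$', '')"/>
    <xsl:for-each select="//m:math">
      <!-- The path keeps names unique across source files;
           generate-id() keeps them unique within one file. -->
      <xsl:result-document href="{$rel}-{generate-id(.)}.mml">
        <xsl:copy-of select="."/>
      </xsl:result-document>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

With source-root empty, the href is the absolute source URI, so each
result lands next to its source file. With source-root set, the href is
relative and resolves against the base output URI, which gives the
mirrored layout. Inside the for-each, position() would also work in
place of generate-id(.) and gives names that are stable between runs.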
best,
-Rob
>
> I was thinking of hashing document-uri(/) to produce a probably-unique
> string that I can then append generate-id(.) to. I rejected
> encode-for-uri() as producing strings that are too long, and for not
> anonymizing the document uri enough. All the hashing algorithms I know
> (MD5, for instance) happen to be heavy on bitwise operations, and I feel
> dirty doing bitwise operations with arithmetic.
>
> I prefer not to escape to non-XSLT, because I am providing this as part
> of a library that needs to run on almost any XSLT 2.0 platform.
>
> Any clever ideas?