Subject: Re: Efficiently transposing tokenized data
From: Beldaz Jalfrezi <beldazj@xxxxxxxxxxxx>
Date: Tue, 4 Nov 2008 22:53:30 -0800 (PST)
Dear Michael and Dimitre,

Thank you both for your prompt responses. Both solutions are far better than
what I had in mind, but the general problem highlights for me that not all
XML structures are equal. I shall have to change things for future data.

Many thanks,
B.
> Date: Tue, 4 Nov 2008 23:44:13 -0000
> To:
> From: "Michael Kay"
> Subject: RE: Efficiently transposing tokenized data
> Message-ID: <88158BB8B86E4502B2BD16BB48A759E0@Sealion>
>
> I can suggest several approaches, but I don't guarantee that any of them
> will perform better than doing the repeated (wasteful) tokenization.
>
> (1) Do a preprocessing pass in which you split the data attribute into
> multiple elements, then proceed "as normal".
>
> (2) Do a preprocessing pass to compute a sequence of NxM strings in one
> big sequence, then operate by indexing into this big sequence.
>
> (3) Write a user-defined function that calls tokenize() but with
> saxon:memo-function="yes", so that the results of tokenizing a node are
> remembered when you tokenize the same node again.
>
> I think I would probably go for (2) as it's simplest:
> [...]
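[The code itself was elided above. As a rough illustration of approach (2) -- not Michael Kay's actual code -- a sketch in XSLT 2.0 might look like the following. The element name `row` and the attribute name `data`, with space-separated tokens, are assumptions about the original input:]

```xml
<!-- Sketch of approach (2): tokenize every row ONCE into a single flat
     sequence, then transpose by indexing into that sequence.
     Assumes input rows like <row data="a b c"/> with equal token counts. -->
<xsl:variable name="rows" select="//row" as="element(row)*"/>
<xsl:variable name="tokens" as="xs:string*"
    select="for $r in $rows return tokenize($r/@data, '\s+')"/>
<xsl:variable name="cols" as="xs:integer"
    select="count(tokenize($rows[1]/@data, '\s+'))"/>

<!-- Cell (r, c) of the original lives at $tokens[($r - 1) * $cols + $c],
     so column c of the output is that expression over all r. -->
<xsl:for-each select="1 to $cols">
  <xsl:variable name="c" select="." as="xs:integer"/>
  <col data="{
    string-join(
      for $r in 1 to count($rows)
      return $tokens[($r - 1) * $cols + $c],
      ' ')}"/>
</xsl:for-each>
```

[Each `data` attribute is tokenized exactly once when `$tokens` is built, so the cost of repeated `tokenize()` calls in the inner loop disappears; the transposition itself is just integer index arithmetic over the flat sequence.]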
Current Thread:
- Dimitre Novatchev - 5 Nov 2008 00:20:56 -0000
- Beldaz Jalfrezi - 5 Nov 2008 06:53:52 -0000 <=