Subject: Re: Processing two documents, which order?
From: Andrew Welch <andrew.j.welch@xxxxxxxxx>
Date: Fri, 8 Apr 2011 10:51:25 +0100
On 8 April 2011 10:33, Dave Pawson <davep@xxxxxxxxxxxxx> wrote:
> On Fri, 8 Apr 2011 11:16:09 +0200
> Wolfgang Laun <wolfgang.laun@xxxxxxxxx> wrote:
>
>> I've spotted a few snags:
>>
>> running this with property=bee on
>>
>> <p>a bee on a bee-line to a frisbee is not a bee</p>
>>
>> produces
>>
>> <p>a <property>bee</property> on a <property>bee</property> line to
>> a fris<property>bee</property> is not a bee</p>
>>
>> - "bee" in "bee-line" is marked up and the hyphen is lost
>> - The "bee" ending of "frisbee" is marked up
>> - The trailing "bee" is not marked up.
If possible, I'd be tempted to do this in two passes: the first to mark
up the words, the second to do the comparisons, e.g.
<p><w>a</w> <w>bee</w> <w>on</w> <w>a</w> <w>bee-line</w>....
Then if you see any issues you can tweak the word markup incrementally,
rather than repeatedly re-running the whole process. The final stage
is then reduced to really simple whole-word comparisons.
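
To make the two-pass idea concrete, here is a minimal sketch. It is in Python rather than XSLT purely for illustration, and it assumes words are whatever is separated by whitespace. The function names mark_words and mark_property are hypothetical, not taken from any stylesheet in this thread.

```python
import re
from xml.sax.saxutils import escape


def mark_words(text):
    """Pass 1: wrap every whitespace-delimited token in <w>...</w>.

    Whitespace between tokens is left untouched, so nothing such as the
    hyphen in "bee-line" is lost.
    """
    return re.sub(r"\S+", lambda m: "<w>" + escape(m.group(0)) + "</w>", text)


def mark_property(marked, prop):
    """Pass 2: compare each <w> token with prop as a whole word.

    Exact matches become <property>...</property>; every other token is
    simply unwrapped again.
    """
    target = escape(prop)

    def swap(m):
        word = m.group(1)
        if word == target:
            return "<property>" + word + "</property>"
        return word

    return re.sub(r"<w>([^<]*)</w>", swap, marked)


if __name__ == "__main__":
    sample = "a bee on a bee-line to a frisbee is not a bee"
    words = mark_words(sample)
    print(words)
    print(mark_property(words, "bee"))
```

Because the second pass only tests whole tokens for equality, "bee-line" and "frisbee" are left alone and the trailing "bee" is picked up. If punctuation such as "bee," should also match, the place to tweak is the token pattern in the first pass.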
--
Andrew Welch
http://andrewjwelch.com