I agree, my specification is likely not complete. However, my input is a
single document written by one person indexing a single journal. There is a
great deal of consistency to the data and I doubt that there are as many as
1000 names. That said:
I received an answer off the list (and so do not feel authorized to post it here) that will help me discover what oddities I have not covered. It explained the regular expressions it used, so that if modification is required I may be able to do it myself.

Thanks for your time, Michael; as always, this list provides the most consistent and practical advice around, something you can all be proud of.

Mark

-----Original Message-----
From: Michael Kay
Sent: Tuesday, November 06, 2012 2:10 AM
To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
Subject: Re: Inverting names with Jr and Sr considered

I wouldn't even attempt to write any code based on this as the specification. For this to work at all well, you're going to need to iteratively adapt the solution to handle all the names in your dataset, or at least a sample of a couple of thousand of them. There's just too much variation in the names you might encounter. Are "Jr" and "Sr" really the only suffixes, and are they always spelt this way, or do you also get "III" and "Jnr" and "Jnr."?

If I'm wrong, and the names are all regular and in the pattern you describe, then I think you can just tokenize on whitespace and do something like

    suffix := $tokens[last()][. = ('Jr', 'Sr')]
    stem := if ($suffix) then remove($tokens, count($tokens)) else $tokens
    value-of select="concat($stem[last()], ','),
                     remove($stem, count($stem)),
                     if ($suffix) then concat('(', $suffix, ')') else ''"

Michael Kay
Saxonica

On 05/11/2012 23:45, Mark wrote:
This must have been done many times, so can someone show me where to find the answer?
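
For anyone reading this in the archive, the tokenize-and-invert outline in Michael Kay's reply could be wrapped in an XSLT 2.0 function along the following lines. This is only a sketch under the same assumption he states, namely that the names are regular and that "Jr" and "Sr" (spelt exactly so) are the only suffixes. The function name f:invert-name and the namespace urn:example:names are made up for illustration and do not come from the thread.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:names"
    exclude-result-prefixes="xs f">

  <!-- Hypothetical helper: "John Smith Jr" becomes "Smith, John (Jr)".
       Assumes 'Jr' and 'Sr' are the only suffixes, as in the outline. -->
  <xsl:function name="f:invert-name" as="xs:string">
    <xsl:param name="name" as="xs:string"/>

    <!-- Split on whitespace after collapsing runs of spaces. -->
    <xsl:variable name="tokens"
                  select="tokenize(normalize-space($name), ' ')"/>

    <!-- The last token, but only if it is a recognised suffix. -->
    <xsl:variable name="suffix"
                  select="$tokens[last()][. = ('Jr', 'Sr')]"/>

    <!-- The name without its suffix. -->
    <xsl:variable name="stem"
                  select="if (exists($suffix))
                          then remove($tokens, count($tokens))
                          else $tokens"/>

    <!-- Surname first, then the remaining names, then the suffix in
         parentheses if there is one. -->
    <xsl:sequence
        select="string-join(
                  (concat($stem[last()], ','),
                   remove($stem, count($stem)),
                   if (exists($suffix))
                   then concat('(', $suffix, ')')
                   else ()),
                  ' ')"/>
  </xsl:function>

</xsl:stylesheet>
```

Called as f:invert-name('John Smith Jr'), this returns "Smith, John (Jr)". Variants such as "III", "Jnr" or "Jr." are not recognised as suffixes and would be treated as the surname, which is exactly the kind of oddity that needs checking against the real data.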